Discussion on Random Binary Search Trees

Discover more aspects regarding random binary search trees.

We'll cover the following

Additional notes

Random binary search trees have been studied extensively. DevroyeL. Devroye. Applications of the theory of records in the study of random trees. Acta Informatica, 26(1):123–130, 1988. gives a proof of the lemmaThe lemma is about the length of search path in a binary search tree. and related results. There are much stronger results in the literature as well, the most impressive of which is due to ReedB. Reed. The height of a random binary search tree. Journal of the ACM, 50(3):306–332, 2003., who shows that the expected height of a random binary search tree is

αlnnβlnlnn+O(1)\alpha\ln n - \beta\ln\ln n + O(1)

where a4.31107a \approx 4.31107 is the unique solution on the interval [2,)[2,\infty) of the equation αln((2e/α))=1\alpha\ln((2e/\alpha)) = 1 and β=32ln(α/2).\beta = \frac{3}{2 \ln(\alpha/2)} . Furthermore, the variance of the height is constant.

The name Treap was coined by Seidel and AragonR. Seidel and C. Aragon. Randomized search trees. Algorithmica, 16(4):464–497, 1996. who discussed Treaps and some of their variants. However, their basic structure was studied much earlier by VuilleminJ. Vuillemin. A unifying look at data structures. Communications of the ACM, 23(4):229–239, 1980. who called them Cartesian trees.

One possible space-optimization of the Treap data structure is the elimination of the explicit storage of the priority p in each node. Instead, the priority of a node, u, is computed by hashing u’s address in memory. Although a number of hash functions will probably work well for this in practice, for the important parts of the proof of Lemma 7.1 to remain valid, the hash function should be randomized and have the min-wise independent property: For any distinct values x1,...,xkx_1,..., x_k , each of the hash values h(x1),...,h(xk)h(x_1),..., h(x_k ) should be distinct with high probability and, for each i{1,...,k}i \in \left\{1,...,k\right\},

Pr{h(xi)=min{h(x1),,h(xk)}}c/k\Pr\left\{h(x_i)=\min\{h(x_1),\ldots,h(x_k)\}\right\}\leq c/k

for some constant cc. One such class of hash functions that is easy to implement and fairly fast is tabulation hashing.

Create a free account to access the full course.

By signing up, you agree to Educative's Terms of Service and Privacy Policy