Search⌘ K

Searching and Addition in B-Tree

Understand the recursive implementation of searching and addition in B-trees. Learn how find and add operations navigate and modify nodes with efficient binary searches, node splitting, and balancing techniques to maintain performance in external memory models. This lesson explains the algorithms behind these operations and their time complexities.

Searching

The implementation of the find(x) operation, which is illustrated below generalizes the find(x) operation in a binary search tree. The search for x starts at the root and uses the keys stored at a node, u, to determine in which of u’s children the search should continue.

A successful search is shown below (for the value 44) and an unsuccessful search (for the value 16.516.5) in a B-tree. Shaded nodes show where the value of zz is updated during the searches.

Searching in B-tree
Searching in B-tree

More specifically, at a node u, the search checks if x is stored in u.keys. If so, x has been found and the search is complete. Otherwise, the search finds the smallest integer, i, such that u.keys[i]>x\text{u.keys}[i] > x and continues the search in the subtree rooted at u.children[i]. If no key in u.keys is greater than x, then the search continues in u’s rightmost child. Just like binary search trees, the algorithm keeps track of the most recently seen key, z that is larger than x. In case xx is not found, z is returned as the smallest value that is greater or equal to x

C++
T find(T x) {
T z = null;
int ui = ri;
while (ui >= 0) {
BTree
Node * u = bs.readBlock(ui);
int i = findIt(u -> keys, x);
if (i < 0) return u -> keys[-(i + 1)]; // found it
if (u -> keys[i] != null)
z = u -> keys[i];
ui = u -> children[i];
}
return z;
}

Central to the find(x) method is the findIt(a,x) method that searches in a null-padded sorted array, a, for the value x.

The execution of findIt(a,27)
The execution of findIt(a,27)

This method, illustrated in above illustration, works for any array, a, where a[0],...,a[k1]a[0],...,a[k −1] is a sequence of keys in sorted order and a[k],...,a[a.length1]a[k],...,a[\text{a.length}−1] are all set to null. If x is in the array at position i, then findIt(a,x) returns −i − 1. Otherwise, it returns the smallest index, i, such that a[i] > x or a[i] = null.

C++
int findIt(T[] a, T x) {
int lo = 0, hi = a.length;
while (hi != lo) {
int m = (hi + lo) / 2;
int cmp = a[m] == null ? -1 : compare(x, a[m]);
if (cmp < 0)
hi = m; // look in first half
else if (cmp > 0)
lo = m + 1; // look in second half
else
return -m - 1; // found it
}
return lo;
}

The findIt(a,x) method uses a binary search that halves the search space at each step, so it runs in O(log(a.length))O(\log(\text{a.length})) time. In our setting, a.length = 2B, so findIt(a,x) runs in O ...