An estimate about -trees is that
- In the external memory model, the running time of
find(x)
,add(x)
, andremove(x)
in a B-tree is . - In the word-RAM model, the running time of
find(x)
is and the running time ofadd(x)
andremove(x)
is .
The following lemma shows that, so far, we have overestimated the number of merge and split operations performed by B-trees.
Lemma 1: Starting with an empty -tree and performing any sequence
of m add(x)
and remove(x)
operations results in at most splits, merges,
and borrows being performed.
Proof: The proof of this has already been sketched for the special case in which . The lemma can be proven using a credit scheme, in which
- each
split()
,merge()
, orborrow()
operation is paid for with two credits, i.e., a credit is removed each time one of these operations occurs; and - at most three credits are created during any
add(x)
orremove(x)
operation.
Since at most credits are ever created and each split, merge, and borrow is paid for with with two credits, it follows that at most splits, merges, and borrows are performed. These credits are illustrated using the symbol.
To keep track of these credits the proof maintains the following credit
invariant: Any non-root node with keys stores one credit and any
node with keys stores three credits. A node that stores at least keys and most keys need not store any credits. What remains is to show that we can maintain the credit invariant and satisfy properties
and , above, during each add(x)
and remove(x)
operation.
Adding
The add(x)
method does not perform any merges or borrows, so
we need only consider split operations that occur as a result of calls to
add(x)
.
Each split operation occurs because a key is added to a node, , that
already contains keys. When this happens, u
is split into two nodes,
and having and keys, respectively. Prior to this operation, u
was storing 2B − 1
keys, and hence three credits. Two of these credits can be used to pay for the split and the other credit can be given to
(which has keys) to maintain the credit invariant. Therefore, we can pay for the split and maintain the credit invariant during any split.
The only other modification to nodes that occur during an add(x)
operation happens after all splits, if any, are complete. This modification involves adding a new key to some node . If, prior to this, had children, then it now has children and must therefore receive three credits. These are the only credits given out by the add(x)
method.
Removing
During a call to remove(x)
, zero or more merges occur and are possibly followed by a single borrow. Each merge occurs because two nodes, v
and w
, each of which had exactly B − 1
keys prior to calling remove(x)
were merged into a single node with exactly 2B − 2
keys. Each such merge therefore frees up two credits that can be used to pay for the merge.
After any merges are performed, at most one borrow operation occurs, after which no further merges or borrows occur. This borrow operation only occurs if we remove a key from a leaf, v
, that has B − 1
keys. The node v
therefore has one credit, and this credit goes towards the cost of the borrow. This single credit is not enough to pay for the borrow, so we create one credit to complete the payment.
At this point, we have created one credit and we still need to show that the credit invariant can be maintained. In the worst case, v
’s sibling, w
, has exactly keys before the borrow so that, afterwards, both v
and have keys. This means that v
and w
each should be storing a credit when the operation is complete. Therefore, in this case, we create an additional two credits to give to v
and w
. Since a borrow happens at most once during a remove(x)
operation, this means that we create at most three credits, as required.
If the remove(x)
operation does not include a borrow operation, this
is because it finishes by removing a key from some node that, prior to the
operation, had or more keys. In the worst case, this node had exactly
keys, so that it now has B−1
keys and must be given one credit, which we create.
In either case—whether the removal finishes with a borrow operation
or not—at most three credits need to be created during a call to remove(x)
to maintain the credit invariant and pay for all borrows and merges that occur. This completes the proof of the lemma.
The purpose of Lemma 1 is to show that, in the word-RAM model the cost of splits, merges and joins during a sequence of add(x)
and remove(x)
operations is only . That is, the amortized cost per operation is only , so the amortized cost of add(x)
and remove(x)
in the word-RAM model is . This is summarized by the following pair of theorems:
Theorem 1: (External Memory -Trees). A BTree
implements the SSet
interface. In the external memory model, a BTree
supports the operations
add(x)
, remove(x)
, and find(x)
in time per operation.
Theorem 2: (Word-RAM -Trees). A BTree
implements the SSet
interface. In the word-RAM model, and ignoring the cost of splits, merges, and borrows, a BTree
supports the operations add(x)
, remove(x)
, and find(x)
in time per operation. Furthermore, beginning with an empty BTree
, any sequence of add(x)
and remove(x)
operations results in a total of time spent performing splits, merges, and borrows.
Create a free account to access the full course.
By signing up, you agree to Educative's Terms of Service and Privacy Policy