A B-Tree is a self-balancing m-way tree data structure that allows searches, accesses, insertions, and deletions in logarithmic time. Each node in a B-Tree of order m can have, at most, m children and m-1 keys.
Think of B-Tree as a generalization of a binary search tree (BST). Similar to a BST, the stored data is sorted in a B-Tree, but unlike a BST, a B-Tree can have more than two child nodes.
In addition to having multiple children, each node in a B-Tree can have more than one key, which keeps the height of the tree relatively small. Later on, we will see how keeping multiple keys in a single node helps with data locality.
In summary, a B-Tree of the order m has the following properties:
The time complexity for insertion, deletion, and search operations takes time where is the number of keys stored in the tree.
The space complexity for a B-Tree is , where is the number of keys in the tree.
B-Tree is primarily used to store data on disk because its operations (e.g., insert and search) require relatively few disk reads.
Often the size of data on the disk is large and cannot fit entirely in main memory. Hence, data is read from a disk in contiguous chunks or blocks.
However, reading from a disk is costly as disk access times are far higher than memory access times.
B-Trees have a high branching factor, meaning the trees are fat with a relatively small height, which ensures minimal disk reads to navigate to the place where the data is stored.
B-Trees are, therefore, well-suited for storage systems that read and write large blocks of data.