A Memory-efficient Store for SILT: Part III
Learn how to design a memory-efficient key-value store.
Bulk merge of intermediary stores with memory-efficient store
There needs to be a configurable number of intermediary stores accumulated. Once we have the required number then we can bulk merge the intermediary stores and the current instance of the memory-efficient store.
The intermediary store maintains values in hash order and maintains sequential read access. Merging these with our sorted memory-efficient store is efficient since all values are sorted. The merge process occurs in two steps listed below:
Sorting entries in intermediary stores
Sequentially merging sorted intermediary stores with the memory-efficient store
The first step in our bulk merge is sorting all intermediary stores that need merging. We do this by enumerating every entry in the intermediary stores and storing them in sorted key order. In the end we have all the entries from our intermediary stores sorted and stored together.
Next, we will choose the newest valid entries between our sorted intermediary store entries and our memory-efficient store entries. This is a sequential merge. It means that we will read sequentially from our merged and sorted intermediary store entries, and our memory-efficient store entries. We will have two pointers for this. At the start of our merge, one of these will point to the first entry of the memory-efficient store entries and the second one to the intermediary store entries in storage. At any point, we will refer to the key pointed to by the first pointer as Km
(a key in the memory-efficient entries). We will refer to the key pointed to by the second pointer as Ki
(a key in the intermediary store entries). We will construct a new entry log in storage based on the following steps:
If
Km
is lesser thanKi
, we will copy theKm
entry to our new log and move the pointer in the memory-efficient store entries by one entry. It points to anotherKm
.If
Km
is greater thanKi
, and theKi
entry is not aDELETE
request, we will copy theKi
entry to our new log and move the pointer in the intermediary store entries by one entry. It points to anotherKi
.If
Km
is greater thanKi
, and theKi
entry is aDELETE
request, we will move the pointer in the intermediary store entries by one entry. It points to anotherKi
.If
Km
equalsKi
, and theKi
entry is not aDELETE
request, we will copy theKi
entry to our new log and move the pointers in both memory-efficient and intermediary store entries by one entry. These point to anotherKm
andKi
.If
Km
equalsKi
, and theKi
entry is aDELETE
request, move the pointers in both memory-efficient and intermediary store entries by one entry. These point to anotherKm
andKi
.
If the intermediary store pointer reaches the end of its log before the memory-efficient store pointer, we will copy the remaining entries in the memory-efficient store to the new log until its pointer reaches the end of the log. If the memory-efficient store pointer reaches the end of its log before the intermediary store pointer, then we will copy all remaining requests excluding DELETE
requests to the new log until the pointer reaches the end of the log.
Note: A key's entry is a
DELETE
request if it contains the specialDELETE
character.
Here’s a visual representation of our bulk merge.
Level up your interview prep. Join Educative to access 80+ hands-on prep courses.