LinearHashTable: Linear Probing
Learn to implement a LinearHashTable.
Open addressing
The ChainedHashTable
data structure uses an array of lists, where the
th list stores all elements x
such that hash(x) = i
. An alternative, called open addressing, is to store the elements directly in an array, t
, with each array location in t
storing at most one value. This approach is taken by the LinearHashTable
described in this lesson. In some places, this data
structure is described as open addressing with linear probing.
Structure of LinearHashTable
The main idea behind a LinearHashTable
is that we would, ideally, like to store the element x
with hash value i = hash(x)
in the table location t[i]
. If we cannot do this (because some element is already stored there) then we try to store it at location t[(i + 1) mod t.length]
; if that’s not possible, then we try t[(i+ 2) mod t.length]
, and so on, until we find a place for x
.
There are three types of entries stored in t
:
- Data values: Actual values in the USet that we are representing.
null
values: Stored at array locations where no data has ever been stored.del
values: Stored at array locations where data was once stored but that has since been deleted.
In addition to the counter, n
, that keeps track of the number of elements
in the LinearHashTable
, a counter, q
, keeps track of the number of elements of Types 1 and 3. That is, q
is equal to n
plus the number of del
values in t
. To make this work efficiently, we need t to be considerably
larger than q
, so that there are lots of null values in t
. The operations on
a LinearHashTable
therefore maintain the invariant that .
To summarize, a LinearHashTable
contains an array, t
, that stores
data elements, and integers n
and q
that keep track of the number of
data elements and non-null values of t
, respectively. Because many hash
functions only work for table sizes that are a power of , we also keep an
integer d
and maintain the invariant that
T[] t; // the tableint n; // the sizeint d; // t.length = 2ˆdint q; // number of non-null entries in t
The LinearHashTable
operations
The find(x)
operation in a LinearHashTable
is simple. We start at
array entry t[i]
where i = hash(x)
and search entries , , and so on, until we find an index i
such that, either, or null
. In the former case we return . In the latter case, we conclude that x
is not contained in the hash table and return null
.
T find(T x) {int i = hash(x);while (t[i] != null) {if (t[i] != del && x.equals(t[i])) return t[i];i = (i == t.length - 1) ? 0 : i + 1; // increment i}return null;}
The add(x)
operation is also fairly easy to implement. After checking
that x
is not already stored in the table (using find(x)
), we search and so on, until we find a null
or del
and store x
at that location, increment n
, and q
, if appropriate.
boolean add(T x) {if (find(x) != null) return false;if (2 * (q + 1) > t.length) resize(); // max 50% occupancyint i = hash(x);while (t[i] != null && t[i] != del)i = (i == t.length - 1) ? 0 : i + 1; // increment iif (t[i] == null) q++;n++;t[i] = x;return true;}
By now, the implementation of the remove(x)
operation should be obvious. We search and so on until we find an index such that x
or null
. In the former case, we set del
and return true
. In the latter case we conclude that x
was not stored in the table (and therefore cannot be deleted) and return false
.
T remove(T x) {int i = hash(x);while (t[i] != null) {T y = t[i];if (y != del && x.equals(y)) {t[i] = del;n--;if (8 * n < t.length) resize(); // min 12.5% occupancyreturn y;}i = (i == t.length - 1) ? 0 : i + 1; // increment i}return null;}
The correctness of the find(x)
, add(x)
, and remove(x)
methods is easy
to verify, though it relies on the use of del
values. Notice that none of
these operations ever sets a non-null entry to null
. Therefore, when we
reach an index
such that = null
, this is a proof that the element, x
,
that we are searching for is not stored in the table; has always been
null
, so there is no reason that a previous add(x)
operation would have
proceeded beyond index
The resize()
method is called by add(x)
when the number of non-null entries exceeds or by remove(x)
when the number of
data entries is less than . The resize()
method works like the resize()
methods in other array-based data structures. We find the smallest non-negative integer d
such that . We reallocate the array t
so
that it has size
, and then we insert all the elements in the old version
of t
into the newly-resized copy of t
. While doing this, we reset q
equal
to n
since the newly-allocated t
contains no del
values.
void resize() {d = 1;while ((1 << d) < 3 * n) d++;T[] told = t;t = newArray(1 << d);q = n;// insert everything from toldfor (int k = 0; k < told.length; k++) {if (told[k] != null && told[k] != del) {int i = hash(told[k]);while (t[i] != null)i = (i == t.length - 1) ? 0 : i + 1;t[i] = told[k];}}}
Analysis of linear probing
Notice that each operation, add(x)
, remove(x)
, or find(x)
, finishes as soon
as (or before) it discovers the first null
entry in t
. The intuition behind the analysis of linear probing is that, since at least half the elements in t
are equal to null
, an operation should not take long to complete because
it will very quickly come across a null entry. We shouldn’t rely too heavily on this intuition, though, because it would lead us to (the incorrect)
the conclusion that the expected number of locations in t
examined by an
operation is at most 2.
For the rest of this section, we will assume that all hash values are
independently and uniformly distributed in . This is
not a realistic assumption, but it will make it possible for us to analyze
linear probing. Later in this section, we will describe a method, called
tabulation hashing, that produces a hash function that is “good enough”
for linear probing. We will also assume that all indexes into the positions
of t
are taken modulo t.length}
, so that t[i]
is really a shorthand for
We say that a run of length that starts at occurs when all the table entries ...
Create a free account to access the full course.
By signing up, you agree to Educative's Terms of Service and Privacy Policy