Hash Codes
Break down the components of hash codes.
We'll cover the following
The hash tables are used to associate data with integer keys consisting of bits. In many cases, we have keys that are not integers. They may be strings, objects, arrays, or other compound structures. To use hash tables for these types of data, we must map these data types to hash codes. Hash code mappings should have the following properties:
- If
x
andy
are equal, thenx.hash_code()
andy.hash_code()
are equal. - If
x
andy
are not equal, then the probability thatx.hash_code() = y.hash_code()
should be small (close to .
The first property ensures that if we store x
in a hash table and later
look up a value y
equal to x
, then we will find x
—as we should. The second property minimizes the loss from converting our objects to integers.
It ensures that unequal objects usually have different hash codes and so
are likely to be stored at different locations in our hash table.
Hash codes for primitive data types
Small primitive data types like char
, byte
, int
, and float
are usually
easy to find hash codes for. These data types always have a binary representation and this binary representation usually consists of or fewer
bits. In these cases, we just treat these bits as the representation of an
integer in the range . If two values are different, they get different hash codes. If they are the same, they get the same hash code.
A few primitive data types are made up of more than bits, usually bits for some constant integer . (Java’s long
and double
types are examples of this with .) These data types can be treated as compound objects made of parts, as described in the next section.
Hash codes for compound objects
For a compound object, we want to create a hash code by combining the individual hash codes of the object’s constituent parts. This is not as easy as it sounds. Although one can find many hacks for this (for example, combining the hash codes with bitwise exclusive-or operations), many of these hacks are easy to foil. However, if one is willing to do arithmetic with bits of precision, then there are simple and robust methods available. Suppose we have an object made up of several parts whose hash codes are . Then we can choose mutually independent random bit integers and a random bit odd integer and compute a hash code for our object with
Note that this hash code has a final step (multiplying by and dividing by
that uses the multiplicative hash function to take the bit intermediate result and reduce it to a -bit final result. Here is an example of this method applied to a simple compound object with three parts x0
, x1
, and x2
:
Create a free account to access the full course.
By signing up, you agree to Educative's Terms of Service and Privacy Policy