HashSets

Learn what hash sets are and how to use them in Ruby.

Listing all keys in a hash

Ruby provides a method, keys, that lists all the keys in a hash. Here is how it works:

$ pry
> hh = {}
=> {}
> hh[:red] = 'ff0000'
=> "ff0000"
> hh[:green] = '00ff00'
=> "00ff00"
> hh[:blue] = '0000ff'
=> "0000ff"
> hh.keys
=> [:red, :green, :blue]

We defined a hash and added a few key-value pairs to it. The keys are symbols, and the values are strings. These strings (ff0000, 00ff00, and 0000ff) are the conventional three-byte hexadecimal representations of RGB colors, where the first byte encodes the red (R) component, the second the green (G), and the third the blue (B).

Getting the list of hash keys isn’t often required. However, sometimes we need only the keys of a hash and have no use for the values.

Sets

A programmer is free to use a hash with arbitrary placeholder data for the values (true, for example), but there is a special hash-like data structure designed to keep only keys, without any values. The common name of this data structure is hash set. In Ruby, it’s represented by the Set class:

Note: Set implements a collection of unordered values with no duplicates.

In other words, a set is a collection of distinct items: adding an item that is already present leaves the set unchanged.
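To make the "no duplicates" property concrete, here is a minimal sketch using Ruby's Set class. The variable name letters is only an illustration, and the require line is needed on older Ruby versions and harmless on newer ones.

```ruby
require 'set'

letters = Set.new
letters << 'a'
letters << 'b'
letters << 'a' # already present, so the set does not change

puts letters.size          # prints 2
puts letters.include?('a') # prints true
puts letters.include?('z') # prints false
```

Unlike a hash, nothing is stored alongside each item: the set answers only the question "is this item present?"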

Let’s practice with a hash set a little bit more. Given an English sentence, find out whether it uses every letter of the alphabet. For example, the sentence “Quick brown fox jumps over the lazy dog” is commonly used to test typewriters, printers, fonts, and so on because it uses all the letters of the English alphabet. If we omit the first word, “quick”, we won’t find the letter “q” in the phrase “brown fox jumps over the lazy dog.”

Method example

We’ll create a method that returns true if the provided string uses all the letters of the English alphabet, and false otherwise. How should we approach this problem? It’s actually pretty straightforward. We’ll iterate over each character in a given string, and if it’s not a space, we’ll add it to a hash, whether or not it’s already there, since keys in a hash are always unique. Because a hash has no duplicate keys, we can have at most 26 records, one for each letter of the English alphabet, provided we treat uppercase and lowercase letters as the same and ignore punctuation.

But there is something that feels off in this approach. If we use a classic hash, we need to set both keys and values, and the value in this case is useless:

hh[letter] = true

true, false, 1, 0, or any string works as the value in the line above because we never check it later. We rely on the hash data structure and check its size, but we never use the values. In other words, we’re wasting memory on values we don’t need.
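For comparison, the hash-based approach described above could be sketched as follows. The method name all_letters_hash? is hypothetical, and lowercasing the input and keeping only the characters a to z are assumptions made so that capital letters and punctuation don't inflate the count.

```ruby
# Hypothetical helper: checks for all 26 letters using a plain hash.
def all_letters_hash?(str)
  hh = {}
  str.downcase.each_char do |ch|
    # Assumption: only the letters a-z count; spaces and punctuation are skipped.
    next unless ch.match?(/[a-z]/)
    hh[ch] = true # the value is a placeholder and is never read
  end
  hh.size == 26
end

puts all_letters_hash?('Quick brown fox jumps over the lazy dog') # prints true
puts all_letters_hash?('brown fox jumps over the lazy dog')       # prints false
```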

This is where a hash set data structure comes into play. Here is how our program listing looks when we use a hash set, represented by the Set class:
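A minimal sketch of such a method might look like the following. The method name all_letters? and the variable seen are assumptions for illustration, as are lowercasing the input and keeping only the characters a to z.

```ruby
require 'set'

# Hypothetical helper: returns true if str uses every letter of the
# English alphabet at least once.
def all_letters?(str)
  seen = Set.new
  str.downcase.each_char do |ch|
    # Assumption: only the letters a-z count; everything else is skipped.
    seen << ch if ch.match?(/[a-z]/)
  end
  seen.size == 26
end

puts all_letters?('Quick brown fox jumps over the lazy dog') # prints true
puts all_letters?('brown fox jumps over the lazy dog')       # prints false
```

The logic is the same as with a hash, but there is no placeholder value to invent: each letter is simply added to the set, and duplicates are ignored.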
