Design the Data Model
Learn how to design the data model.
The data model follows the API design. A good place to start is with one table per resource. In this case, we would have a table that lists the collection of games.
Games data model
Each game would also have its own properties to capture the game state. This might all fit into one table, called games
, that has the following fields:
Field Name | Description |
---|---|
game_id |
A unique, non-guessable, ID for each game |
user_name |
The name of the user as submitted |
language |
The language that the game is using |
secret_word |
The secret word the user is trying to guess |
guessed |
Letters guessed so far by the user |
usage |
Usage example for the secret word |
reveal_word |
Blanks and revealed letters so far |
bad_guesses |
Number of incorrect guesses so far |
These fields contain enough information to track each game and its current state.
Trimming down the model
For each field, we can ask whether it is necessary and sufficient to create a working API. That is:
- Can the API respond without it?
- Does it store enough information so the server can compute the response to the client?
Clearly, the game_id
is needed. However, the user_name
and language
are, at this point, not used after the game is created. While they could be useful, there was nowhere in the story where the client needed to receive them. The secret_word
and guessed
are essential to a game since the server must determine whether guesses have been made before or not and whether they are correct. The usage
example is required only at the start of the game, but probably not thereafter. The reveal_word
and bad_guesses
fields could be computed from the other fields when needed. Accordingly, a minimal data model would look like this:
Field Name | Description |
---|---|
game_id |
A unique, non-guessable, ID for each game |
secret_word |
The secret word the user is trying to guess |
guessed |
Letters guessed so far by the user |
On the other hand, we could also wonder what additional information might be useful. In many websites, additional tracking information is later used to compute site metrics, like time on site. This would require additional fields that record when the user began the game, when the various guesses were made, and when the game ended. We might also capture an IP address for later use for geolocation, which can help us determine where our application users are coming from. We can also track the total number of games started and completed by a given user.
Calculating space required in the database
At the root of the data model design is a basic question. How copiously will we record API events? Are we tracking the minimal information or are we attempting to track every transaction? The main downside to tracking more is that we must write the code to do it, which must be tested, maintained, and documented in turn, and it takes up space in the database. For this second issue, a back-of-the-envelope calculation is often useful. How much storage does each record require, and how many of them will there be? We’re just looking for order-of-magnitude calculations here. As a rule of thumb, a UUID is about 40 bytes and a timestamp is around 20 bytes. We assume 10 letters are guessed in an average game. Adding up the pieces, we find it’s about half a kilobyte per record. Generally, numbers and text will not take up much room, but images, sounds, and video will.
We’ll be serving our application using Heroku’s free tier option. The main alternative to Heroku, Amazon’s RDS for PostgreSQL, is comparatively priced and more difficult to set up. Heroku’s free tier option is limited not by total storage but by the number of rows. The Hobby Dev Tier limits us to only 10,000 rows. There are other options that are available at different prices, with more storage according to your needs. For example, we would require more storage if we’re storing any images, audio, or video.
With the 10,000 row limit, we may want to try to pack everything into as few rows as possible. For example, we could have one row per user and store their latest several games across multiple columns. If our average user has played 20 games, this would allow us to have maybe 20 times as many games stored in our database. Or, we could just delete all but the latest game and get a similar effect, without creating a data model that would require overhauling when we expand to more storage. In general, if we’re planning on growing our application, we should avoid any data model contortions driven by fitting into the smallest number of rows that will be difficult to overcome later.
One place where we can save space is with read-only data, which we can store as files in our repository, rather than as records in the database. We will do this with the Lang-Man usage examples.
Get hands-on with 1200+ tech skills courses.