Home/Blog/System Design/Design TinyURL and Instagram: System design interview tutorial

Design TinyURL and Instagram: System design interview tutorial

Aaron Xie
15 min read

Note: This post was originally published in 2020 and has been updated as of Nov. 15, 2021.

Understanding system design is more important than ever in today’s tech industry. The larger an application gets, the more reliable and efficient architectural patterns need to be. System design is likely a requirement to land your dream software engineering job.

Today, we’ll apply system design principles by approaching two common system design interview questions: designing TinyURL and Instagram. You’re in the right place if you’re here to prepare for a system design interview.

Master system design beyond the interview

Our comprehensive system design course covers these design problems – and much more.

Grokking Modern System Design



How to design TinyURL#

What is TinyURL#

TinyURL is a URL shortening web service that creates shorter aliases for long URLs. Users select these shortened URLs and are redirected to the original URL. This service is useful because short links save space and allow users to easily type abbreviated URLs instead of long ones.

System requirements and goals#

Let’s consider some functional and non-functional requirements for designing TinyURL.

Non-functional requirements:

  • The system must be highly available. The short links will not function if the service fails.

  • URL redirection should happen in real-time with minimal latency.

  • Shortened links should not be predictable in any manner.

Functional requirements:

  • Our service will generate a shorter alias of the original URL that can easily be copied and pasted.

  • The short link should redirect users to the original link.

  • Users should have the option to pick a custom short link for their URL.

  • Short links will expire after a default timespan, which users can specify.


Capacity estimation and constraints#

Our system will be read-heavy. There will be a large number of redirection requests compared to new URL shortenings. Let’s assume a 100:1 read/write ratio. Our reads are redirection requests and writes are new URL shortenings.

Traffic estimates: Let’s assume we have 500 million new URL shortenings per month. Based on our 100:1 read/write ratio, we can expect 50 billion redirections during the same period:

100 * 500 million = 50 billion

We’ll now determine our system’s queries per second (QPS). Taking the monthly total of 500 million, the number of new URL shortenings per second will be:

500 million / (30 days * 24 hours * 3,600 seconds) = ~200 URLs/second

Applying the 100:1 read/write ratio again, URL redirections per second will be:

100 * 200 URLs/second = 20,000/second

Storage estimates: Let’s assume we store every URL shortening request and its associated link for five years. Since we expect to have 500 million new URLs every month, the total number of objects we expect to store over five years will be 30 billion:

500 million * 12 months * 5 years = 30 billion

For now, let’s assume that each stored object will be approximately 500 bytes. This means we’ll need 15TB of total storage for the five year period:

30 billion * 500 bytes = 15 TB

Bandwidth estimates: We expect 200 new URLs per second for write requests. This makes our service’s total incoming data 100KB per second:

200 * 500 bytes = 100 KB/s

For read requests, we expect approximately 20,000 URL redirections every second. Then total outgoing data for our service would be 10MB per second:

20,000 * 500 bytes = ~10 MB/s

Memory estimates: We’ll need to determine how much memory we’ll need to store the frequently accessed hot URLs’ cache. If we follow the 80-20 rule, then 20% of URLs will generate 80% of traffic. We would like to cache this 20% of hot URLs.

Since we have 20,000 requests per second, we will be getting 1.7 billion requests per day:

20,000 * 3,600 seconds * 24 hours = ~1.7 billion

To cache 20% of these requests, we will need 170GB of memory.

0.2 * 1.7 billion * 500 bytes = ~170 GB

It’s worth noting that there will be a lot of duplicate requests for the same URL. This means our actual memory usage will be less than 170GB.
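The estimates above can be double-checked with a few lines of arithmetic. This is just the article's assumed numbers (500 million URLs/month, 100:1 read/write ratio, 500-byte objects) expressed as code:

```python
# Back-of-envelope estimates for TinyURL, using the article's assumptions.
NEW_URLS_PER_MONTH = 500_000_000
READ_WRITE_RATIO = 100
OBJECT_SIZE_BYTES = 500
SECONDS_PER_MONTH = 30 * 24 * 3_600

writes_per_sec = NEW_URLS_PER_MONTH / SECONDS_PER_MONTH   # ~193; rounded to ~200 above
reads_per_sec = READ_WRITE_RATIO * 200                    # 20,000 redirections/second

total_objects_5y = NEW_URLS_PER_MONTH * 12 * 5            # 30 billion objects
storage_5y_bytes = total_objects_5y * OBJECT_SIZE_BYTES   # 15 TB

write_bandwidth = 200 * OBJECT_SIZE_BYTES                 # 100 KB/s incoming
read_bandwidth = reads_per_sec * OBJECT_SIZE_BYTES        # 10 MB/s outgoing

daily_requests = reads_per_sec * 3_600 * 24               # ~1.7 billion requests/day
cache_bytes = 0.2 * daily_requests * OBJECT_SIZE_BYTES    # ~170 GB (172.8 exactly)
```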


System APIs#

We can use REST APIs to expose our service’s functionality. This example shows the API definitions for creating and deleting URLs with our service:

createURL(api_dev_key, original_url, custom_alias=None, user_name=None, expire_date=None)

Parameters:

api_dev_key (string): A registered account’s API developer key. This will be used to throttle users based on their allocated quota.

original_url (string): Original URL to be shortened.

custom_alias (string): Optional custom key for the URL.

user_name (string): Optional user name to be used in the encoding.

expire_date (string): Optional expiration date for the shortened URL.

Returns (string): A successful insertion returns the shortened URL. Otherwise, it returns an error code.

deleteURL(api_dev_key, url_key)

The url_key is a string that identifies the shortened URL to be deleted. A successful deletion returns URL Removed.
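A minimal in-memory sketch of these two calls might look like the following. The function names mirror the API definitions above, but the storage dict, the quota table, and the fallback key generation are illustrative stand-ins for real components:

```python
# Toy in-memory implementation of the createURL/deleteURL API sketch.
import secrets

_store = {}                    # url_key -> original_url (stand-in for the database)
_quota = {"dev123": 1000}      # per-developer request quota (assumed numbers)

def createURL(api_dev_key, original_url, custom_alias=None,
              user_name=None, expire_date=None):
    if _quota.get(api_dev_key, 0) <= 0:
        return "ERROR: quota exceeded"
    _quota[api_dev_key] -= 1
    # Use the custom alias if given; otherwise generate a short random key.
    key = custom_alias or secrets.token_urlsafe(4)[:6]
    _store[key] = original_url
    return key

def deleteURL(api_dev_key, url_key):
    if url_key in _store:
        del _store[url_key]
        return "URL Removed"
    return "ERROR: not found"
```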




Database design#

Let’s look at some observations about the data we are storing:

  • The service needs to store billions of records.

  • The service is read-heavy.

  • Each object is small (less than 1,000 bytes).

  • There are no relationships between each record, except for storing which user created the short link.

Schema:

We’ll now consider database schemas. We need a table for storing information on URL mappings, as well as a table for data on the users who created short links.

What type of database do we use?

The best choice would be to use a NoSQL database store like DynamoDB or Cassandra since we are storing billions of rows with no relationships between the objects.
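The two tables can be modeled roughly as follows. The exact field names here are illustrative assumptions rather than the article's schema:

```python
# Rough model of the two tables described above; field names are illustrative.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class URLMapping:
    url_hash: str              # the short-link key
    original_url: str
    creation_date: datetime
    expiration_date: datetime
    user_id: int               # the only cross-record relationship

@dataclass
class User:
    user_id: int
    name: str
    email: str
    creation_date: datetime
```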

Basic system design and algorithm#

Our service’s most crucial concern is how we’ll generate a short and unique key when given a URL. For this example, we’ll go about this by encoding the URL.

We can compute a given URL’s hash using an algorithm such as MD5 or SHA256. The hash can then be encoded for display. This encoding could be base36, base62, or base64.

We’ll use base64 encoding. Now we’ll consider whether the short key should be six, eight, or 10 characters long.

  • Using base64 encoding, a six-character key would yield 64^6 = ~68.7 billion possible strings.

  • Using base64 encoding, an eight-character key would result in 64^8 = ~281 trillion possible strings.

Let’s assume the six-character key is sufficient for our system. Our hash function will produce a 128-bit hash value if we use the MD5 algorithm. Each base64 character encodes six bits of the hash value, so encoding yields a string of more than 21 characters. We’ll need to take a different approach for choosing our key because we only have space for six characters per short key.

We can take the first six (or eight) letters for the key. We’ll need to take steps to avoid key duplication. We might swap some characters or choose characters not already in the encoding string.

There are some potential obstacles when taking this approach:

  • If multiple users enter the same URL, they will get the same short link.

  • Parts of the URL can be URL-encoded.

Workarounds: One solution to the aforementioned problems would be appending a number from a sequence to each short link URL. This would make each link unique, even if multiple users provide the same URL. Another potential solution is to append the user ID to the input URL. However, this would only work if the user was signed in. We would also need to generate a uniqueness key.
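The encoding approach plus the sequence-number workaround can be sketched in a few lines. The in-process counter here is a stand-in for whatever sequence source a real deployment would use:

```python
# Sketch: MD5-hash the URL, base64-encode, keep the first six characters.
# Appending a sequence number keeps duplicate URLs from colliding.
import base64
import hashlib
from itertools import count

_sequence = count()   # stand-in for a real, durable sequence source

def short_key(original_url: str, length: int = 6) -> str:
    salted = f"{original_url}|{next(_sequence)}"
    digest = hashlib.md5(salted.encode()).digest()        # 128-bit hash
    encoded = base64.urlsafe_b64encode(digest).decode()   # 22+ characters
    return encoded[:length]
```

Because the sequence number changes on every call, two users shortening the same URL receive different keys.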

Data partitioning and replication#

Our database will store information about billions of URLs. We’ll need to partition it to make it scalable.

  • Range-based partitioning: We can store the URLs in separate partitions based on the first letter of its hash key. So, we save all the URLs with the first letter of their hash key being A in one partition, and so on.

This approach is problematic because it leads to unbalanced database servers. This in turn will create unequal load.

  • Hash-based partitioning: We can take the hash of a stored object and calculate which partition to use. The hashing function will randomly distribute data into different partitions.

This approach sometimes leads to overloaded partitions. If it does, you could use consistent hashing to resolve it.
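Hash-based partitioning amounts to hashing the key and taking a modulus. A minimal sketch, where the partition count and the choice of SHA256 are assumptions for illustration:

```python
# Minimal hash-based partitioner: map a short-link key to a partition number.
import hashlib

NUM_PARTITIONS = 16   # assumed partition count

def partition_for(url_key: str) -> int:
    # A stable hash (not Python's per-process randomized hash()) keeps the
    # key-to-partition mapping consistent across servers and restarts.
    digest = hashlib.sha256(url_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS
```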

Cache#

Our service should be able to cache URLs that are frequently accessed. We could use a solution like Memcached to store full URLs with respective hashes.

Cache memory requirements: We can start with around 20% of the daily traffic and adjust based on usage patterns. Our previous estimations tell us we’ll need 170GB of memory to cache 20% of the daily traffic.

Cache eviction policy: We want to replace a link with a more popular URL. We can use the Least Recently Used (LRU) policy for our system.
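The LRU policy can be sketched with an OrderedDict; in production we'd rely on Memcached's built-in eviction rather than rolling our own, so this is purely to show the mechanics:

```python
# Small LRU cache: evicts the least recently used entry when full.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)          # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict least recently used
```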

Load balancing#

We can add a load balancing layer to our system in three places:

  • Between clients and application servers

  • Between application servers and database servers

  • Between application servers and cache servers

We can start with a Round Robin approach to equally distribute incoming requests among servers. This is easy to implement and does not create any overhead.

It’s important to note that the load balancer won’t stop sending new requests to servers that become overloaded when using this approach. To avoid overload, a more intelligent load balancer solution can be developed to periodically adjust traffic based on load per server.
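The Round Robin starting point is simple enough to show directly; the server names are placeholders:

```python
# Round-robin dispatcher: hands out servers in a fixed rotation.
from itertools import cycle

class RoundRobinBalancer:
    def __init__(self, servers):
        self._servers = cycle(servers)

    def next_server(self):
        return next(self._servers)
```

As noted above, this rotation ignores per-server load, which is why a smarter balancer would track server health before dispatching.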



Designing Instagram#

What is Instagram?#

Instagram is a social media platform that allows users to share photos and videos with other users. Instagram users can share information either publicly or privately.

We’ll design a simple version of Instagram that enables users to share photos, follow each other, and access the news feed. Users will see the top photos of the people they follow on their news feed.

System requirements and goals#

Let’s consider some functional and non-functional requirements for designing Instagram.

Non-functional requirements:

  • The service should be highly available.

  • The system’s acceptable latency should be around 200 milliseconds for the news feed.

  • The system should be so reliable that photos and videos are never lost.

Functional requirements:

  • The user should be able to search based on photo or video titles.

  • The user should be able to upload, download, and view photos and videos.

  • The user should be able to follow other users.

  • The system should be able to generate a displayed news feed consisting of the top photos and videos of the people that the user follows.

Functions that are not within the scope of this project are adding tags to photos, searching for photos with tags, commenting on photos, and tagging users.



Capacity estimation and constraints#

We will base our approach on the following numbers:

  • 500 million total users

  • 1 million daily active users

  • 2 million new photos every day (at a rate of 23 new photos/second)

  • Average photo file size of 200KB

  • Total space required for 1 day of photos:

2 million * 200 KB => 400 GB

  • Total space required for 10 years:

400 GB * 365 days * 10 years = ~1,425 TB
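These estimates reduce to simple arithmetic. The figures below use decimal units; the article's ~1,425 TB figure reflects its own rounding:

```python
# Back-of-envelope storage estimates for Instagram, per the assumptions above.
NEW_PHOTOS_PER_DAY = 2_000_000
PHOTO_SIZE_BYTES = 200 * 1_000        # 200 KB average

photos_per_second = NEW_PHOTOS_PER_DAY / 86_400          # ~23 photos/second
daily_bytes = NEW_PHOTOS_PER_DAY * PHOTO_SIZE_BYTES      # 400 GB/day
ten_year_bytes = daily_bytes * 365 * 10                  # ~1,400+ TB over 10 years
```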


High-level system design#

The system should be able to support users as they upload and view each other’s media. This means our service needs servers to store media as well as another database server to store the media’s metadata.

Database schema#

We could take a straightforward approach by storing this schema in a Relational Database Management System (RDBMS). However, challenges often arise when using a relational database for a scaling application.

We can store the above schema with key-value pairs utilizing a NoSQL database. The metadata for the photos and videos will belong to a table where the key would be the PhotoID, and the value would be an object containing PhotoLocation, UserLocation, CreationTimestamp, and so on.

We can use a wide-column datastore, Cassandra, to store the relationships between users and photos as well as a list of a user’s followed accounts. The UserPhoto table’s key would be UserID. The value would be the list of PhotoIDs the user owns. These will be stored in different columns. This pattern will be similar to the UserFollow table.

The photos and videos can be stored in a distributed file storage like HDFS.
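A toy in-memory picture of this layout helps make the key-value shape concrete. The specific fields and IDs below are illustrative, not a real Cassandra schema:

```python
# Toy model of the wide-column layout described above:
# metadata keyed by PhotoID; UserPhoto and UserFollow keyed by UserID.
photo_metadata = {
    101: {"PhotoPath": "/hdfs/photos/101.jpg", "UserID": 1,
          "CreationTimestamp": "2021-11-15T10:00:00Z"},
}
user_photos = {1: [101]}       # UserPhoto: UserID -> list of owned PhotoIDs
user_follows = {1: [2, 3]}     # UserFollow: UserID -> list of followed UserIDs

def photos_for(user_id):
    """Return metadata for all photos a user owns."""
    return [photo_metadata[p] for p in user_photos.get(user_id, [])]
```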


Data size estimation#

Let’s estimate how much data will go into each table and how much total storage we’ll need for 10 years.

User: Each row in this table will be 68 bytes, assuming each int and dateTime is four bytes:

UserID (4 bytes) + Name (20 bytes) + Email (32 bytes) + DateOfBirth (4 bytes) + CreationDate (4 bytes) + LastLogin (4 bytes) = 68 bytes

If we have 500 million users, we’ll need 32GB of total storage:

500 million * 68 bytes ~= 32 GB

Photo: Each row in this table will be 284 bytes:

PhotoID (4 bytes) + UserID (4 bytes) + PhotoPath (256 bytes) + PhotoLatitude (4 bytes) + PhotoLongitude (4 bytes) + UserLatitude (4 bytes) + UserLongitude (4 bytes) + CreationDate (4 bytes) = 284 bytes

If 2 million new photos get uploaded every day, we’ll need 0.5GB of storage for one day:

2 million * 284 bytes ~= 0.5 GB/day

For 10 years we will need 1.88TB of storage.

UserFollow: Each row in this table will be eight bytes. We have 500 million users. If each user follows an average of 500 other users, we’ll need 1.82TB of storage for the UserFollow table:

500 million users * 500 followers * 8 bytes ~= 1.82 TB

Total space required for all tables for 10 years will be 3.7TB:

32 GB + 1.88 TB + 1.82 TB ~= 3.7 TB
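The per-table arithmetic above, collected in one place. These are raw decimal-byte products; the GB/TB figures in the text reflect the article's rounding:

```python
# Per-table data size estimates for 10 years, per the assumptions above.
USERS = 500_000_000
USER_ROW_BYTES = 68
PHOTO_ROW_BYTES = 284
PHOTOS_PER_DAY = 2_000_000
FOLLOWS_PER_USER = 500
FOLLOW_ROW_BYTES = 8

user_table = USERS * USER_ROW_BYTES                          # ~32 GB
photo_table = PHOTOS_PER_DAY * PHOTO_ROW_BYTES * 365 * 10    # ~1.88 TB
follow_table = USERS * FOLLOWS_PER_USER * FOLLOW_ROW_BYTES   # ~1.82 TB
total = user_table + photo_table + follow_table              # ~3.7 TB
```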


Component design#

Photo uploads are often slower than reads because they write to disk. Slow uploads can consume all the available connections, which means reads can’t be served while the system is flooded with write requests. To handle this bottleneck, we can split reads and writes onto separate servers so that the system is not overloaded.

This will allow us to optimize each operation efficiently.

Reliability and redundancy#

Creating redundancy in the system allows us to create a backup in the midst of a system failure. We can’t lose any files and need a highly reliable application.

We’ll achieve this by storing multiple copies of each photo and video, so the system can retrieve media from a replica in the event that a server dies. We’ll apply this design to the other components of our architecture. With multiple copies of each service running, the system will keep working even if one instance dies.

Data sharding#

One possible scheme for sharding the metadata is to partition based on PhotoID.

With this approach, we generate globally unique PhotoIDs and then find a shard number through PhotoID % 10. We would not need to append a ShardID to the PhotoID in this case because the PhotoID is already unique throughout the system.
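Shard selection is a one-liner; 10 shards matches the modulus used above:

```python
# Map a globally unique PhotoID to its metadata shard.
NUM_SHARDS = 10

def shard_for(photo_id: int) -> int:
    return photo_id % NUM_SHARDS
```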

Generating PhotoIDs#

We won’t be able to define PhotoID by having an auto-incrementing sequence in each shard because we need to know the PhotoID first in order to find the shard where it will be stored. One solution could be to dedicate a separate database instance to generating auto-incrementing IDs.

If our PhotoID can fit into 64 bits, we can define a table containing only a 64 bit ID field. Whenever we want to add a photo in our system, we can insert a new row in this table and take that ID as the PhotoID of our new photo.

Ensuring reliability#

This key-generating database could be a single point of failure. A workaround for that could be defining two such databases with one generating even-numbered IDs and the other odd-numbered. For MySQL, the following script can define such sequences:

KeyGeneratingServer1:
auto-increment-increment = 2
auto-increment-offset = 1

KeyGeneratingServer2:
auto-increment-increment = 2
auto-increment-offset = 2

We can put a load balancer in front of both of these databases. We can Round Robin between them to deal with downtime. One of these servers could generate more keys than the other. Luckily, if they are out of sync in this way, it won’t cause an issue for our system. We can extend this design by defining separate ID tables for Users, Photo-Comments, or other objects present in our system.
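The even/odd scheme and the Round Robin front end can be mimicked in a few lines. This toy version uses in-process counters in place of the two MySQL instances:

```python
# Toy version of the two key-generating servers: one hands out odd IDs,
# the other even, so they can never collide; a round-robin front end
# alternates between them.
from itertools import count, cycle

server1 = count(start=1, step=2)   # offset 1, increment 2 -> 1, 3, 5, ...
server2 = count(start=2, step=2)   # offset 2, increment 2 -> 2, 4, 6, ...
balancer = cycle([server1, server2])

def next_photo_id() -> int:
    return next(next(balancer))
```

Even if one "server" gets ahead of the other, the two sequences stay disjoint, which is why being out of sync is harmless.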

Load balancing#

The service would need a large-scale photo delivery system to serve data to users globally. We could push the content closer to users using cache servers that are geographically distributed.


Wrapping up and resources#

Well done! You should now have a good idea of how to design an application like Instagram and TinyURL. There are still many more system design patterns to learn.

To help you master system design for your interview and beyond, we’ve created Grokking Modern System Design for Software Engineers & Managers. This course dives deep into other design problems, such as Facebook Messenger, Twitter, and more. Going beyond your interview, it’s a comprehensive course covering the fundamentals of system design for developers and engineering managers alike.

Happy learning!

