Design a URL Shortening Service/TinyURL
Learn to design a system similar to TinyURL.
Introduction
A
Several building blocks can be considered for the system design of TinyURL. Recognize four building blocks based on the following requisite functionalities in the design and provide your answer in the AI widget given below:
It’s necessary to:
Store the shortened URLs.
Provide unique IDs for each URL.
Store frequent URL-related requests to serve them efficiently.
Limit the number of incoming requests.
Requirements
We have categorized our requirements into the following two categories: functional and non-functional.
Functional requirements
We need to define some functionalities that we will need and some of the general requirements.
- Short URL generator: This functionality should generate a short and unique URL alias of the specified URL.
- Redirection: This functionality should redirect the short URL to the actual URL so the request can be successfully entertained.
- Customized short URL: This functionality should allow the user to generate a unique and customized short URL alias of the original URL.
- Deletion: This functionality should allow users to delete the generated short URL.
- Updation: This functionality should allow the user to update the original URL associated with it.
- Expiration time: This functionality should allow users to set the expiration time of the generated short URL.
Non-functional requirements
We have decided that our system should have the following non-functional behaviors:
- Availability: We want our system to be available at all times at any cost because we want our users to access any URL anytime.
- Scalability: We want our system to be capable enough to handle an increasing number of URLs.
- Readability: We want our system to generate short URLs that can easily be read, distinguished, and typed.
- Latency: We want our system to perform at low latency to give users a better experience.
- Unpredictability: We want our system to generate highly unpredictable and unique short URLs to increase security.
Design
Let’s discuss the main design components required for our URL shortening service. Our design depends on each part’s functionality and progressively combines them to achieve the different workflows mentioned in the functional requirements.
Components
We’ll explain the inner mechanism of different components within our system and their usage as a part of the whole system below. We’ll also highlight the design choices made for each component to achieve overall functionality.
Database: For services like URL shortening, there isn’t a lot of data to store. However, the storage has to be horizontally scalable. The types of data we need to store include:
- User details
- Mappings of the URLs (the long URLs that are mapped onto short URLs)
Since the stored records will have no relationships among themselves other than linking the URL-creating user’s details, so we don’t need structured storage for record keeping. Also, our system will be read-heavy, so NoSQL is a suitable choice for storing data. In particular, MongoDB is a good choice for the following reasons:
- It uses leader-follower protocol, making it possible to use replicas for heavy reading.
- MongoDB ensures atomicity in concurrent write operations and avoids collisions by returning a duplicate key errors for record-duplication issues.
Point to Ponder
Why are NoSQL databases like Cassandra or Riak not good choices compared to MongoDB?
Short URL generator: Our short URL generator will comprise a building block—a sequencer to generate unique IDs—and an additional component—a Base58 encoder to enhance the readability of the short URL.
We built a sequencer in the elementary design problems to generate 64-bit unique numeric IDs. However, our proposed design requires 64-bit alphanumeric short URLs in Base58. To convert the numeric (Base10) IDs to alphanumeric (Base58), we’ll need a Base10 for the Base58 encoder.
Look at the diagram below to understand how the short URL generation unit will work.
Besides the elements mentioned above, we’ll incorporate other elements, like load balancers, cache, and rate limiters.
-
Load balancing: We can employ Global Server Load Balancing (GSLB) apart from local load balancing to improve availability. Since we have plenty of time between a short URL being generated and subsequently accessed, we can safely assume that our database is geographically consistent and that distributing requests globally won’t cause any issues.
-
Cache: For our specific read-intensive design problem, Memcached is the best choice for a cache solution. We require a simple, horizontally scalable cache system with minimal data structure requirements. Moreover, we’ll have a datacenter-specific caching layer to handle native requests. Having a global caching layer will result in higher latency.
-
Rate limiter: Limiting each user’s quota is preferable for adding a security layer to our system. We can achieve this by uniquely identifying users through their unique
api_dev_key
. Considering our system’s simplicity and requirements, the fixed window counter algorithm would serve the purpose. We can assign a set number of shortening and redirection operations perapi_dev_key
for a specific timeframe.
Points to Ponder
How will we maintain a unique mapping if redirection requests can go to different datacenters that are geographically apart? Does our design assume that our database is consistent geographically?
Design diagram
A simple design diagram of the URL shortening system is given below.
Workflow
Let’s analyze the system in depth and how the individual pieces fit together to provide the overall functionality.
Considering the functional requirements, the workflow of the abstract design above would be as follows.
- Shortening: Each new request for short link computation gets forwarded to the short URL generator (SUG) by the application server. Upon successful generation of the short link, the system sends one copy back to the user and stores the record in the database for future use.
Points to Ponder
How does our system avoid duplicate short URL generation?
- Redirection: Application servers, upon receiving the redirection requests, check the storage units (caching system and database) for the required record. If found, the application server redirects the user to the associated long URL.
Point to Ponder
How does our system ensure that our data store will not be a bottleneck?
-
Deletion: A logged-in user can delete a record by requesting the application server, which forwards the user details and the associated URL’s information to the database server for deletion. A system-initiated deletion can also be triggered upon an expiration time, as we’ll see ahead.
-
Custom short links: This task begins with checking the eligibility of the requested short URL. The maximum length allowed is 11 alphanumeric digits. Once verified, the system checks its availability in the database. If the requested URL is available, the user receives a successful short URL generation message. They receive an error message in the opposite case.
The illustration below depicts how URL shortening, redirection, and deletion work.
Point to Ponder
Upon successful allocation of a custom short URL, how does the system modify its records?
Encoding and decoding process
We've discussed the overall design of a short URL generator (SUG) in detail, but two aspects need more clarification:
How does encoding improve the readability of the short URL?
How are the sequencer and the Base58 encoder in the short URL generation related?
Why to use encoding
Our sequencer generates a 64-bit ID in Base10, which can be converted to a Base64 short URL. Base64 is the most common encoding for ...