Put Back-of-the-Envelope Numbers in Perspective

Using realistic numbers in back-of-the-envelope calculations.

A distributed system has compute nodes connected via a network. There is a wide variety of available compute nodes, and they can be connected in many different ways. Back-of-the-envelope calculations help us ignore the nitty-gritty details of the system (at least at the design level) and focus on its more important aspects.

Some examples of back-of-the-envelope calculations are:

  • Number of concurrent TCP connections a server can support
  • Latency of a cross-continental link connecting two data centers

Choosing unrealistic numbers for such calculations can irk the interviewer and send the wrong message: that we are not aware of how real systems actually perform.
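To see why realistic numbers matter, we can sanity-check the second example above against physics. The sketch below is only an illustration: the ~8,800 km California-to-Netherlands distance and the ~200,000 km/s speed of light in fiber are assumed values, not numbers from this lesson.

```python
# Hypothetical sanity check for a cross-continental round trip.
# Assumptions (for illustration only): ~8,800 km great-circle distance between
# California and the Netherlands, and light propagating through optical fiber
# at roughly 200,000 km/s (about two-thirds of its speed in a vacuum).

DISTANCE_KM = 8_800
FIBER_SPEED_KM_PER_S = 200_000

one_way_ms = DISTANCE_KM / FIBER_SPEED_KM_PER_S * 1000
round_trip_ms = 2 * one_way_ms

print(f"Theoretical minimum one-way latency: ~{one_way_ms:.0f} ms")    # ~44 ms
print(f"Theoretical minimum round trip:      ~{round_trip_ms:.0f} ms") # ~88 ms
```

A measured round trip of ~150 ms (see the latency table below) is therefore plausible: real routes are longer than great-circle paths and add switching and queuing delays.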

Overview of the system

Let’s consider an example of an online classifieds platform similar to OLX, the world’s leading classifieds platform in growth markets, available in more than 40 countries and over 50 languages.

Our platform will allow people to buy, sell, or exchange a wide variety of used goods and services by letting them post a listing from their mobile phone or on the web in a fast and convenient manner.

Server requirements

Before we dig deep into the server requirements and assess different design options, let’s first look at some numbers to get an idea of the time taken by some of the routine computer operations.

The following tables show typical latency and throughput numbers, as well as server capabilities, to guide our back-of-the-envelope calculations.

Note: Some of these numbers may be a little outdated given the advancements in modern-day computers. However, they are enough to give us a sense of the time taken by routine computer operations.

Latency Numbers

| Component | Time (ns) | Comment |
| --- | --- | --- |
| L1 cache reference | 0.5 | |
| Branch mispredict | 5 | |
| L2 cache reference | 7 | 14x L1 cache |
| Mutex lock/unlock | 25 | |
| Main memory reference | 100 | 20x L2 cache, 200x L1 cache |
| Compress 1K bytes with Snzip | 3,000 (3 µs) | |
| Send 1K bytes over 1 Gbps network | 10,000 (10 µs) | |
| Read 4K randomly from SSD (~1 GB/s SSD) | 150,000 (150 µs) | |
| Read 1 MB sequentially from memory | 250,000 (250 µs) | |
| Round trip within same datacenter | 500,000 (500 µs) | |
| Read 1 MB sequentially from SSD (~1 GB/s SSD) | 1,000,000 (1 ms) | 4x memory |
| Disk seek | 10,000,000 (10 ms) | 20x datacenter round trip |
| Read 1 MB sequentially from disk | 20,000,000 (20 ms) | 80x memory, 20x SSD |
| Send packet CA -> Netherlands -> CA | 150,000,000 (150 ms) | |
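To put these numbers to work, here is a minimal sketch that estimates how long it takes to read 1 GB sequentially from memory, SSD, and disk, using only the per-MB figures from the table above.

```python
# Per-MB sequential read times taken directly from the latency table above.
READ_1MB_MS = {
    "memory": 0.25,  # 250 microseconds per MB
    "SSD": 1.0,      # 1 ms per MB
    "disk": 20.0,    # 20 ms per MB
}

SIZE_MB = 1024  # 1 GB

for medium, per_mb_ms in READ_1MB_MS.items():
    total_s = SIZE_MB * per_mb_ms / 1000
    print(f"Read 1 GB sequentially from {medium}: ~{total_s:.2f} s")
```

This prints roughly 0.26 s for memory, 1.02 s for SSD, and 20.48 s for disk, matching the table's 80x-memory and 20x-SSD comments for disk reads.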


Throughput Numbers


| Link | Throughput (Mbps) | Comment |
| --- | --- | --- |
| Across virtual machines on the same physical server | 0.5 | |
| Out of local server | 5 | |
| Between two systems on the same rack | 7 | |
| Between two systems on different racks | 25 | |
| Typical cross-sectional bandwidth | 10,000 (10 Gbps) | 40 to 100 Gbps becoming common; Google's Jupiter-fabric-based network can support 1 Petabit/sec of bisection bandwidth |
| Typical aggregate bandwidth between two datacenters | 10,000 (10 Gbps) and onwards | Nearby datacenters usually have higher bandwidth compared to long-haul links |
| SSD read/write | | |
| Disk read/write | | |
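Using the aggregate inter-datacenter bandwidth from the table above, we can estimate how long a bulk copy between datacenters takes. The 1 TB payload below is an assumed size chosen for illustration.

```python
# Estimate bulk-transfer time between datacenters over a 10 Gbps link
# (the aggregate bandwidth from the table above). The 1 TB payload size
# is an assumption for illustration.

LINK_GBPS = 10
PAYLOAD_TB = 1

payload_bits = PAYLOAD_TB * 8 * 10**12        # terabytes -> bits
seconds = payload_bits / (LINK_GBPS * 10**9)  # bits / bits-per-second

print(f"Copying {PAYLOAD_TB} TB over a {LINK_GBPS} Gbps link takes "
      f"~{seconds:.0f} s (~{seconds / 60:.1f} minutes) at full utilization.")
```

This comes out to about 800 seconds (roughly 13 minutes), ignoring protocol overhead and contention from other traffic.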



Server Capabilities

| Component | Count | Comment |
| --- | --- | --- |
| Number of processors | 2 | |
| Number of cores | 160 | |
| RAM | Up to 8 TB | |
| Number of disks | Up to 24 | |
| Network interfaces | LOM: OCP 3.0 NIC, PCIe x16 Gen4; Management: 1x 1 GbE from BMC to rear RJ45 connector | |

Note: Lights-out management (LOM) is a form of out-of-band management that enables system administrators to monitor and maintain servers and other data center infrastructure by connecting to them remotely.

Number of servers required

Let’s assume we have a total of 350M active users and that a single user makes 20 requests per day on average.

  • This gives us a total of 350M * 20 = 7B requests per day, or 7B / 86,400 ≈ 81K requests per second, which is our queries per second (QPS).
  • Assuming an average request takes 50 milliseconds of compute time on a single core, each core will be able to process 20 requests per second. Since our server has 160 cores, it will be able to process 160 * 20 = 3,200 requests per second.
  • Putting these numbers together, we will need approximately 81K / 3,200 ≈ 26 servers to handle all requests (a quick sketch of this arithmetic follows the list).
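The following sketch restates the arithmetic above in code, using the same assumptions (350M active users, 20 requests per user per day, 50 ms of single-core compute per request, 160 cores per server).

```python
import math

# Inputs from the assumptions stated above.
ACTIVE_USERS = 350_000_000
REQUESTS_PER_USER_PER_DAY = 20
COMPUTE_MS_PER_REQUEST = 50
CORES_PER_SERVER = 160
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_requests = ACTIVE_USERS * REQUESTS_PER_USER_PER_DAY   # 7 billion
qps = daily_requests / SECONDS_PER_DAY                      # ~81K

requests_per_core = 1000 / COMPUTE_MS_PER_REQUEST           # 20 requests/sec
requests_per_server = requests_per_core * CORES_PER_SERVER  # 3,200 requests/sec

servers_needed = math.ceil(qps / requests_per_server)       # round up
print(f"QPS: ~{qps:,.0f}, servers needed: ~{servers_needed}")
```

This prints roughly 81K QPS and 26 servers. In practice we would provision extra capacity for traffic peaks, failures, and rollouts.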

Cache

Caching refers to storing data so that future requests for that data can be served faster. The cached data may be a copy of data stored elsewhere or the result of an earlier operation.
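As a quick illustration of the effect (the 80% hit ratio below is an assumption for illustration, not a number from this lesson), a cache in front of our servers reduces the load that reaches the backend in proportion to its hit ratio:

```python
# How a cache reduces backend load. The hit ratio is an assumed value.
TOTAL_QPS = 81_000       # system-wide QPS from the estimate above
ASSUMED_HIT_RATIO = 0.8  # fraction of requests served from cache

backend_qps = TOTAL_QPS * (1 - ASSUMED_HIT_RATIO)
print(f"Backend QPS at an {ASSUMED_HIT_RATIO:.0%} hit ratio: ~{backend_qps:,.0f}")
```

At an 80% hit ratio, only about 16,200 requests per second reach the backend servers, shrinking the server estimate above accordingly.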

For our system, we can use a number of ...
