Latency and Throughput
Learn about the basic characteristics of a network that help us meet latency and throughput goals.
Introduction
We often use high-level abstractions like sockets, message passing, remote procedure calls, or even more advanced constructs behind API calls to simplify the complexities involved in network communication. At the same time, applications have different needs in terms of user-perceived latency and throughput. For example, if we have a maximum budget of 500 ms for an API call, and the call needs to cross continental boundaries, the network alone consumes about 150 to 250 ms of the available 500 ms. That leaves as little as 250 ms for the actual processing on the server side and any small amount of processing on the client side.
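As a back-of-the-envelope check of that budget, here is a minimal sketch in Python. The 500 ms budget and the 150 to 250 ms cross-continental range come from the example above; the function and constant names are illustrative, not a real API:

```python
# Back-of-the-envelope latency budget from the example above.
# All figures are in milliseconds; names are illustrative.

TOTAL_BUDGET_MS = 500        # maximum user-perceived latency for the API call
NETWORK_MS = (150, 250)      # assumed cross-continental network latency range

def remaining_processing_budget(total_ms: int, network_ms: tuple[int, int]) -> tuple[int, int]:
    """Return (worst_case, best_case) time left for server- and client-side work."""
    lo, hi = network_ms
    return total_ms - hi, total_ms - lo

worst, best = remaining_processing_budget(TOTAL_BUDGET_MS, NETWORK_MS)
print(f"Processing budget: {worst}-{best} ms")   # Processing budget: 250-350 ms
```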
Knowing what constitutes the latency enhances our understanding, equips us to write better API service-level agreements, and points to possible ways of reducing the latency. Similarly, what looks like a single, simple message at the application layer might require multiple round trips between the client and the server (and vice versa), and each round trip adds to the user-perceived latency. Additionally, fetching multiple objects under certain versions of a protocol might use independent TCP connections, incurring connection setup and teardown costs for each object. Name resolution using DNS can add yet another round trip before any data is exchanged.
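To see how these round trips accumulate, consider a minimal sketch that models total latency as a multiple of the round-trip time (RTT). The per-step RTT counts below (one for DNS, one for the TCP handshake, two for a TLS handshake, one per request/response) are simplifying assumptions, and the function is hypothetical:

```python
# Rough model of how round trips inflate user-perceived latency.
# Assumes: 1 RTT for DNS resolution, 1 RTT for the TCP three-way handshake,
# 2 RTTs for a TLS handshake, and 1 RTT per request/response exchange.

def estimated_latency_ms(rtt_ms: float, requests: int, new_connections: int) -> float:
    dns = rtt_ms                            # name resolution (often cached)
    setup = new_connections * 3 * rtt_ms    # TCP (1 RTT) + TLS (2 RTTs) per connection
    transfers = requests * rtt_ms           # one round trip per request/response
    return dns + setup + transfers

# Fetching 10 objects over 10 separate connections vs. one reused connection:
print(estimated_latency_ms(100, requests=10, new_connections=10))  # 4100.0 ms
print(estimated_latency_ms(100, requests=10, new_connections=1))   # 1400.0 ms
```

The gap between the two results shows why reusing connections matters: the setup cost is paid once instead of per object.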
We’ll first understand four important characteristics of a network: throughput, latency, jitter, and the latency-bandwidth product. After that, we’ll classify applications primarily based on their delay requirements so we can build APIs that meet the needs of those applications, and so we know how much room we have as designers.
Throughput
Throughput is the process-to-process logical data rate: the amount of data that can be transmitted from the sender to the receiver in a given unit of time, typically across many network hops. If a link or network has an effective throughput of 1 Mbps, we can’t send more than 1 Mb of data in a second; the remaining data has to wait for the following seconds.
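A quick sketch of that relationship, using the 1 Mbps figure from the definition above (names and units are illustrative):

```python
# Time needed to move a payload over a link with a given effective throughput.

def transfer_time_s(payload_bits: float, throughput_bps: float) -> float:
    return payload_bits / throughput_bps

# 3 Mb of data over a 1 Mbps link takes 3 seconds: 1 Mb goes out each second,
# and the remaining data is carried over into the following seconds.
print(transfer_time_s(3_000_000, 1_000_000))  # 3.0
```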
Let’s assume that some IoT device can only receive and process data at 1 Mbps, while the network can support a rate of up to 2 Mbps. In that case, the device cannot fully utilize what the network provides (2 Mbps), and the effective end-to-end throughput is only 1 Mbps.
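End-to-end throughput is bounded by the slowest stage along the path, whether that stage is a network hop or the receiving process itself. A minimal sketch of that bottleneck rule, with rates mirroring the IoT example above:

```python
# Effective process-to-process throughput is the minimum rate along the path.

def effective_throughput_mbps(*stage_rates_mbps: float) -> float:
    return min(stage_rates_mbps)

# The network supports 2 Mbps, but the IoT device can only process 1 Mbps:
print(effective_throughput_mbps(2.0, 1.0))  # 1.0 -> the device is the bottleneck
```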
Latency
Latency is how long (in terms of time) a user-level message takes to travel from the sender to the receiver. We’re often more concerned about the user-perceived delay, which is a function of both the network latency and the processing time at the endpoints.
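One simple way to observe user-perceived latency is to time a complete request/response at the application layer. Here is a minimal sketch using only the Python standard library; the URL is a placeholder, and the measurement deliberately lumps together DNS, connection setup, network transit, and server processing:

```python
import time
import urllib.request

# Time one complete request/response as the user would perceive it.
URL = "https://example.com/"   # placeholder endpoint

start = time.perf_counter()
with urllib.request.urlopen(URL, timeout=5) as response:
    response.read()
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"User-perceived latency: {elapsed_ms:.1f} ms")
```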
Traditionally, throughput and latency trade off against each other: for example, driving up throughput or utilization tends to increase latency as well. A detailed discussion of this topic is a subject of queuing theory and is beyond the scope of this course.
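A classic result from queuing theory, the M/M/1 model, illustrates the tradeoff: as utilization approaches 100%, the average delay grows without bound. A minimal sketch, assuming Poisson arrivals and exponentially distributed service times (assumptions of the M/M/1 model, not of this course):

```python
# M/M/1 average time in system: T = 1 / (mu - lam), valid only for lam < mu.
# mu = service rate (requests/s), lam = arrival rate (requests/s).

def mm1_avg_delay_s(lam: float, mu: float) -> float:
    if lam >= mu:
        raise ValueError("Utilization >= 100%: the queue grows without bound")
    return 1.0 / (mu - lam)

mu = 100.0  # assume the server can handle 100 requests/s
for utilization in (0.5, 0.9, 0.99):
    lam = utilization * mu
    print(f"{utilization:.0%} utilized -> {mm1_avg_delay_s(lam, mu) * 1000:.0f} ms avg delay")
# 50% -> 20 ms, 90% -> 100 ms, 99% -> 1000 ms
```

Pushing utilization from 90% to 99% multiplies the average delay tenfold, which is why systems are rarely run near full capacity.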
Typically, network latency has the following constituent components:

- Processing delay: the time a node spends examining a packet and deciding where to forward it
- Queuing delay: the time a packet waits in a buffer before it can be transmitted
- Transmission delay: the time needed to push all of the packet’s bits onto the link
- Propagation delay: the time the signal takes to travel across the physical medium
Let’s discuss them one by one. ...
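Before walking through each component, here is a minimal sketch showing how they add up for a single packet. All link parameters are illustrative assumptions: a 1,500-byte packet over a 100 Mbps, 3,000 km fiber link, with small assumed processing and queuing delays:

```python
# Total one-way latency for a single packet, as a sum of its components.

PACKET_BITS = 1500 * 8      # a 1,500-byte packet
BANDWIDTH_BPS = 100e6       # 100 Mbps link (assumed)
DISTANCE_M = 3_000_000      # 3,000 km path (assumed)
PROPAGATION_SPEED = 2e8     # ~2e8 m/s signal speed in fiber
PROCESSING_S = 50e-6        # per-node processing delay (assumed)
QUEUING_S = 200e-6          # time waiting in router queues (assumed)

transmission = PACKET_BITS / BANDWIDTH_BPS      # push the bits onto the link
propagation = DISTANCE_M / PROPAGATION_SPEED    # signal travel time
total = transmission + propagation + PROCESSING_S + QUEUING_S
print(f"Total latency: {total * 1000:.2f} ms")  # ~15.37 ms, dominated by propagation
```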