Tenant-specific Optimization in Tectonic

Learn how to perform tenant-specific optimizations in Tectonic.

In the previous lesson, we discussed how multitenancy lets tenants share IOPS and storage capacity fairly and efficiently. Before that, tenants used different strategies to store data reliably. Some used full data replication for fast writes and reads, while others used Reed-Solomon encoding to reduce storage needs at the cost of added latency (data must be encoded while writing and decoded while reading).
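
To make the storage side of this trade-off concrete, the following minimal sketch compares the raw bytes stored per byte of user data. It assumes 3-way replication as the replication example (the copy count is an assumption, not stated in this lesson) and an RS(n, k) code that expands k data chunks into n total chunks. The function names are hypothetical.

```python
from fractions import Fraction


def replication_overhead(copies):
    """Raw bytes stored per user byte when every block is fully copied."""
    return Fraction(copies, 1)


def rs_overhead(n, k):
    """Raw bytes stored per user byte for RS(n, k): k data chunks become n chunks."""
    if not 0 < k <= n:
        raise ValueError("RS(n, k) requires 0 < k <= n")
    return Fraction(n, k)


if __name__ == "__main__":
    # Assumed example parameters: 3 copies versus two RS configurations.
    print("3-way replication:", float(replication_overhead(3)))
    print("RS(15, 9):", round(float(rs_overhead(15, 9)), 2))
    print("RS(6, 3):", float(rs_overhead(6, 3)))
```

Under these assumptions, RS(15, 9) stores 15/9 (about 1.67) bytes per user byte, compared with 3 bytes for three full copies, which is why RS encoding suits tenants that value storage efficiency over latency.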

We allowed multiple tenants with various workload characteristics and performance requirements to work on the same shared storage. We’ll enable tenants to request their required storage mechanism via the Client Library that we discussed earlier in our design.

Overview

Now, we consider two tenants, the data warehouse and the blob store, as examples to explain tenant-specific optimizations for storage consumption and latency. The following are the two ways to perform tenant-specific optimization for low latency and storage efficiency.

  1. Optimizing writes on data warehouse: We need to optimize how large-scale data is stored using full-block operations. Because the data is large, partial-block operations would increase latency and decrease storage efficiency, so we avoid them.

  2. Optimizing blob storage: We also need to store small-scale data (blobs) efficiently, which is where blob storage comes in. Because the data is small, we'll perform partial-block operations on both hot and warm blobs.

The following illustration shows the summary of the optimizations for both tenants.

Optimizing writes on data warehouse

Writing data once and reading it many times later is the dominant pattern in data warehouse workloads. In such workloads, a file becomes accessible to readers only after its creator closes it, and it is immutable from then on. Since the data can be read only once the file creator is done writing, we prioritize a low overall file write time over low latency for individual append requests.

Since we write large amounts of data, these write operations work on full blocks. We optimize them in the following two ways:

  1. Reed-Solomon (RS)-encoded asynchronous writes: This uses the write-once-read-many pattern for optimal network, storage, and IO performance.

  2. Hedged quorum writes: This sends reservation requests to storage nodes ahead of the data writes to decrease latency.
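
The idea behind hedged reservation requests can be sketched as follows: ask more storage nodes than needed for a reservation, then use the first ones that answer, so a slow node does not delay the write. This is a minimal simulation, not Tectonic's actual protocol. The node names, the latencies, and the functions `reserve`, `hedged_reserve`, and `pick_nodes` are all hypothetical.

```python
import asyncio


async def reserve(node, delay_s):
    """Pretend to send a reservation request; delay_s models the node's latency."""
    await asyncio.sleep(delay_s)
    return node


async def hedged_reserve(latencies, needed):
    """Ask every candidate node and keep the first `needed` nodes that respond."""
    tasks = [asyncio.create_task(reserve(node, delay))
             for node, delay in latencies.items()]
    chosen = []
    for finished in asyncio.as_completed(tasks):
        chosen.append(await finished)
        if len(chosen) == needed:
            break
    # Stragglers are no longer needed, so cancel their requests.
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return chosen


def pick_nodes(latencies, needed):
    """Synchronous wrapper around the hedged reservation round."""
    return asyncio.run(hedged_reserve(latencies, needed))


if __name__ == "__main__":
    # Simulated latencies in seconds; node "d" is a straggler.
    simulated = {"a": 0.05, "b": 0.01, "c": 0.03, "d": 0.2}
    print(pick_nodes(simulated, needed=3))
```

In this simulation the straggler is never selected, and the round finishes as soon as the third-fastest node responds rather than waiting for the slowest one.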

RS-encoded asynchronous writes

Tectonic uses the write-once-read-many pattern to decrease the overall file write time while improving IO and network performance. Because files are read only after they are complete, with no partial file reads, applications can buffer writes up to the block size. Applications then RS-encode each block and write the resulting chunks to the storage nodes, as shown below.
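
The buffering step can be sketched as follows. This is a minimal illustration, assuming a hypothetical `BlockBuffer` class and `split_into_data_chunks` helper. It only shows how appends accumulate into full blocks and how a full block is divided into k data chunks. Computing the RS parity chunks is deliberately left out.

```python
class BlockBuffer:
    """Accumulates appended bytes and releases only full blocks."""

    def __init__(self, block_size):
        self.block_size = block_size
        self._pending = bytearray()

    def append(self, data):
        """Buffer data and return the full blocks that are now ready to encode."""
        self._pending.extend(data)
        full_blocks = []
        while len(self._pending) >= self.block_size:
            full_blocks.append(bytes(self._pending[:self.block_size]))
            del self._pending[:self.block_size]
        return full_blocks


def split_into_data_chunks(block, k):
    """Split one full block into k equal data chunks (parity is not computed here)."""
    if len(block) % k != 0:
        raise ValueError("block size must be a multiple of k")
    size = len(block) // k
    return [block[i * size:(i + 1) * size] for i in range(k)]


if __name__ == "__main__":
    buf = BlockBuffer(block_size=9)   # tiny block size, for illustration only
    print(buf.append(b"abcd"))        # not enough data yet, so nothing is released
    for block in buf.append(b"efghijk"):
        print(split_into_data_chunks(block, k=3))
```

Because nothing is released until a full block is available, the encoder never has to handle a partial block, which matches the full-block write path described above.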

Longer-lived data is normally encoded using $RS(15,9)$, while shorter-lived data is usually encoded with $RS(6,3)$. In $RS(n,k)$, $n$ is the total number of symbols after encoding, whereas $k$ is the number of original data symbols, and the decoder can correct errors in up to $t$ symbols, where $t = \frac{n - k}{2}$. In $RS(15,9)$, where ...