Summary

This lesson presents a summary of this chapter.

Flash-based SSDs are becoming a common presence in laptops, desktops, and servers inside the datacenters that power the world’s economy. Thus, you should probably know something about them, right?

Here’s the bad news: this chapter (like many in this course) is just the first step in understanding the state of the art. Some places to find more information about the raw technology include:

  • research on actual device performance (such as that by Chen et al. [“Understanding Intrinsic Characteristics and System Implications of Flash Memory based Solid State Drives” by Feng Chen, David A. Koufaty, and Xiaodong Zhang. SIGMETRICS/Performance ’09, Seattle, Washington, June 2009. An excellent overview of SSD performance problems circa 2009 (though now a little dated).] and Grupp et al. [“Characterizing Flash Memory: Anomalies, Observations, and Applications” by L. M. Grupp, A. M. Caulfield, J. Coburn, S. Swanson, E. Yaakobi, P. H. Siegel, J. K. Wolf. IEEE MICRO ’09, New York, New York, December 2009. Another excellent characterization of flash performance.]),

  • issues in FTL design (including works by Agrawal et al. [“Design Tradeoffs for SSD Performance” by N. Agrawal, V. Prabhakaran, T. Wobber, J. D. Davis, M. Manasse, R. Panigrahy. USENIX ’08, San Diego, California, June 2008. An excellent overview of what goes into SSD design.], Gupta et al. [“DFTL: a Flash Translation Layer Employing Demand-Based Selective Caching of Page-Level Address Mappings” by Aayush Gupta, Youngjae Kim, Bhuvan Urgaonkar. ASPLOS ’09, Washington, D.C., March 2009. This paper gives an excellent overview of different strategies for cleaning within hybrid SSDs as well as a new scheme which saves mapping table space and improves performance under many workloads.], Huang et al. [“An Aggressive Worn-out Flash Block Management Scheme To Alleviate SSD Performance Degradation” by Ping Huang, Guanying Wu, Xubin He, Weijun Xiao. EuroSys ’14, 2014. Recent work showing how to really get the most out of worn-out flash blocks; neat!], Kim et al. [“A Space-Efficient Flash Translation Layer For Compact Flash Systems” by Jesung Kim, Jong Min Kim, Sam H. Noh, Sang Lyul Min, Yookun Cho. IEEE Transactions on Consumer Electronics, Volume 48, Number 2, May 2002. One of the earliest proposals to suggest hybrid mappings.], Lee et al. [“A Log Buffer-Based Flash Translation Layer by Using Fully-Associative Sector Translation” by Sang-won Lee, Tae-Sun Chung, Dong-Ho Lee, Sangwon Park, Ha-Joo Song. ACM Transactions on Embedded Computing Systems, Volume 6, Number 3, July 2007. A terrific paper about how to build hybrid log/block mappings.], and Zhang et al. [“De-indirection for Flash-based SSDs with Nameless Writes” by Yiying Zhang, Leo Prasath Arulraj, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. FAST ’13, San Jose, California, February 2013. Our research on a new idea to reduce mapping table space; the key is to re-use the pointers in the file system above to store locations of blocks, instead of adding another level of indirection.]), and even distributed systems comprised of flash (including Gordon [“Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive Applications” by Adrian M. Caulfield, Laura M. Grupp, Steven Swanson. ASPLOS ’09, Washington, D.C., March 2009. Early research on assembling flash into larger-scale clusters; definitely worth a read.] and CORFU [“CORFU: A Shared Log Design for Flash Clusters” by M. Balakrishnan, D. Malkhi, V. Prabhakaran, T. Wobber, M. Wei, J. D. Davis. NSDI ’12, San Jose, California, April 2012. A new way to think about designing a high-performance replicated log for clusters using Flash.]),

  • and, if we may say so, a really good overview of all the things you need to do to extract high performance from an SSD can be found in a paper on the “unwritten contract” [“The Unwritten Contract of Solid State Drives” by Jun He, Sudarsun Kannan, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. EuroSys ’17, Belgrade, Serbia, April 2017. Our own paper which lays out five rules clients should follow in order to get the best performance out of modern SSDs. The rules are request scale, locality, aligned sequentiality, grouping by death time, and uniform lifetime. Read the paper for details!].

Don’t just read academic papers; also read about recent advances in the popular press [“Understanding TLC Flash” by Kristian Vatto. AnandTech, September 2012. Available: http://www.anandtech.com/show/5067/understanding-tlc-nand. A short description of TLC flash and its characteristics.]. Therein you’ll learn more practical (but still useful) information, such as Samsung’s use of both TLC and SLC cells within the same SSD to maximize performance (SLC can buffer writes quickly) as well as capacity (TLC can store more bits per cell). And this is, as they say, just the tip of the iceberg. Dive in and learn more about this “iceberg” of research on your own, perhaps starting with Ma et al.’s excellent (and recent) survey [“A Survey of Address Translation Technologies for Flash Memories” by Dongzhe Ma, Jianhua Feng, Guoliang Li. ACM Computing Surveys, Volume 46, Number 3, January 2014. Probably the best recent survey of flash and related technologies.]. Be careful though; icebergs can sink even the mightiest of ships [“List of Ships Sunk by Icebergs” by Many authors. Available at this location on the “web”: http://en.wikipedia.org/wiki/List_of_ships_sunk_by_icebergs. Yes, there is a Wikipedia page about ships sunk by icebergs. It is a really boring page and basically everyone knows the only ship the iceberg-sinking-mafia cares about is the Titanic.].

ASIDE: KEY SSD TERMS

  • A flash chip consists of many banks, each of which is organized into erase blocks (sometimes just called blocks). Each block is further subdivided into some number of pages.
  • Blocks are large (128KB–2MB) and contain many pages, which are relatively small (1KB–8KB).
  • To read from flash, issue a read command with an address and length; this allows a client to read one or more pages.
  • Writing flash is more complex. First, the client must erase the entire block (which deletes all information within the block). Then, the client can program each page exactly once, thus completing the write.
  • A new trim operation is useful to tell the device when a particular block (or range of blocks) is no longer needed.
  • Flash reliability is mostly determined by wear out; if a block is erased and programmed too often, it will become unusable.
  • A flash-based solid-state storage device (SSD) behaves as if it were a normal block-based read/write disk; by using a flash translation layer (FTL), it transforms reads and writes from a client into reads, erases, and programs to underlying flash chips.
  • Most FTLs are log-structured, which reduces the cost of writing by minimizing erase/program cycles. An in-memory mapping table tracks where each logical write is located within the physical medium.
  • One key problem with log-structured FTLs is the cost of garbage collection, which leads to write amplification.
  • Another problem is the size of the mapping table, which can become quite large. Using a hybrid mapping or just caching hot pieces of the FTL are possible remedies.
  • One last problem is wear leveling; the FTL must occasionally migrate data from blocks that are mostly read in order to ensure said blocks also receive their share of the erase/program load.
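
To make these terms concrete, here is a minimal Python sketch of a toy log-structured FTL sitting on top of a toy flash chip. Everything in it is an assumption made for illustration, not a description of any real device: the names (ToyFlash, ToyLogFTL), the tiny sizes, the page-level mapping, and the greedy "fewest live pages" cleaning policy are all hypothetical. It shows erase-before-program, the in-memory mapping, trim, garbage collection, and how write amplification can be measured; it deliberately does no wear leveling.

```python
# Toy model only: names, sizes, and the greedy cleaning policy are assumptions.
PAGES_PER_BLOCK = 4
NUM_BLOCKS = 4
LOGICAL_PAGES = 8  # smaller than physical capacity, leaving spare room for cleaning


class ToyFlash:
    """Raw flash: erase whole blocks; program each page once per erase."""

    def __init__(self):
        n = NUM_BLOCKS * PAGES_PER_BLOCK
        self.data = [None] * n
        self.programmed = [False] * n  # chip starts fully erased
        self.erase_counts = [0] * NUM_BLOCKS
        self.programs = 0

    def read(self, ppn):
        return self.data[ppn]

    def erase(self, block):
        start = block * PAGES_PER_BLOCK
        for ppn in range(start, start + PAGES_PER_BLOCK):
            self.data[ppn] = None
            self.programmed[ppn] = False
        self.erase_counts[block] += 1

    def program(self, ppn, value):
        if self.programmed[ppn]:
            raise RuntimeError("page must be erased before it is programmed again")
        self.data[ppn] = value
        self.programmed[ppn] = True
        self.programs += 1


class ToyLogFTL:
    """Page-mapped, log-structured FTL with greedy garbage collection."""

    def __init__(self, flash):
        self.flash = flash
        self.map = {}    # logical page -> physical page
        self.owner = {}  # physical page -> logical page (live pages only)
        self.free_blocks = list(range(1, NUM_BLOCKS))
        self.active = 0  # block currently receiving appended writes
        self.next_page = 0
        self.client_writes = 0

    def _append(self, lpn, value):
        if self.next_page == PAGES_PER_BLOCK:  # active block is full
            self.active = self.free_blocks.pop(0)
            self.next_page = 0
        ppn = self.active * PAGES_PER_BLOCK + self.next_page
        self.next_page += 1
        self.flash.program(ppn, value)
        old = self.map.get(lpn)
        if old is not None:
            del self.owner[old]  # the previous copy is now garbage
        self.map[lpn] = ppn
        self.owner[ppn] = lpn

    def _live_pages(self, block):
        start = block * PAGES_PER_BLOCK
        return [p for p in range(start, start + PAGES_PER_BLOCK) if p in self.owner]

    def _garbage_collect(self):
        used = [b for b in range(NUM_BLOCKS) if b not in self.free_blocks]
        victim = min(used, key=lambda b: len(self._live_pages(b)))
        for ppn in self._live_pages(victim):  # copy live data forward in the log
            self._append(self.owner[ppn], self.flash.read(ppn))
        self.flash.erase(victim)
        self.free_blocks.append(victim)

    def write(self, lpn, value):
        if not 0 <= lpn < LOGICAL_PAGES:
            raise ValueError("logical page out of range")
        # Clean when the log is full and only one erased block remains.
        if self.next_page == PAGES_PER_BLOCK and len(self.free_blocks) == 1:
            self._garbage_collect()
        self._append(lpn, value)
        self.client_writes += 1

    def read(self, lpn):
        ppn = self.map.get(lpn)
        return None if ppn is None else self.flash.read(ppn)

    def trim(self, lpn):
        ppn = self.map.pop(lpn, None)
        if ppn is not None:
            del self.owner[ppn]  # the cleaner no longer needs to copy this page


flash = ToyFlash()
ftl = ToyLogFTL(flash)
for lpn in range(LOGICAL_PAGES):  # fill the logical address space once
    ftl.write(lpn, "init-%d" % lpn)
for i in range(20):               # then repeatedly overwrite one hot page
    ftl.write(0, "hot-%d" % i)
print("write amplification:", flash.programs / ftl.client_writes)
print("erases per block:", flash.erase_counts)
```

In this particular trace, the cold data written first is never moved, so every erase lands on the blocks that absorb the hot overwrites. That imbalance is what wear leveling is meant to correct; a fuller sketch would occasionally migrate the cold blocks too, at the cost of extra programs (and thus more write amplification).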
