Crashproofing the Original NoSQL Key-Value Store

Article URL: https://queue.acm.org/detail.cfm?id=3487353
September 19, 2021

Volume 19, issue 4

Drill Bits (#5)

An upgrade for the gdbm database

Terence Kelly

This episode of Drill Bits unveils a new crash-tolerance mechanism that vaults the venerable gdbm database into the league of transactional NoSQL data stores. We'll motivate this upgrade by tracing gdbm's history. We'll survey the subtle science of crashproofing, navigating a minefield of traps for the unwary. We'll arrive at a compact and rugged design that leverages modern file-system features, and we'll tour the production-ready implementation of this design and its ergonomic interface.

Enduring Patterns for Durable Data

Ken Thompson wrote the original dbm amid a storied golden age. Sparked by Unix, software creativity flourished at Bell Labs in the 1970s, producing an ecosystem [10] that remains vibrant decades later. The parent company's ambient technology may have inspired staff to re-imagine data organization, with profound consequences:

IBM OS builders worked in a place where data lived on unit record equipment (80‑column cards, 132‑column printers, etc.) and built a file system that looked like that. Ken and Dennis worked in a phone company where data traveled in streams over wires, and built a system in which files looked like that. [3]

[Figure: XKCD 2324]

Unstructured byte streams from files and pipes forced applications to impose structure. Many opted for the easygoing key-value paradigm, often a good choice if a relational database would be overkill and fixed layout would be too rigid. Enduring key-value patterns from the early years include self-describing text formats [2] and the built-in associative arrays of AWK and its descendants.

Thompson's dbm library provided a persistent key-value abstraction. [16] Client programs manipulated pairs of key and value strings via a store()/fetch() interface; under the hood dbm managed an extensible hash table embedded in a file.
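
To get a feel for the store()/fetch() pattern, here is a minimal sketch using Python's standard `dbm.dumb` module. This is not Thompson's dbm or GNU gdbm: `dbm.dumb` is a simple pure-Python fallback with its own on-disk format and no crash tolerance. The `store` and `fetch` helpers, their signatures, and the `example` database name are hypothetical, chosen only to echo the interface described above.

```python
import dbm.dumb
import os
import tempfile


def store(path, key, value):
    """Persist one key-value pair (hypothetical helper, not the dbm C API)."""
    # Flag "c" opens the database for read/write, creating it if absent.
    with dbm.dumb.open(path, "c") as db:
        db[key] = value  # str keys and values are encoded to UTF-8 bytes


def fetch(path, key):
    """Return the stored value as bytes, or None if the key is absent."""
    with dbm.dumb.open(path, "c") as db:
        return db.get(key)


# Keep the example's files in a scratch directory.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "example")

store(db_path, "editor", "ed")

# Each call reopens the files, so the value is read back from disk,
# not from an in-memory table.
result = fetch(db_path, "editor")
missing = fetch(db_path, "no-such-key")
```

Each helper opens and closes the database so that the pair demonstrably survives on disk between calls; a long-running client would more likely hold a single handle open.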

Philip Nelson implemented GNU gdbm in…