Key-Value Stores

Guriy Samarin
4 min read · Dec 19, 2024


Why: I’ve got an idea to make a series about modern low-latency key-value data stores, focusing on Valkey. To set the stage properly, I’ll begin with fundamental definitions and historical context. So, let’s dive into the basics and core concepts first.

Roots

The key-value pair concept is present in many programming languages. Data types to store key-value pairs are called associative arrays, dictionaries, or hash tables. Using this paradigm to organise data storage leads us to key-value stores (KVS).

Definition: Key-value stores (also known as key-value databases) are a type of NoSQL database that use a key-value method to store and retrieve data. Each record consists of a key and a value. The key is a unique identifier used to fetch the associated value, which can be a simple data type or a more complex object.

Another perspective on KVS: fast storage without strict type constraints. In terms of the CAP theorem, KVS are expected to favour availability and partition tolerance over consistency. Modern NoSQL databases of this type have various tricks to provide additional guarantees, but conceptually, that’s the gist.

Now, I’m supposed to declare that KVS is scalable, highly available, and whatnot. But that’s not necessarily true: it could be, depending on the specific store you’re looking at. It’s often mentioned that KVS is simple, but that’s also not entirely accurate. While they have a simple contract, popular solutions are far from simple. You might argue that any developer could build a simple KVS in a couple of hours by wrapping a dictionary in a service, and you’d be right. The problem is that the moment this service tries to keep some of the NoSQL promises, months of additional active development are required.
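To make the "dictionary in a service" point concrete, here is a minimal sketch of that simple contract. The class name `TinyKVS` and its method names are my own illustration, not the API of any real store, and everything lives in a single process: no network, persistence, replication, or eviction, which is exactly the part that takes months.

```python
from typing import Any, Optional


class TinyKVS:
    """Hypothetical in-process key-value store: a dictionary behind a small contract."""

    def __init__(self) -> None:
        self._data: dict = {}

    def put(self, key: str, value: Any) -> None:
        # Last write wins; there is no versioning or conflict detection.
        self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        # Returns the default when the key is absent.
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        # Reports whether the key existed before the call.
        if key in self._data:
            del self._data[key]
            return True
        return False
```

Three methods cover the whole contract. Durability, replication, and behaviour under network partitions are where the real effort goes, and none of them appear here.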

dbm (database manager), the Unix system library written in 1979, could be considered the first KVS. Today the list is extensive, but I’ll mention a few: Amazon DynamoDB, Valkey, Memcached, Redis, and ScyllaDB.

Why Use KVS

Relational databases (SQL) enforce a rigid structure on data, which comes at a price. Aggregating information stored in different tables can be relatively slow. A common response to this is creating views with pre-aggregated data or adding duplicate information to existing tables to avoid joins. You might also split tables or distribute your storage to achieve faster access times. If you find yourself denormalising data or sharding tables, it might be time to relax some of the guarantees an RDBMS offers and find a better fit for your task. As I mentioned, KVS doesn’t promise you anything by definition, but with the right NoSQL database you could get better results without rearchitecting the whole app.

Scalability

KVS can be very scalable because there are fewer constraints to maintain. For example, there is no schema to update across the board, and data can be inserted into one shard while another shard handles different data at the same time (there are no indexes or uniqueness constraints to coordinate across shards). KVS vary in their nature, and not every one can be a drop-in replacement for another: just as you wouldn’t easily switch from MS SQL to PostgreSQL, changing from DynamoDB or MongoDB to Valkey is not straightforward, as they solve different tasks.
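The sharding point can be sketched in a few lines. The `ShardedKVS` class below is a hypothetical illustration: it routes each key to one of several in-memory dictionaries using a stable hash, so a write to one shard never has to consult another. I am assuming plain modulo routing for brevity. Real stores generally use more elaborate schemes, because modulo routing moves most keys whenever the shard count changes.

```python
import hashlib
from typing import Any, Optional


class ShardedKVS:
    """Hypothetical store that spreads keys over independent in-memory shards."""

    def __init__(self, shard_count: int = 4) -> None:
        self._shards = [dict() for _ in range(shard_count)]

    def shard_index(self, key: str) -> int:
        # Use a stable hash: Python's built-in hash() for strings is
        # randomised per process, so it cannot be used for routing.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def put(self, key: str, value: Any) -> None:
        # Only the owning shard is touched; other shards are not involved.
        self._shards[self.shard_index(key)][key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        return self._shards[self.shard_index(key)].get(key, default)
```

Because the key alone decides where a record lives, each shard could sit on its own machine and serve its own traffic independently.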

Ease of Data Access or Lack of Schema Management

Key-value store design does not enforce a schema on developers. You can store any object (well, not any, but the requirements are very relaxed) and retrieve any object from KVS without worrying about storage structure and future usage considerations. Despite the wide adoption of JSON columns by major SQL players, you still need to maintain a schema for tables and have proper indices in place to keep JSON query performance up to standard. The lack of a tight schema means the application is responsible for properly interpreting the data it consumes, often referred to as schema on read.
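Here is a small sketch of what schema on read looks like in practice. The store is just a dictionary of bytes standing in for a real KVS, and `UserProfile` and `read_profile` are names I made up for the example. The store accepts whatever it is given, and the reader decides how to interpret it, including how to handle records written before a field existed.

```python
import json
from dataclasses import dataclass


@dataclass
class UserProfile:
    name: str
    email: str
    locale: str = "en"


def read_profile(raw: bytes) -> UserProfile:
    """Interpret a stored blob; the reader, not the store, applies the schema."""
    doc = json.loads(raw)
    return UserProfile(
        name=doc["name"],                # required: raises KeyError if missing
        email=doc.get("email", ""),      # optional: older records may lack it
        locale=doc.get("locale", "en"),  # optional with a default
    )


# A plain dict stands in for the key-value store in this sketch.
store = {}
store["user:1"] = json.dumps({"name": "Ada"}).encode("utf-8")
store["user:2"] = json.dumps(
    {"name": "Linus", "email": "linus@example.com", "locale": "fi"}
).encode("utf-8")
```

Both records are valid as far as the store is concerned. Any disagreement about their shape only surfaces when `read_profile` runs.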

Performance

Key-value databases handle a constant stream of reads and writes with low-overhead server calls, which keeps latency and response times low at scale. Direct key access is the operation these stores are built for, and that is where the performance holds. However, if you attempt something like a join, expect much worse performance than in a relational database.
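The contrast between direct access and a join-like query can be shown with two dictionaries standing in for the store. The data and function names below are invented for illustration. `get_order` is a single lookup, while `orders_with_user_names` has to walk every order and do one extra lookup per order, which against a networked store would mean a scan plus a round trip for each record.

```python
# Two plain dicts stand in for key-value collections in this sketch.
users = {
    "user:1": {"name": "Ada"},
    "user:2": {"name": "Linus"},
}
orders = {
    "order:1": {"user_id": "user:1", "total": 30},
    "order:2": {"user_id": "user:2", "total": 15},
    "order:3": {"user_id": "user:1", "total": 5},
}


def get_order(order_id):
    """Direct key access: one lookup, independent of how many orders exist."""
    return orders.get(order_id)


def orders_with_user_names():
    """Join-like query done by the application: touches every order,
    then performs one more lookup per order to resolve the user."""
    result = []
    for order_id, order in orders.items():
        user = users[order["user_id"]]
        result.append(
            {"order_id": order_id, "user_name": user["name"], "total": order["total"]}
        )
    return result
```

The cost of the second function grows with the size of the data set, which is why key-value designs usually store data already shaped for the lookups the application needs.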

Some Use Cases for KVS

Key-value databases can serve as the primary database for your application or handle niche requirements. Here are some example use cases:

  1. Caching: Store data temporarily for quicker access, such as frequently visited social media content or configuration settings. In-memory data caching systems leverage key-value stores to enhance application response times.
  2. Session Management: Session attributes, such as user profiles and messages, are accessed only via a lookup key, making a fast key-value store perfect for this purpose.
  3. Messaging: Typically, access to a specific message or chat is done via its ID. For search scenarios, fuzzy search is often used, where SQL doesn’t necessarily offer a clear advantage.
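The caching and session cases both boil down to "look up by key, expire after a while". Below is a minimal sketch of that pattern, assuming an in-process cache with a fixed time-to-live. `TTLCache` and `get_or_load` are hypothetical names, and the `clock` parameter is only there so that expiry can be controlled in a test.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Hypothetical in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily, on access.
            del self._entries[key]
            return None
        return value


def get_or_load(cache: TTLCache, key: str, loader: Callable[[str], Any]) -> Any:
    """Cache-aside read: try the cache, fall back to the slow source on a miss.
    A value of None is treated as a miss in this sketch."""
    value = cache.get(key)
    if value is None:
        value = loader(key)
        cache.set(key, value)
    return value
```

The `loader` would be whatever slow path the cache protects, such as a database query or a remote call. The second read of the same key within the TTL never reaches it.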

Key Takeaways

Key-value stores offer broad functionality, but the only guarantee is that you can store any data by key (a KVS is a dictionary-as-a-service, by definition). When making architectural decisions, base them on specific use cases and choose the appropriate solution. KVS vary significantly in their intent, and changing from one to another can be as challenging as switching from SQL to NoSQL.


Written by Guriy Samarin

Software developer at Amazon. Web (mostly backend) development now. My stack: .NET (ASP.NET Core MVC).
