Databases are commonly grouped into two categories: relational and NoSQL (non-relational).

Relational databases

Relational databases enforce a predefined schema before storing data. Data is organized into relations (tables) composed of tuples (rows) and attributes (columns). Each tuple has a unique key. Because data is structured, a tuple in one table can link to a tuple in another using a foreign key.

Structured Query Language (SQL) is used to manage the database, handling data insertion, deletion, and retrieval.

Relational databases are dominant due to their simplicity, robustness, flexibility, and scalabilityTraditional databases are vertically scalable. Vertical scaling has limits. One might reach a point when more compute, memory, storage, or networking capability could not be added to a single node.. To maintain data integrity, relational databases provide (ACID) properties. ACID is a powerful abstraction that automatically handles complex anomalies (such as dirty reads, lost updates, and phantom reads), simplifying application logic.

However, ACID guarantees can be overkill for some use cases. If an application can tolerate specific anomalies, a custom solution might offer higher performance, though it adds implementation complexity.

The ACID offers the following properties:

Atomicity: Transactions are atomic units. Either all statements in a transaction execute successfully, or none do. Failed transactions are aborted and rolled back.
Consistency: The database remains in a consistent state before and after every transaction. It enforces rules to ensure data validity.
Isolation: Concurrent transactions do not affect each other. The final state of the database is the same as if the transactions were executed sequentially.
Durability: Once a transaction is committed, it survives permanently, even in the event of a system failure.

Popular database management systems (DBMS) for relational schemas include:

MySQL
Oracle Database
Microsoft SQL Server
IBM DB2
Postgres
SQLite

Why relational databases?

Relational databases are the default choice for structured data storage. Their primary strengths are the abstraction of ACID transactions and the standardization of programming semantics.

Flexibility: SQL’s data definition language (DDL)DDL is a computer language used to create and modify the structure of database objects in a database. allows schema modifications, such as adding tables, renaming columns, or altering structures, even while the database server is running and processing queries.
Reduced redundancy: Normalization eliminates data redundancy. Information about a specific entity is stored in one table and linked to other tables via foreign keys. This prevents inconsistent dependencyInconsistent dependencies arise when a user searches for some data in the irrelevant table or the data is missing in the relevant table. For example, to find the customer’s address it should be searched in the customer table instead of the product table. and ensures data is updated in only one place.
Concurrency: Enterprise applications often have many users reading and writing data simultaneously. Relational databases manage concurrency through transactions, preventing inconsistencies such as double-booking a hotel room.
Integration: A shared relational database allows multiple applications to access and aggregate data from a single source. The database handles concurrency control, enabling seamless integration between different systems.
Backup and disaster recovery: Relational databases guarantee consistent data states, simplifying export, import, and backup operations. Most cloud-based options offer continuous mirroring for quick restoration.

Drawback

The relational databases have the following drawback:

Impedance mismatch

Impedance mismatch occurs when the relational model (tables) and in-memory data structures (objects) are not aligned. Application code often uses complex, nested structures, while databases use flat tables. This requires a translation layer to map objects to rows, adding complexity and overhead.

Why non-relational (NoSQL) databases?

NoSQL databases are designed to handle diverse data models. They excel in applications requiring large volumes of semi-structured or unstructured dataThe schema of the data is not known at the time of writing the data. Schema is usually defined at the time of reading such data., low latency, and flexible schemas. They often achieve this by relaxing strict consistency restrictions.

Key characteristics:

Simple design: NoSQL avoids impedance mismatch. For example, a document store can save all employee data in a single document rather than splitting it across multiple joined tables. This simplifies coding and debugging.
Horizontal scaling: NoSQL databases are built to run on clusters. They automatically partition (shard) data across multiple nodes, making it easy to scale out as user traffic increases. If a node fails, it can be replaced without disrupting the application.
Availability: Data replication ensures high availability and disaster recovery. Node maintenance can often be performed without downtime.
Flexible schema: Many NoSQL databases do not require a predefined schema. For instance, JSON documents in the same collection can have different fields, allowing the data structure to evolve dynamically.
Cost: Many NoSQL databases are open source. They are typically designed to run on clusters of inexpensive commodity servers rather than on expensive, proprietary hardware.

NoSQL databases are categorized by their data models: document stores, columnar databases, key-value stores, and graph databases.

Types of NoSQL databases

The following sections describe the primary types of NoSQL databases.

Key-value database

Key-value databases store data as key-value pairs, similar to a hash table. The key serves as a unique identifier, and the value can range from a simple scalar to a complex object. These databases are highly partitionable and allow for easy horizontal scaling. Popular examples include Amazon DynamoDB, Redis, and Memcached.

Use case: Session-oriented applications. Web applications often store user session data (profiles, preferences, shopping carts) in memory or a database. Using a unique session ID as the key allows for extremely fast retrieval of the user’s state.

The figure below illustrates a key-value store where a composite key (Product ID + Type) maps to various item attributes.

Document database

Document databases store data in formats like XML, JSON, or BSON. Documents are hierarchical tree structures that can contain maps, collections, and scalar values. Unlike relational tables, documents within the same database can have varying structures. Examples include MongoDB and Google Cloud Firestore.

Use case: Suitable for unstructured catalog data. In e-commerce, products often have unique sets of attributes (e.g., a laptop has different specs than a shirt). Storing these in a relational database is inefficient. A document database stores all product attributes in a single file, improving management and read performance. They are also ideal for content management systems (such as blogs and video platforms).

The example below shows a JSON document storing a person’s attributes, such as id, name, and email.

Use case: These are commonly used in social networks and recommendation engines. They are optimized for querying complex relationships, such as traversing “friend-of-friend” connections or detecting patterns in user activity. They are also used in fraud detection, knowledge graph construction, and recommendation systems.

Columnar database

Columnar databases organize data by column rather than by row. This design optimizes the retrieval of specific fields across large datasets, making it ideal for analytics. Popular examples include Amazon Redshift and Google BigQuery.

Use case: Analytics and data warehousing. Operations such as calculating the average transaction amount over a year are efficient because the database reads only the “amount” column, ignoring unrelated columns like customer names.

Note: Do not confuse columnar databases with wide-column stores (e.g., Apache Cassandra, HBase). Columnar databases optimize for analytics (OLAP). Wide-column stores group data into column families to optimize for high-throughput writes and flexible, semi-structured data.

The figure below contrasts column-oriented storage with traditional row-oriented storage:

Relational Database	Non-Relational Database
If the data to be stored is structured	If the data to be stored is unstructured
If ACID properties are required	If there’s a need to serialize and deserialize data
If the size of the data is relatively small and can fit on a node)	If the size of the data to be stored is large

Types of Databases

Relational databases

Why relational databases?

Drawback

Impedance mismatch

Why non-relational (NoSQL) databases?

Types of NoSQL databases

Key-value database

Document database

Graph database

Columnar database

Drawbacks of NoSQL databases

Choose the right database

Databases

Quiz