How to design an optimized SQL schema

The first step in making your Database (MySQL / SQL) query more performant is to optimize the schema.

Here are a few generic rules that come in handy:

Smaller & Simple Datatypes

Smaller & Simple datatype takes less space on memory, disk, and CPU cache with fewer CPU cycles to process. Choose the smallest datatype that you don’t think will be exceeded. For example:

  • VARCHAR(12) for firstName instead of VARCHAR(1000).

  • Use INT to store integers instead of strings

  • Use Date type instead of storing dates as strings.

Avoid NULL

Always define a column as NOT NULL if you can. It’s harder for MySQL to optimize queries for NULL columns as it needs more space and requires special processing. Replace it with 0, empty string, or any special value – specially indexed columns shouldn’t be nullable.

Real Numbers

Float & Double are floating-point calculations performed by the CPU that are somewhat faster than DECIMAL.

For DECIMAL you can specify maximum digits before and after the decimal point. Specifying Precision for floating-point types is not recommended. Only use DECIMAL when you need exact calculations, e.g., financial data.

Strings

Use CHAR to store very short strings or column values that are nearly the same length. CHAR is also better if data is changed frequently. VARCHAR is good when updates to the column are rare and the maximum column length is much larger than the average length.

Tip: If the max string is ‘hello’, use VARCHAR(5) instead of VARCHAR(250).

BINARY strings are faster, but they store Bytes instead of characters.

Avoid BLOB and TEXT field types.

ENUM

ENUM strings are stored compactly so that the table size is smaller. The downside is we have to change the allowable strings we need ALTER TABLE; however, ENUM is not a good idea for frequent changes as it is slower to JOIN ENUM with the VARCHAR column, but joining ENUM to ENUM is faster.

Datatype and Identifiers

Primary Keys & Foreign Keys

Choose one datatype and keep it exact in all tables.

Integers are the best choice because they have AUTO_INCREMENT and are faster. ENUM/SET are not good choices and Strings are also not preferable because they slow INSERT & SELECT queries.