Suboptimal Solutions
Let’s learn methods to avoid EAV that can be used where inheritance methods cannot be used.
We'll cover the following...
There are two more solutions to model the subtypes that we are going to discuss in this lesson. These are:
- Semistructured data
- Post-processing
Semistructured data
If we have many subtypes or if we have to support new attributes frequently, we can add a BLOBBLOB data type, which we can use to store any binary data. Martin Fowler calls this pattern the Serialized LOB.
CREATE TABLE Issues (issue_id SERIAL PRIMARY KEY,reported_by BIGINT UNSIGNED NOT NULL,product_id BIGINT UNSIGNED,priority VARCHAR(20),version_resolved VARCHAR(20),status VARCHAR(20),issue_type VARCHAR(10), -- BUG or FEATUREattributes TEXT NOT NULL, -- all dynamic attributes for the rowFOREIGN KEY (reported_by) REFERENCES Accounts(account_id),FOREIGN KEY (product_id) REFERENCES Products(product_id));
The advantage of this design is that it’s completely extensible. We can store new attributes in the blob at any time. Every row stores a potentially distinct set of attributes, so we have as many subtypes as we have rows.
The disadvantage is that SQL has little support for accessing specific attributes in such a structure. We can’t easily select individual attributes within the blob for row-based restriction, aggregate calculation, sorting, or other operations. We must fetch the whole blob of attributes as a single value and write application code to decode and interpret the attributes.
This design is best when we can’t limit ourselves to a finite set of subtypes and when we need complete flexibility to define new attributes at any time.
Post-processing
Unfortunately, sometimes we’re stuck with the EAV design — for example, if we inherited a project and can’t change it or if our company acquired a third-party software platform that uses EAV. If this is the case, it’s best to familiarize ourselves with the trouble areas in the previous lesson so we can anticipate and plan for the extra work it takes to work with this design.
Above all, we must not try to write queries that fetch entities as a single row as though data were stored in a conventional table. Instead, we should query the attributes associated with the entity and fetch them as a set of rows, like they are stored.