Synopsis: ID Required

Let’s learn about an antipattern that specifies the importance of using specific keys in a database.

Recently, a software developer asked me a question about preventing duplicate rows. It’s a question I get asked frequently and the first response that comes to mind is to ask the developer to check if she lacks a primary key. But in this case, this wasn’t the problem.

The developer in this case had stored articles for publishing on a website in his content management database. He had used an intersection table for a many-to-many association between a table of articles and a table of tags.

Press + to interact
CREATE TABLE ArticleTags (
id SERIAL PRIMARY KEY,
article_id BIGINT UNSIGNED NOT NULL,
tag_id BIGINT UNSIGNED NOT NULL,
FOREIGN KEY (article_id) REFERENCES Articles (id),
FOREIGN KEY (tag_id) REFERENCES Tags (id)
);

He was getting incorrect results from queries when counting the number of articles with a given tag. He knew that there was only one article with a particular tag, but the query was telling him that there were three.

Press + to interact
SELECT tag_id, COUNT(*) AS articles_per_tag FROM ArticleTags WHERE tag_id = 327;

When he queried all the rows matching that tag_id, he saw that the tag was associated with one particular article thrice; three rows showed the same association, although they had different values for id.

Press + to interact
SELECT * FROM ArticleTags;

After seeing the details, we came to know that ...