Modeling Example
Learn about schema and MVP through an example.
We'll cover the following...
Now that we have some data to play with, we can test some application queries for known user stories in the MVP, such as listing the most recent articles per category with the first three comments on each article.
Playing with schema
That’s when we realize our previous schema design misses publication timestamps for articles and comments. We need to add this information to our draft model. As it’s all a draft with random data, the easiest way around this—you already committed the data previously (by editing the script)—is to simply drop schema cascade
as shown here:
drop schema sandbox cascade;
The next version of our schema then looks like this:
begin;create schema if not exists sandbox;create table sandbox.category(id serial primary key,name text not null);insert into sandbox.category(name)values ('sport'),('news'),('box office'),('music');create table sandbox.article(id bigserial primary key,category integer references sandbox.category(id),pubdate timestamptz,title text not null,content text);create table sandbox.comment(id bigserial primary key,article integer references sandbox.article(id),pubdate timestamptz,content text);insert into sandbox.article(category, title, pubdate, content)select random(1, 4) as category,initcap(sandbox.lorem(5)) as title,random( now() - interval '3 months',now() + interval '1 months') as pubdate,sandbox.lorem(100) as contentfrom generate_series(1, 1000) as t(x);insert into sandbox.comment(article, pubdate, content)select random(1, 1000) as article,random( now() - interval '3 months',now() + interval '1 months') as pubdate,sandbox.lorem(150) as contentfrom generate_series(1, 50000) as t(x);select article.id, category.name, titlefrom sandbox.articlejoin sandbox.categoryon category.id = article.categorylimit 3;select count(*),avg(length(title))::int as avg_title_length,avg(length(content))::int as avg_content_lengthfrom sandbox.article;select article.id, article.title, count(*)from sandbox.articlejoin sandbox.commenton article.id = comment.articlegroup by article.idorder by count desclimit 5;select category.name,count(distinct article.id) as articles,count(*) as commentsfrom sandbox.categoryleft join sandbox.article on article.category = category.idleft join sandbox.comment on comment.article = article.idgroup by category.nameorder by category.name;commit;
Random timestamp entries
To be able to generate random timestamp entries, the script uses another function that’s not provided by default in PostgreSQL, and here’s its definition:
create or replace function random(a timestamptz,b timestamptz)returns timestamptzvolatilelanguage sqlas $$select a+ random(0, extract(epoch from (b-a))::int)* interval '1 sec';$$;
Now we can have a go at solving the first query of the product’s MVP, as specified before, on this schema draft version. That should provide a taste of the schema and how well it implements the business rules.
Enlist the data
The following query lists the most recent articles per category with the first three comments on each article:
\set comments 3\set articles 1select category.name as category,article.pubdate,title,jsonb_pretty(comments) as commentsfrom sandbox.category/** Classic implementation of a Top-N query* to fetch 3 most articles per category*/left join lateral(select id,title,article.pubdate,jsonb_agg(comment) as commentsfrom sandbox.article/** Classic implementation of a Top-N query* to fetch 3 most recent comments per article*/left join lateral(select comment.pubdate,substring(comment.content from 1 for 25) || '…'as contentfrom sandbox.commentwhere comment.article = article.idorder by comment.pubdate desclimit :comments)as commenton true -- required with a lateral joinwhere category = category.idgroup by article.idorder by article.pubdate desclimit :articles)as articleon true -- required with a lateral joinorder by category.name, article.pubdate desc;
The ...