...

/

Modeling Example

Modeling Example

Learn about schema and MVP through an example.

Now that we have some data to play with, we can test some application queries for known user stories in the MVP, such as listing the most recent articles per category with the first three comments on each article.

Playing with schema

That’s when we realize our previous schema design misses publication timestamps for articles and comments. We need to add this information to our draft model. As it’s all a draft with random data, the easiest way around this—you already committed the data previously (by editing the script)—is to simply drop schema cascade as shown here:

Press + to interact
drop schema sandbox cascade;

The next version of our schema then looks like this:

Press + to interact
begin;
create schema if not exists sandbox;
create table sandbox.category
(
id serial primary key,
name text not null
);
insert into sandbox.category(name)
values ('sport'),('news'),('box office'),('music');
create table sandbox.article
(
id bigserial primary key,
category integer references sandbox.category(id),
pubdate timestamptz,
title text not null,
content text
);
create table sandbox.comment
(
id bigserial primary key,
article integer references sandbox.article(id),
pubdate timestamptz,
content text
);
insert into sandbox.article(category, title, pubdate, content)
select random(1, 4) as category,
initcap(sandbox.lorem(5)) as title,
random( now() - interval '3 months',
now() + interval '1 months') as pubdate,
sandbox.lorem(100) as content
from generate_series(1, 1000) as t(x);
insert into sandbox.comment(article, pubdate, content)
select random(1, 1000) as article,
random( now() - interval '3 months',
now() + interval '1 months') as pubdate,
sandbox.lorem(150) as content
from generate_series(1, 50000) as t(x);
select article.id, category.name, title
from sandbox.article
join sandbox.category
on category.id = article.category
limit 3;
select count(*),
avg(length(title))::int as avg_title_length,
avg(length(content))::int as avg_content_length
from sandbox.article;
select article.id, article.title, count(*)
from sandbox.article
join sandbox.comment
on article.id = comment.article
group by article.id
order by count desc
limit 5;
select category.name,
count(distinct article.id) as articles,
count(*) as comments
from sandbox.category
left join sandbox.article on article.category = category.id
left join sandbox.comment on comment.article = article.id
group by category.name
order by category.name;
commit;

Random timestamp entries

To be able to generate random timestamp entries, the script uses another function that’s not provided by default in PostgreSQL, and here’s its definition:

Press + to interact
create or replace function random
(
a timestamptz,
b timestamptz
)
returns timestamptz
volatile
language sql
as $$
select a
+ random(0, extract(epoch from (b-a))::int)
* interval '1 sec';
$$;

Now we can have a go at solving the first query of the product’s MVP, as specified before, on this schema draft version. That should provide a taste of the schema and how well it implements the business rules.

Enlist the data

The following query lists the most recent articles per category with the first three comments on each article:

Press + to interact
\set comments 3
\set articles 1
select category.name as category,
article.pubdate,
title,
jsonb_pretty(comments) as comments
from sandbox.category
/*
* Classic implementation of a Top-N query
* to fetch 3 most articles per category
*/
left join lateral
(
select id,
title,
article.pubdate,
jsonb_agg(comment) as comments
from sandbox.article
/*
* Classic implementation of a Top-N query
* to fetch 3 most recent comments per article
*/
left join lateral
(
select comment.pubdate,
substring(comment.content from 1 for 25) || '…'
as content
from sandbox.comment
where comment.article = article.id
order by comment.pubdate desc
limit :comments
)
as comment
on true -- required with a lateral join
where category = category.id
group by article.id
order by article.pubdate desc
limit :articles
)
as article
on true -- required with a lateral join
order by category.name, article.pubdate desc;

The ...