...

/

Lossy Unique Count with hll

Lossy Unique Count with hll

Learn how to compute precise counts.

We'll cover the following...

We can rewrite the query using our hll data type now, even though, at this stage, it’s not going to be very useful because we still have the full logs of every visit, and we can afford to compute precise counts.

Computing precise counts

Nonetheless, our goal is to dispose of the daily entries that we anticipate will be just too large a dataset. So, the hll-based query looks like this:

Press + to interact
select messageid,
datetime::date as date,
# hll_add_agg(hll_hash_text(ipaddr::text)) as hll
from tweet.visitor
where messageid = 3
group by grouping sets((messageid),
(messageid, date))
order by messageid, date nulls first
limit 10;

In this query, we use several new functions and operators related to the hll data type:

  • The # operator takes a single argument: it’s a unary operator, like factorial (written ...