...

/

Incremental Model with the MERGE Strategy

Incremental Model with the MERGE Strategy

Learn how to create an incremental model with the MERGE strategy.

Defining an incremental model

Tables come with one big issue: they don’t stay updated. To refresh the table, we need to run the dbt model. Every time we run a dbt model, it recreates the whole table. When dealing with large amounts of data, it can become very slow and ineffective.

Fortunately, there’s a third type of materialization that incrementally merges new rows with the existing table through an incremental strategy. This materialization requires that we enable billing on our BigQuery project (it can be done through the free trial).

To use incremental models with BigQuery, you’ll need to enable billing on your project.

You’ll then get 300€ of credits to use before being charged.

Press + to interact
{{ config(
materialized="incremental"
)
}}
SELECT *
FROM {{ ref("orders") }}
LEFT JOIN {{ ref("addresses") }}
USING(address_id)
{% if is_incremental() %}
WHERE order_id > (SELECT MAX(order_id) FROM {{this}})
{% endif %}

We specify an if statement to filter the model and select only the rows that we want to add. Here, we specify that we want to add only the rows that have an order_id higher than what we already have in the database. Here’s what happens in the background when we run this model:

Press + to interact
merge into `project-id`.`dataset-id`.`shipping_tags` as DBT_INTERNAL_DEST
using (
SELECT *
FROM `project-id`.`dataset-id`.`orders`
LEFT JOIN `project-id`.`dataset-id`.`addresses`
USING(address_id)
WHERE order_id > (SELECT MAX(order_id) FROM `project-id`.`dataset-id`.`shipping_tags`)
) as DBT_INTERNAL_SOURCE
on FALSE
when not matched then insert
(`address_id`, `order_id`, `gift_message`, `address_first_line`, `address_second_line`, `zip_code`, `city`, `country`)
values
(`address_id`, `order_id`, `gift_message`, `address_first_line`, `address_second_line`, `zip_code`, `city`, `country`)

The MERGE statement in BigQuery

In BigQuery, a MERGE statement is used to update a table based on the content of another table. First, it starts by declaring a target and a ...