What is Dimensional Modeling?

svg viewer

Overview

A data warehouse can be thought of as a storage facility/repository for current and historical data that can be used to report purposes or analysis. For example, in the business world, a data warehouse might incorporate customer information from a company’s point-of-sale systems (the cash registers), its website, its mailing lists, and its comment cards.

Dimensional Modeling is a data structure that facilitates the storage of data in data warehouse. Dimensional Modeling is the brain-child of Ralph Kimball and is based on facts and dimension.

Facts

Facts are the actual numbers and data of a process that push the quantitative value of the transaction forward.

Dimensions

Dimensions are a descriptive context behind the facts of a process.

For example, in a retail store, a business process could have the following facts and dimensions:

Facts Dimensions
Price Store
Quantity Product
Tax Transaction number

Facts are stored in the fact table and dimensions are stored in the dimension table.

Dimensional modeling

Dimensional modeling provides a blueprint for businesses to design and create a data table, which consequently ensures fast and efficient data management.

There are five steps you need to take in order to create a dimensional model:

svg viewer

Step 1: Select the business process (Why?)

Before delving into more intricate stuff, it is always better to identify the overlay of the problem. Similarly, before developing a dimensional model, we need to identify the business process we want to base our model on. For example, in a retail store, a business process could be shipment and storage, staff duties, point-of-sale, etc. Most fact tables focus on the results of a single business process.

Step 2: Declare the grain (How much?)

Declaring the grain essentially decides on the depth of our table, i.e., how much detail one single row of the table will contain. Specifying grain before anything else is important as it provides us with a structure of how much detail each entry of our table will capture. For example, in a storage facility’s business process, you can include all the items in the facility with their item numbers, or you can just include the type of a group of items.

Step 3: Identify the dimensions (When?, Where?, Who?)

As mentioned above, the dimensions include descriptive information around a business process to enable the filtering of the data. You have to provide the dimensions to answer the following three questions: When, Where, and Who. For example, for a retail store’s storage facility, we can answer the questions in the following manner:

  • When - When was the item shipped?
  • Where - Where is the item kept in the storage facility?
  • Who - Who is in charge of docking the items?

Step 4: Identify (What?)

As mentioned above, the facts are the quantitative data associated with a business process. A single fact table row has a one-to-one relationship to a measurement event (as described by the fact table’s grain). Thus, a fact table corresponds to an observable event. For example, for a storage facility, we can have the following facts:

  • Size of storage occupied
  • Size of storage left
  • The number of items to order

Step 5: Build the schema

After jotting down all the required design objectives until step 4, it is time to shape these objectives in a database structure. For that, there are two popular schemas:

  1. Star Schema: It goes by a star schema because the diagram resembles a star with points radiating from the center. The center of the star consists of the fact table, and the points of the star are dimension tables.

  2. Snowflake Schema: The snowflake schema is an extension of the star schema. In a snowflake schema, each dimension is normalized and connected to more dimension tables.

Free Resources

Copyright ©2024 Educative, Inc. All rights reserved