Unified Governance

Learn about data governance, data-related regulations and considerations, and AWS Lake Formation.

Broadly speaking, data governance describes how organizations manage their available data, including data from their customers and users. It involves concepts of data quality, integrity, and security.

In this lesson, we consider this topic broadly and then dive into AWS tools for unified governance.

Finding a balance

Finding a good balance for data governance policies is much like finding one for computer and physical security. On the one hand, we want a lot of freedom in what we can access and do; on the other hand, we need some precautionary measures to avoid serious problems. The right balance also depends on context: security in a small, peaceful town differs from that in a large metropolis where car break-ins and other crimes occur more frequently.

Here are two examples of security measures that trade convenience for protection:

  • Few people enjoy standing in line at airport security and then getting checked by various humans and machines. However, we’ve more or less accepted this situation (except perhaps when we’re running late for our flight).

  • Requirements for passwords and multifactor authentication can also get in the way of legitimate users. We’ve had to call AWS customer support because we switched phones and no longer had the multifactor authentication app necessary to access our AWS account.

With regard to data quality and integrity, the 80/20 rule (or Pareto principle) comes to mind. Issues related to data quality and integrity resemble bugs in software: a small number of causes tends to account for most of the problems. In developing versions of Windows and Office, Microsoft saw that 20% of the bugs led to 80% of all errors seen by users, and 1% of bugs led to 50% of all errors seen by users.
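To make the 80/20 idea concrete, here is a minimal Python sketch of a Pareto check applied to data-quality issues. Everything in it is an assumption made for illustration: the function name pareto_fraction, the issue names, and the error counts are hypothetical and do not come from Microsoft's data or from any AWS tool.

```python
from collections import Counter


def pareto_fraction(error_counts, target_share=0.8):
    """Return the smallest fraction of issues that accounts for at least
    `target_share` of all observed errors."""
    # Rank issues from most to least frequent.
    counts = sorted(error_counts.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    running = 0
    for rank, count in enumerate(counts, start=1):
        running += count
        if running >= target_share * total:
            return rank / len(counts)
    return 1.0


# Hypothetical error log: each entry names the issue behind one observed error.
error_log = (
    ["missing_customer_id"] * 55
    + ["bad_date_format"] * 30
    + ["duplicate_row"] * 5
    + ["negative_price"] * 3
    + ["unknown_country_code"] * 2
    + ["truncated_text"]
    + ["wrong_currency"]
    + ["stale_record"]
    + ["encoding_error"]
    + ["mixed_units"]
)
error_counts = Counter(error_log)

share = pareto_fraction(error_counts, target_share=0.8)
print(f"{share:.0%} of issues account for at least 80% of errors")
```

With this made-up log, two of the ten issues cover most of the errors, so fixing those two first would likely give the largest improvement for the least effort. Real data will rarely split this cleanly.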

Improving data quality and integrity ...