ADF Studio Designing Data Pipelines 1: Data Copy

Explore data pipeline design using the Azure portal.

This lesson focuses on designing data pipelines using the Azure portal UI. We'll delve into the tools the portal provides for pipeline design: the drag-and-drop interface, the interactive canvas, and the activity toolbox.

Designing data pipelines in ADF

Designing data pipelines in ADF involves creating a workflow that defines how data moves from source systems to target systems. A pipeline consists of activities, each of which represents one processing step, such as copying data from one location to another, transforming data, or running a custom activity. A pipeline can also include data flow activities, which define how the data is transformed as it moves through the pipeline.

Designing a pipeline includes selecting the right data sources and destinations, choosing the appropriate data integration technologies, defining the data transformation logic, and optimizing pipeline performance. ADF provides a wide range of tools and features for this work, including a visual designer, a code editor, and integration with other Azure services such as Azure Functions, Azure Databricks, and Azure Stream Analytics.
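
To make the "pipeline made of activities" idea concrete, here is a minimal sketch in Python that assembles a dictionary shaped like the JSON definition visible in ADF's code editor. The pipeline, activity, and dataset names are hypothetical, the helper functions are our own, and a real definition usually carries more settings (linked services, store settings, retry policy) than shown here.

```python
import json


def copy_activity(name, source_dataset, sink_dataset, source_type, sink_type):
    """Return a dict shaped like an ADF Copy activity (simplified sketch)."""
    return {
        "name": name,
        "type": "Copy",
        # Datasets are referenced by name; they are defined separately in ADF.
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": source_type},
            "sink": {"type": sink_type},
        },
    }


def pipeline(name, activities):
    """Wrap a list of activities in a pipeline definition."""
    return {"name": name, "properties": {"activities": activities}}


# Hypothetical names, chosen only for illustration.
definition = pipeline(
    "CopyCsvToSqlPipeline",
    [
        copy_activity(
            "CopyCsvToSql",
            "SourceCsvDataset",
            "SinkSqlDataset",
            "DelimitedTextSource",
            "AzureSqlSink",
        )
    ],
)

print(json.dumps(definition, indent=2))
```

Each additional processing step would be another entry in the activities list, which mirrors adding another box to the canvas in the visual designer.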

Data movement activities in ADF

Data movement activities in ADF are used to copy and transform data between different data sources and destinations. These activities enable the movement of data across various on-premises and cloud-based data stores, including SQL Server, Oracle, MySQL, PostgreSQL, Azure SQL Database, Azure Blob Storage, Azure Data Lake Storage, and more. Active loading activities are used to copy data in real time or near real time. They are useful for scenarios where data needs to be processed as soon as it is available, such as streaming data scenarios.

Azure supports both incremental and bulk data copy using Data Factory, and this documentation explains the steps for setting up that architecture.
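
The difference between the two copy modes can be sketched in a few lines of plain Python. This is a conceptual illustration only, not ADF code: it assumes each source row carries a `modified` timestamp and that the incremental mode tracks a "watermark" (the latest timestamp already copied), which is one common way to implement incremental loads. The function names and sample rows are hypothetical.

```python
from datetime import datetime


def bulk_copy(source_rows):
    """Bulk copy: take every row, regardless of when it changed."""
    return list(source_rows)


def incremental_copy(source_rows, watermark):
    """Incremental copy: take only rows modified after the stored watermark.

    Returns the changed rows and the watermark to store for the next run.
    """
    changed = [row for row in source_rows if row["modified"] > watermark]
    new_watermark = max((row["modified"] for row in changed), default=watermark)
    return changed, new_watermark


# Hypothetical sample data.
rows = [
    {"id": 1, "modified": datetime(2024, 1, 1)},
    {"id": 2, "modified": datetime(2024, 1, 5)},
    {"id": 3, "modified": datetime(2024, 1, 9)},
]

everything = bulk_copy(rows)
changed, watermark = incremental_copy(rows, datetime(2024, 1, 4))

print(len(everything), "rows in bulk copy")
print([row["id"] for row in changed], "copied incrementally")
print("next watermark:", watermark.date())
```

A bulk copy is the simpler choice for an initial load, while the incremental variant avoids re-copying unchanged rows on later runs at the cost of having to persist the watermark between runs.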

Types of data copy functionality

  • Copy data activity: The Copy data activity is the primary data movement activity in Azure Data Factory. It is a serverless data copy solution that can copy data from a variety of sources to a variety of destinations. It can replicate data from file-based systems such as Azure Blob Storage, Azure Data Lake Storage, and FTP, and from database systems such as SQL Server, Oracle, MySQL, ...