AWS DataSync

Discover how AWS DataSync can be used to migrate data between file and object storage systems.

AWS DataSync is a service for migrating data between file storage systems and AWS storage services such as Amazon S3 and EFS, as well as Snowcone devices. It launched in 2018 and has since been expanded to support migration from storage in other clouds, such as Google Cloud Storage and Azure Files.

Supported non-AWS storage

AWS DataSync supports migrating files to and from these non-AWS storage mechanisms by using a DataSync agent:

  • NFS: It’s a protocol with roots in the Unix operating system that is also supported on Windows Server and macOS. File shares start with nfs://.

  • SMB: It’s a protocol implemented by Windows, Azure Files, and macOS. File shares start with smb://.

  • HDFS: It’s a system within the open-source Apache Hadoop software framework. File shares start with hdfs://.

  • Object Storage: It’s a storage model offered by Google Cloud Storage and others. Access is authenticated with a Hash-based Message Authentication Code (HMAC) key.

In most cases, those who use these storage mechanisms with AWS DataSync already have data on physical devices that support one or more of these protocols. For example, an organization’s IT department might run on-premises infrastructure that uses NFS, SMB, or HDFS to share files.
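
As a rough illustration of the URI prefixes listed above, the following Python sketch maps a file-share URI to its protocol. It is not part of DataSync or any AWS SDK: the function name classify_share, the lookup table, and the example hostnames are hypothetical, and object storage is left out because the list above identifies it by HMAC key rather than by a URI prefix.

```python
from urllib.parse import urlparse

# Prefixes taken from the list above; the labels are illustrative only.
SCHEME_TO_PROTOCOL = {
    "nfs": "NFS",
    "smb": "SMB",
    "hdfs": "HDFS",
}


def classify_share(uri: str) -> str:
    """Return the protocol label for a file-share URI (hypothetical helper)."""
    # urlparse lowercases the scheme, so "SMB://..." and "smb://..." match alike.
    scheme = urlparse(uri).scheme
    try:
        return SCHEME_TO_PROTOCOL[scheme]
    except KeyError:
        raise ValueError(f"unsupported or missing scheme in {uri!r}") from None


if __name__ == "__main__":
    # Example hostnames and paths are made up for demonstration.
    for share in (
        "nfs://fileserver.example.com/exports/data",
        "smb://winserver.example.com/shared",
        "hdfs://namenode.example.com:8020/user/data",
    ):
        print(share, "->", classify_share(share))
```

A check like this could run before any migration is configured, to catch a mistyped share address early.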

Supported AWS storage

AWS DataSync supports the transfer of data between these AWS storage services:

  • S3: Launched in 2006, S3 is one of the most popular AWS services and is considered a scalable and cost-effective way to store a variety of objects. Amazon’s Data Lake on AWS architecture recommends S3 as the centralized location to store data of all formats.

  • EFS: Launched in 2016, EFS is a cloud storage service that supports the NFS ...