AWS EMR
Learn about Amazon EMR and the open-source big data processing frameworks that it supports.
Amazon EMR (previously called Elastic MapReduce) is a cloud-based service offered by Amazon Web Services (AWS) that runs big data frameworks such as Apache Hadoop, Apache Spark, Apache HBase, and Presto on AWS for data processing, machine learning, and data analysis tasks. Because it’s a managed service, it removes much of the complexity of running big data infrastructure ourselves: processing power can scale with data volume, and we pay a per-second rate only for what we use. In this lesson, we'll learn about the features of EMR and how it works.
Amazon EMR cluster
The core processing unit of Amazon EMR is the cluster. A cluster is a group of Amazon EC2 instances working together as a single compute resource, where each instance is called a node. These nodes can be categorized into different types depending on the roles they ...
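To make the idea of a cluster as a group of EC2 instances more concrete, here is a minimal Python sketch that builds a request shaped like the EMR `RunJobFlow` API call. The role values `MASTER`, `CORE`, and `TASK` are the ones the EMR API accepts for instance groups. Everything else (the helper names `build_cluster_request` and `total_nodes`, the instance type, the release label, and the node counts) is an illustrative assumption, not something taken from this lesson.

```python
def build_cluster_request(name, core_count, task_count=0):
    """Build keyword arguments shaped like an EMR RunJobFlow request.

    Hypothetical helper for illustration only: it assembles a dictionary
    and never contacts AWS.
    """
    instance_type = "m5.xlarge"  # assumed instance type, chosen arbitrarily

    # Each instance group is a set of EC2 instances (nodes) sharing one role.
    groups = [
        {
            "Name": "Primary node",
            "InstanceRole": "MASTER",
            "InstanceType": instance_type,
            "InstanceCount": 1,
        },
        {
            "Name": "Core nodes",
            "InstanceRole": "CORE",
            "InstanceType": instance_type,
            "InstanceCount": core_count,
        },
    ]
    # Task nodes are optional, so add the group only when requested.
    if task_count > 0:
        groups.append(
            {
                "Name": "Task nodes",
                "InstanceRole": "TASK",
                "InstanceType": instance_type,
                "InstanceCount": task_count,
            }
        )

    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # assumed release label
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": groups,
            # Terminate the cluster once it has no more work to run.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # default role names; they must
        "ServiceRole": "EMR_DefaultRole",      # already exist in the account
    }


def total_nodes(request):
    """Count every EC2 instance (node) described by the request."""
    groups = request["Instances"]["InstanceGroups"]
    return sum(group["InstanceCount"] for group in groups)


if __name__ == "__main__":
    request = build_cluster_request("demo-cluster", core_count=2, task_count=3)
    print(total_nodes(request), "nodes in the cluster")
```

With AWS credentials and the two IAM roles in place, a dictionary like this could be passed to `boto3.client("emr").run_job_flow(**request)`. The sketch stops short of that call so it can run without an AWS account.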