Spark and Big Data
Dig deeper into the Spark data processing model and its architecture.
Big data primer
Before we describe the processing model that Spark fits into, both in the context of this course and of big data in general, it’s important to explain what big data means.
The term big data fundamentally refers to a range of technologies, each built around a particular strategy for processing large datasets.
The word “large” has traditionally implied that the dataset being processed holds more information than a single resource, such as a lone server or computer, can realistically process. Because available processing power and business needs are constantly changing, the word also implies that no specific figure marks the size at which a dataset becomes large.
As vague as it might seem, “big” is an appropriate word for datasets that are not defined by a fixed size limit but still represent vast volumes of information. Big data solutions therefore aim to solve the problems that conventional methods face when working with such datasets. ...