Big data refers to data sets so large and complex that traditional data tools struggle to store, process, and retrieve them. The volume of this data grows rapidly over time, which compounds the difficulty.
Common sources of big data include customer databases, emails, mobile applications, and social networking applications.
The V's of big data are the dimensions used to characterize it. Understanding them helps people derive more valuable information from the data, which in turn allows organizations to become more client-focused. In 2001, big data was initially characterized by three V's: volume, velocity, and variety. As the understanding of big data advanced, two more dimensions, veracity and value, were added, expanding the list to five V's. Subsequently, other variants with eight and ten V's emerged, reflecting the ongoing development and complexity of the big data field.
We will now discuss the five V's of big data.
Volume refers to the size or amount of the data. The rapid increase in the volume of big data is driven by cloud computing, IoT devices, and mobile traffic.
An example is Shopify, an eCommerce platform generating millions of data points from daily active users worldwide.
Velocity refers to the speed at which data is collected or accumulated. A prominent example is Google, which is commonly estimated to handle about 8.5 billion searches a day.
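To make velocity concrete, a daily event count can be converted into an average per-second arrival rate. The following is a minimal sketch: the function name `events_per_second` and the sample workload are illustrative assumptions, not measurements from any real platform.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def events_per_second(events_per_day: float) -> float:
    """Return the average number of events arriving each second."""
    return events_per_day / SECONDS_PER_DAY


# Hypothetical workload: 86.4 million events collected per day.
rate = events_per_second(86_400_000)
print(f"{rate:.0f} events per second on average")
```

An average like this hides bursts, so a real system would usually also track peak rates over short windows.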
Variety refers to the diversity of data types. Data can be classified into three categories:
Structured data
Structured data is organized in a fixed, predefined format. Examples include transaction data and bank statements that record the exact amount, time, and date.
Semi-structured data
Semi-structured data does not conform to a rigid formal structure, such as a relational schema, but still carries tags or markers that organize its content. Examples include markup languages such as XML, zipped files, and data gathered from different web pages or sources.
Unstructured data
Unstructured data is unorganized data that does not fit into the typical relational database structure of rows and columns. Examples include document collections, text files, images, audio, and video files.
A good example of variety is social media platforms, which collect diverse data types such as pictures, videos, text, and locations.
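The three categories above call for different handling in code. The sketch below uses only the Python standard library, and every sample value in it (the amounts, names, and text) is made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: fixed columns, so every row can be read by column name.
structured = "amount,date\n250.00,2024-01-15\n"
rows = list(csv.DictReader(io.StringIO(structured)))
amount = float(rows[0]["amount"])

# Semi-structured: tags describe the content, but no rigid schema is enforced.
semi_structured = "<post><user>alice</user><location>Paris</location></post>"
post = ET.fromstring(semi_structured)
user = post.findtext("user")

# Unstructured: free text with no fields, so only generic processing applies.
unstructured = "Had a great time at the concert last night!"
word_count = len(unstructured.split())

print(amount, user, word_count)
```

The structured and semi-structured samples can be queried by field name, while the unstructured text needs further analysis before any specific fact can be extracted from it.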
Veracity refers to how accurate and trustworthy the data is. It covers the assurance of accuracy, credibility, integrity, and quality. Incorrect data may lead to targeting the wrong customers and sending misleading communications, which ultimately causes losses for businesses.
Financial institutions such as central banks and internet banks analyze data from various sources, such as customer information, market feeds, and transactions, to verify its accuracy and support better decision-making.
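Veracity checks are often implemented as simple validation rules applied to each record. The following is a minimal sketch, assuming hypothetical transaction records with `id`, `customer`, and `amount` fields. The function name `find_veracity_issues` and the three rules are illustrative choices, not a standard.

```python
def find_veracity_issues(records):
    """Return (index, problem) pairs for records that look unreliable."""
    issues = []
    seen_ids = set()
    for i, rec in enumerate(records):
        # Rule 1: a customer name must be present.
        if rec.get("customer") in (None, ""):
            issues.append((i, "missing customer"))
        # Rule 2: the amount must be a non-negative number.
        amount = rec.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            issues.append((i, "invalid amount"))
        # Rule 3: each transaction id may appear only once.
        if rec.get("id") in seen_ids:
            issues.append((i, "duplicate id"))
        seen_ids.add(rec.get("id"))
    return issues


# Made-up sample data containing one problem of each kind.
transactions = [
    {"id": 1, "customer": "alice", "amount": 120.0},
    {"id": 2, "customer": "", "amount": 75.5},
    {"id": 3, "customer": "bob", "amount": -10.0},
    {"id": 1, "customer": "alice", "amount": 120.0},
]
issues = find_veracity_issues(transactions)
print(issues)
```

Flagging records rather than silently dropping them keeps the decision about what to do with doubtful data in the hands of the analyst.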
Value refers to how helpful the data is in the decision-making process. For example, a retail company that uses analytics to gain meaningful insights into customer purchasing patterns can run personalized marketing campaigns and improve the customer experience.
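As a rough illustration of extracting value from purchasing patterns, the sketch below finds each customer's most frequently purchased category, which could then inform a personalized campaign. The purchase records and category names are invented for the example.

```python
from collections import Counter, defaultdict

# Made-up purchase log: (customer, product category) pairs.
purchases = [
    ("alice", "books"), ("alice", "books"), ("alice", "music"),
    ("bob", "games"), ("bob", "games"), ("bob", "books"),
]

# Count how often each customer buys from each category.
by_customer = defaultdict(Counter)
for customer, category in purchases:
    by_customer[customer][category] += 1

# Pick the single most frequent category per customer.
favorites = {
    customer: counts.most_common(1)[0][0]
    for customer, counts in by_customer.items()
}
print(favorites)
```

Real analytics pipelines use far richer signals, but the principle is the same: raw records become valuable once they are summarized into something a decision can be based on.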
Other V’s include:
Variability (inconsistency in the data).
Validity (how correct the data is for different purposes).
Viability (predicting the most relevant outcomes).
Viscosity (degree of correlation).
Volatility (rate of change of the data).
Test your understanding.
Which of the following characteristics of big data refers to the usefulness of the data for achieving relevant outcomes?
Variety
Viability
Variability
Validity
Value