The term data mining originated in the early 1990s and became popular years after the advent of big data. Several industries use the
We are drowning in data but starving for knowledge.
Because of
The tools for processing data and extracting information have proven helpful in a wide variety of fields, including but not limited to:
Businesses can use the classification and characterization that many algorithms provide to make critical decisions that would otherwise take a substantial amount of time.
Stores and shopping centers can use frequent pattern mining for targeted marketing and direct mail campaigns. Similarly, brokerage houses can identify patterns in the behavior of the stock market. Subscription-based services also use pattern mining to identify the customers most likely to leave and those willing to upgrade their subscription tier.
Market basket analysis identifies the items frequently bought together in a single transaction, enabling stores and shopping centers to offer bundle discounts on those products, which can in turn attract more customers.
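As a minimal sketch of the idea behind market basket analysis, the snippet below counts how often each pair of items appears in the same transaction and keeps the pairs that meet a minimum support threshold. The function name `frequent_pairs`, the `min_support` parameter, and the sample transactions are all hypothetical and invented for illustration; real systems typically use algorithms such as Apriori or FP-Growth over much larger datasets.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support is at least min_support.

    Support is the fraction of transactions that contain both items.
    """
    pair_counts = Counter()
    for basket in transactions:
        # Deduplicate and sort so each pair is counted once per basket
        # and always appears in the same (alphabetical) order.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1

    total = len(transactions)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


# Hypothetical transactions, made up for this example.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["milk", "eggs"],
    ["bread", "butter"],
    ["bread", "butter", "milk"],
]

pairs = frequent_pairs(transactions, min_support=0.6)
for pair, support in sorted(pairs.items()):
    print(pair, support)
```

With this sample data, only the pairs that occur in at least three of the five baskets survive the threshold, which is the kind of signal a store could use when deciding which products to bundle.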
The internet is full of web pages. A web search engine scours the internet for the most relevant pages required by the user. These engines operate on algorithms that identify the relevance of a page by how many "hits" it has had. A data mining technique called the link analysis
Correlation and covariance between the attributes of a multidimensional dataset tell us how one feature relates to another. These measures can be applied to crime reports. A heatmap gives us the visual representation of the
Time analysis of the data can give us an understanding of when a particular crime tends to be committed, which can help reduce crime rates. Below is some helpful information derived from a Los Angeles police station's crime report dataset.
Similarly, spatial analysis can help narrow down areas where a particular crime is likely to occur, as shown in the following illustration.
In the context of the same dataset, researchers can use clustering techniques, for statistical purposes, to identify the groups of individuals (based on their age or ethnicity) toward which certain crimes are directed.
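The correlation and time analysis ideas described in this section can be sketched in a few lines. The example below is only an illustration under stated assumptions: the timestamps and the two numeric attributes (`victim_age` and `report_delay_days`) are invented, and they are not taken from the Los Angeles dataset mentioned above. The helper names `pearson` and `incidents_by_hour` are likewise hypothetical.

```python
from collections import Counter
from datetime import datetime
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalized covariance and variances; the 1/n factors cancel out.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def incidents_by_hour(timestamps):
    """Count incidents per hour of day from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


# Hypothetical incident timestamps, made up for this example.
timestamps = [
    "2020-01-01T23:15:00",
    "2020-01-02T23:40:00",
    "2020-01-03T02:05:00",
    "2020-01-04T23:55:00",
]
by_hour = incidents_by_hour(timestamps)
peak_hour, peak_count = by_hour.most_common(1)[0]
print("Peak hour:", peak_hour, "with", peak_count, "incidents")

# Hypothetical numeric attributes, made up for this example.
victim_age = [20, 30, 40, 50]
report_delay_days = [1, 2, 3, 4]
r = pearson(victim_age, report_delay_days)
print("Correlation:", r)
```

Computing `pearson` for every pair of numeric attributes gives the matrix that a heatmap would visualize, and the per-hour counts are the raw numbers behind a time-of-day chart.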
Scientific experiments can be simulated on computers without physically setting up the experiment, which might be too difficult to do in some cases. Additionally, weather forecasting and other computationally expensive tasks produce huge volumes of data, which can be warehoused and pre-processed later to visualize the insights they contain.
A suitable example is the imaging of an actual black hole. The project led by Dr. Sheperd Doeleman used a planet-wide network of telescopes to collect 5,000 trillion bytes of data, and advanced algorithms and supercomputers helped the researchers reconstruct the picture of the black hole.
Some noteworthy data mining applications include: