Data Clustering
Build on the fundamentals of data mining for TSP by learning clustering techniques.
Data visualizations help extract knowledge and insights from datasets. Beyond marking a location, a map can carry additional key figures at each coordinate. For example, the color of a location marker can encode the level of sales, and even time series diagrams can be plotted over a location. Often, it is only when looking at the
Data mining refers to the process of discovering patterns and extracting useful information from (large) datasets. Clustering is one of the techniques used in data mining: it groups data points into clusters so that the points within a cluster are similar to each other and different from the points in other clusters. This helps identify patterns in the data that can be useful in a variety of applications, such as
Data mining
Our boss asked us to figure out the shortest total distance between stores. But as proactive data scientists, we want to deliver more than what was asked of us. After all, much of the value lies in combining different data sources. Given our strong connection with the sales department, obtaining the daily sales data of the stores is easy.
Sample Extract
SalesDate | SalesValue | Store |
31.01.2020 | 39.0 | 1 |
31.01.2020 | 2560.0 | 2 |
31.01.2020 | 4476.0 | 3 |
Note: The sales dataset identifies stores by number, not by name. Store 1 corresponds to StoreA, Store 2 corresponds to StoreB, and so on. Such mismatches often occur in practice when we merge master data from different systems.
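The mapping described in the note can be sketched in a few lines of Python. This is a minimal illustration, not the lesson's own code: the helper names `store_name` and `parse_sales` are hypothetical, and the rule "store number n maps to the n-th letter of the alphabet" is an assumption read from "and so on" that only holds for store numbers 1 to 26.

```python
from datetime import datetime

# The three rows of the sample extract, in the same pipe-separated layout.
SAMPLE = """SalesDate | SalesValue | Store |
31.01.2020 | 39.0 | 1 |
31.01.2020 | 2560.0 | 2 |
31.01.2020 | 4476.0 | 3 |
"""


def store_name(store_number):
    """Map 1 -> 'StoreA', 2 -> 'StoreB', ... (assumed rule, valid for 1..26)."""
    if not 1 <= store_number <= 26:
        raise ValueError(f"no letter available for store number {store_number}")
    return "Store" + chr(ord("A") + store_number - 1)


def parse_sales(text):
    """Parse the extract into dicts and attach the matching store name."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        # The trailing '|' yields an empty fourth field, so keep the first three.
        date_str, value_str, store_str = [c.strip() for c in line.split("|")[:3]]
        store_number = int(store_str)
        rows.append({
            "SalesDate": datetime.strptime(date_str, "%d.%m.%Y").date(),
            "SalesValue": float(value_str),
            "Store": store_number,
            "StoreName": store_name(store_number),
        })
    return rows


rows = parse_sales(SAMPLE)
for row in rows:
    print(row["SalesDate"], row["StoreName"], row["SalesValue"])
```

With a real master-data table, an explicit lookup (store number to store name) would be safer than deriving the name from the number.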
RFM clustering
Recency, Frequency, and Monetary value (RFM) clustering is an effective customer segmentation technique. It can help our sales colleagues to make better strategic decisions because we can quickly ...
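As a rough sketch of what the three RFM figures look like for daily sales data, the snippet below computes them per store. Several things here are assumptions rather than the lesson's method: RFM is applied to stores instead of customers, recency is measured in days between a chosen reference date and the last day with sales, frequency counts the days with sales, and monetary is the summed sales value. The function name `rfm_per_store` is hypothetical, and only the rows dated 31.01.2020 come from the sample extract; the other rows are invented for illustration.

```python
from collections import defaultdict
from datetime import date

# (sales_date, sales_value, store). Only the 31.01.2020 rows come from the
# sample extract; the rows marked "hypothetical" are invented for illustration.
SALES = [
    (date(2020, 1, 29), 120.0, 1),   # hypothetical
    (date(2020, 1, 30), 2400.0, 2),  # hypothetical
    (date(2020, 1, 30), 4100.0, 3),  # hypothetical
    (date(2020, 1, 31), 39.0, 1),
    (date(2020, 1, 31), 2560.0, 2),
    (date(2020, 1, 31), 4476.0, 3),
]


def rfm_per_store(sales, reference_date):
    """Return {store: {'recency', 'frequency', 'monetary'}} under the assumed definitions."""
    sales_days = defaultdict(set)   # distinct days with sales, per store
    totals = defaultdict(float)     # summed sales value, per store
    for sales_date, value, store in sales:
        if value > 0:               # ignore days without sales
            sales_days[store].add(sales_date)
            totals[store] += value
    return {
        store: {
            "recency": (reference_date - max(days)).days,
            "frequency": len(days),
            "monetary": totals[store],
        }
        for store, days in sales_days.items()
    }


rfm = rfm_per_store(SALES, reference_date=date(2020, 2, 1))
for store, metrics in sorted(rfm.items()):
    print(store, metrics)
```

The resulting three numbers per store could then serve as the features that a clustering algorithm groups on.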