An Example of K-Means From the Wild: TERA Online
Learn about a dataset taken from TERA, its normalization, scaling, and k-means results.
To put the k-means method in the context of game data science, we work through an example that uses k-means clustering to develop player profiles from TERA.
Dataset
The dataset from TERA is from the game’s open beta (character levels 1–32 only) and contains the following behavioral variables (or features in data mining terminology):
- Quests completed: This is the number of quests completed.
- Friends: This is the number of friends in the game.
- Achievements: This is the number of achievements earned.
- Skill levels: These are the character’s levels in the mining skill and the plants skill.
- Monster kills: This is the number of AI-controlled enemies killed by the character (combining small, medium, and large monsters in one feature).
- Deaths by monsters: This is the number of times AI-controlled enemies have killed the character.
- Total items looted: This is the total number of items the character has picked up during the game.
- Auction house use: This is the combined number of times the character has either created an auction or purchased something from an auction.
- Character level: This ranges from level 1 to 32. In this example, we’ll focus on level 32 players. If we used all players, the cluster analysis would simply give us level-dependent clusters, because the values of the other variables grow with character level: a level 32 character might have completed, say, 1000 quests, whereas a level 1 character might have completed 2.
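As a minimal sketch of the level restriction described above, the following snippet keeps only level 32 characters before any clustering is attempted. The field names (`character_id`, `level`, `quests_completed`, `monster_kills`) and all values are hypothetical, invented for illustration; they are not taken from the TERA dataset.

```python
# Toy records with hypothetical field names; the values are invented
# for illustration and do not come from the TERA dataset.
characters = [
    {"character_id": 1, "level": 1, "quests_completed": 2, "monster_kills": 15},
    {"character_id": 2, "level": 17, "quests_completed": 310, "monster_kills": 2400},
    {"character_id": 3, "level": 32, "quests_completed": 980, "monster_kills": 9100},
    {"character_id": 4, "level": 32, "quests_completed": 1020, "monster_kills": 12500},
]


def select_level(records, level=32):
    """Return only the records at the given character level."""
    return [r for r in records if r["level"] == level]


# Restrict the analysis to max-level characters so that clusters reflect
# differences in play style rather than differences in character level.
level_32 = select_level(characters)
print([r["character_id"] for r in level_32])
```

Filtering first means that the remaining features are compared among characters at the same point of progression, which is the motivation given in the text.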
Data preparation and analysis
Behavioral telemetry can suffer from quality problems, so the data was cleaned before clustering. Incomplete records were removed, and each feature was analyzed to identify outliers and to check its distribution.
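The cleaning steps above could be sketched roughly as follows, using only the Python standard library. The source does not say which outlier or distribution checks were used, so the 1.5 × IQR rule and the skewness measure here are assumptions chosen for illustration, as are the feature names and the toy values.

```python
import statistics

# Hypothetical feature names; the real dataset's column names are not given.
FEATURES = ["quests_completed", "monster_kills"]


def drop_incomplete(records, features=FEATURES):
    """Keep only records where every feature is present (not None)."""
    return [r for r in records if all(r.get(f) is not None for f in features)]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def skewness(values):
    """Population skewness; values far from 0 suggest an asymmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return 0.0
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Invented toy data: one record is missing a value, one has an extreme value.
kills_raw = [900, 950, 1000, 1050, 1100, 980, 1020, 9000, None]
records = [
    {"quests_completed": 100 + i, "monster_kills": k}
    for i, k in enumerate(kills_raw)
]

clean = drop_incomplete(records)
kills = [r["monster_kills"] for r in clean]
outliers = iqr_outliers(kills)

print(len(records), "records,", len(clean), "after removing incomplete ones")
print("outliers in monster_kills:", outliers)
print("skewness of monster_kills:", skewness(kills))
```

Whether flagged outliers are removed, capped, or kept is a separate decision that depends on the analysis; this sketch only identifies them.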