How to create a parallel coordinates plot in Julia

Parallel coordinates plots are commonly used to visualize and analyze multivariate numerical data. They are well suited to comparing several variables or features at once and examining how they relate to one another.

A parallel coordinates plot is a visualization technique in which each variable or dimension gets its own vertical axis, and the axes are laid out parallel to one another. Depending on each variable's unit of measurement, every axis can keep its own scale, or all axes can be normalized to a common scale. Each data record is drawn as a line that connects its values across these axes.
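
As a rough illustration of the uniform normalization mentioned above, the sketch below rescales one variable to the range 0 to 1 using min-max scaling. The helper name minmax_normalize is hypothetical (it is not part of Julia or PlotlyJS), and min-max scaling is only one of several possible normalization schemes. The price values are the ones used in this article's example.

```julia
# Hypothetical helper: min-max scaling maps the smallest value to 0
# and the largest value to 1.
function minmax_normalize(xs::AbstractVector{<:Real})
    lo, hi = extrema(xs)
    # A constant column has no spread; return zeros to avoid dividing by zero.
    hi == lo && return zeros(Float64, length(xs))
    return (xs .- lo) ./ (hi - lo)
end

price = [800, 250, 700, 1400, 120]
normalized_price = minmax_normalize(price)
println(normalized_price)
```

Applying the same helper to every column would put all axes on a common 0-to-1 scale.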

Importance of axis order

When working with a parallel coordinates plot, the order of the axes affects how the reader interprets the data, because relationships are easiest to see between adjacent axes. A well-chosen order can help reveal patterns or correlations. Note that rendering too many variables or features may produce a cluttered, confusing chart.
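
One possible heuristic for choosing an axis order (an assumption here, not something this article prescribes) is to place strongly correlated variables on adjacent axes. The sketch below ranks every pair of variables by absolute Pearson correlation using the Statistics standard library. The names features, feature_names, and scored_pairs are illustrative, and the values are the ones used in this article's example.

```julia
using Statistics  # standard library; provides cor

# Appliance measurements, stored as a NamedTuple of columns.
features = (
    price  = [800, 250, 700, 1400, 120],
    height = [200, 150, 230, 540, 40],
    width  = [350, 250, 180, 120, 30],
)

feature_names = collect(keys(features))

# Absolute Pearson correlation for every unordered pair of variables.
scored_pairs = [(a, b, abs(cor(features[a], features[b])))
                for (i, a) in enumerate(feature_names)
                for b in feature_names[i+1:end]]

# Strongest relationship first; these two variables are good
# candidates for neighboring axes.
sort!(scored_pairs; by = last, rev = true)
strongest = scored_pairs[1]
println(strongest)
```

With only three variables the choice is easy to make by eye, but the same ranking could guide the ordering when there are many more columns.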

Let's look at the following example showing how to draw a parallel coordinates plot in Julia for multidimensional exploratory data analysis:

Example:

This example uses the PlotlyJS library to create a parallel coordinates plot. PlotlyJS is an interactive plotting library that supports more than 40 unique chart types for statistical, financial, geographic, scientific, and 3-dimensional data visualization.
Built on top of the Plotly JavaScript library, it lets users of Julia's ecosystem create attractive, insightful, and interactive web-based visualizations.

Run the following code to generate the plot.

using PlotlyJS, DataFrames

df = DataFrame(
    product_id = [1, 2, 3, 4, 5],
    product_name = ["Oven", "Microwave", "Dishwasher", "Refrigerator", "Toaster"],
    price = [800, 250, 700, 1400, 120],
    height = [200, 150, 230, 540, 40],
    width = [350, 250, 180, 120, 30],
);

mytrace = parcoords(; line = attr(color = df.product_id),  # color each line by product
    dimensions = [
        attr(range = [0, 10000],  # axis scale
             label = "price",     # axis label
             values = df.price),  # data plotted on this axis
        attr(range = [0, 1000],
             label = "height",
             values = df.height),
        attr(range = [0, 1000],
             label = "width",
             values = df.width),
    ]);
layout = Layout(title_text = "Parallel Coordinates Plot",
    title_x = 0.5,  # center the title horizontally
    title_y = 0,    # 0 places the title at the bottom of the figure
)
myplot = plot(mytrace, layout)

Explanation:

Let's go through the code above line by line to get a better understanding of this topic:

  • Line 1: Load the PlotlyJS and DataFrames packages.

  • Lines 3–9: Construct a sample DataFrame holding data about several household appliances. It includes features (dimensions) such as price, height, and width.

  • Lines 11–22: Call the parcoords function to create a parallel coordinates trace.
    For each product, we set a different line color based on its identifier:
    line = attr(color=df.product_id)
    We also specify which dimensions to plot, based on the product features price, height, and width. For each dimension, we specify the axis range, a label for the axis, and the series of values to plot on it.

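  • Lines 23–26: Define a layout for the generated plot. Within this layout, we specify a title and set its position using the parameters title_x and title_y: a title_x of 0.5 centers the title horizontally, and a title_y of 0 places it at the bottom of the figure.

  • Line 27: Call the plot function with the trace and the layout to render the parallel coordinates plot.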
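
The explanation above sets each axis range by hand. As a variation, the sketch below derives every range from the data and builds the dimensions in a chosen axis order. It uses the same attr, parcoords, Layout, and plot calls as the example, but the helper make_dimension and the variable axis_order are hypothetical names introduced here for illustration. Using each column's minimum and maximum as the range is an assumption, not the article's method.

```julia
using PlotlyJS, DataFrames

df = DataFrame(
    product_id = [1, 2, 3, 4, 5],
    price  = [800, 250, 700, 1400, 120],
    height = [200, 150, 230, 540, 40],
    width  = [350, 250, 180, 120, 30],
)

# Hypothetical helper (not part of PlotlyJS): describe one axis whose
# range spans the column's own minimum and maximum.
function make_dimension(df::DataFrame, col::Symbol)
    lo, hi = extrema(df[!, col])
    return attr(range = [lo, hi], label = String(col), values = df[!, col])
end

# Axes appear left to right in the order of this vector.
axis_order = [:price, :height, :width]
dims = [make_dimension(df, c) for c in axis_order]

trace = parcoords(; line = attr(color = df.product_id), dimensions = dims)
layout = Layout(title_text = "Parallel Coordinates Plot", title_x = 0.5)
p = plot(trace, layout)
```

Reordering the symbols in axis_order is enough to try a different arrangement of axes without touching the rest of the code.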

Copyright ©2024 Educative, Inc. All rights reserved