How to use matplotlib's pyplot.scatter() function in Python

Overview

The pyplot.scatter() function in matplotlib creates a scatter plot. It is one of the many plotting functions provided by the pyplot module.

A scatter plot is a type of plot that uses dots to represent the values of two variables in a set of data. Each dot marks one observation, positioned by its x value and its y value. Scatter plots are used to show the relationship between two numeric variables, and the pattern formed by the dots can support conclusions in data analysis.

Creating a scatter plot

As mentioned earlier, we use the pyplot.scatter() function in matplotlib to create a scatter plot. The function takes two arrays of the same length: one holding the x-axis values and one holding the y-axis values.

Code

import matplotlib.pyplot as plt
import numpy as np
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([10, 50, 30, 40, 50, 25, 70, 15, 90, 65])
plt.scatter(x, y)
plt.show()

Output

The output of the code

Explanation

  • Line 1: We imported the pyplot module from matplotlib, which helps us create plots.
  • Line 2: We imported the numpy module, which helps us create arrays.
  • Lines 3 and 4: We created the arrays x and y, which hold the x-axis and y-axis values. They are of equal length.
  • Line 5: We used the pyplot.scatter() function to create a scatter plot of x and y.
  • Line 6: We used the pyplot.show() function to display the plot.
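
The requirement that both arrays have the same length can be checked with a short sketch. The array y_short and the flag mismatch_rejected are hypothetical names used only for illustration. The sketch assumes that matplotlib raises a ValueError when the sizes differ.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y_short = np.array([10, 50, 30])  # hypothetical array with too few values

try:
    plt.scatter(x, y_short)
    mismatch_rejected = False
except ValueError as err:
    # scatter() refuses to pair up arrays of different sizes
    mismatch_rejected = True
    print("scatter() failed:", err)
```

If the error appears in our own code, comparing len(x) and len(y) before calling pyplot.scatter() is usually the quickest way to find the mismatch.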

Multiple scatter plots

We can also draw multiple scatter plots on the same figure by calling pyplot.scatter() more than once before calling pyplot.show(). Data analysts often do this to compare different data variables.

Code

import matplotlib.pyplot as plt
import numpy as np
# the first scatter plot
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([10, 50, 30, 40, 50, 25, 70, 15, 90, 65])
plt.scatter(x, y)
# the second scatter plot
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y1 = np.array([5, 35, 40, 45, 80, 20, 95, 55, 70, 10])
plt.scatter(x, y1)
plt.show()

Output

The two colors in the plot distinguish the first and second scatter plots. By default, pyplot drew them in blue and orange, the first two colors in matplotlib's default color cycle.
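
When the default colors are not descriptive enough, one possible approach is to give each scatter plot its own color and label, then add a legend. The sketch below is illustrative only: the color names and the label text are assumptions chosen for this example, not part of the original code.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y = np.array([10, 50, 30, 40, 50, 25, 70, 15, 90, 65])
y1 = np.array([5, 35, 40, 45, 80, 20, 95, 55, 70, 10])

# color and label values are arbitrary choices for illustration
first = plt.scatter(x, y, color="green", label="first dataset")
second = plt.scatter(x, y1, color="purple", label="second dataset")

# the legend picks up the label of each scatter plot
legend = plt.legend()
```

Calling plt.show() after this code displays the figure with both datasets and a legend that identifies them.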
