What is sklearn.datasets.load_iris(*[, return_X_y, as_frame])?

Scikit-learn is a popular machine learning library for Python. It implements many of the fundamental algorithms used in supervised and unsupervised learning.

To use scikit-learn, we import the library under its package name, sklearn, as shown below.

import sklearn

About the Iris dataset

The Iris dataset is one of the most popular datasets in data science. It is considered the ‘Hello World’ of machine learning and can be used to learn classification algorithms.

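The Iris dataset contains 150 samples from 3 species of Iris flowers (setosa, versicolor, and virginica), with 50 samples per species. Each sample records four measurements in centimeters (sepal length, sepal width, petal length, and petal width) along with its species label.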

The ‘scikit-learn’ package already comes with the Iris dataset preloaded.

Use the following statement to import the datasets package from sklearn. This gives us access to the other bundled datasets as well.

from sklearn import datasets
#this imports the package 'datasets' from sklearn
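When load_iris() is called with no arguments, it returns a dictionary-like Bunch object rather than plain arrays. The snippet below is a minimal sketch of how that object might be inspected; the variable name iris is chosen only for illustration.

```python
from sklearn import datasets

# With no arguments, load_iris() returns a Bunch (a dictionary-like object)
iris = datasets.load_iris()

# Feature matrix: one row per flower, one column per measurement
print(iris.data.shape)

# Names of the four measurement columns
print(iris.feature_names)

# Names of the three species that the integer labels 0, 1, 2 refer to
print(iris.target_names)
```

The Bunch keeps the feature and target names alongside the data, which is handy when the raw arrays alone would be hard to interpret.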

To load the Iris data as NumPy arrays, set the return_X_y parameter to True. The function then returns a (data, target) tuple instead of a Bunch object.

from sklearn import datasets
#return_X_y=True returns the features and the labels as two numpy arrays
iris_X, iris_y = datasets.load_iris(return_X_y=True)
#to view the iris_X feature array
print(iris_X)
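Printing the full array can be hard to read, so a quick check of the shapes and class balance may be more informative. The following is a small sketch along those lines; the use of np.unique here is just one possible way to count the labels.

```python
import numpy as np
from sklearn import datasets

iris_X, iris_y = datasets.load_iris(return_X_y=True)

# iris_X holds the measurements, iris_y holds the integer species labels
print(iris_X.shape)
print(iris_y.shape)

# Count how many samples belong to each label
labels, counts = np.unique(iris_y, return_counts=True)
print(dict(zip(labels.tolist(), counts.tolist())))
```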

To load the feature data (X) as a pandas DataFrame and the target labels (y) as a pandas Series, set the as_frame parameter to True in addition to return_X_y. This requires pandas to be installed.

from sklearn import datasets
#with as_frame=True, X is returned as a dataframe and y as a series
iris_X, iris_y = datasets.load_iris(return_X_y=True, as_frame=True)
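To confirm what as_frame=True returns, the objects can be inspected with ordinary pandas methods. This is a minimal sketch that assumes pandas is installed and that the scikit-learn version supports as_frame.

```python
from sklearn import datasets

iris_X, iris_y = datasets.load_iris(return_X_y=True, as_frame=True)

# Check the container types returned by as_frame=True
print(type(iris_X).__name__)
print(type(iris_y).__name__)

# The DataFrame columns carry the feature names
print(iris_X.head())

# Number of samples per integer species label
print(iris_y.value_counts())
```

Having named columns makes it easier to select a single measurement, for example iris_X["petal length (cm)"].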

The as_frame parameter was added in scikit-learn 0.23, so it is not available in version 0.22 and older. If you run into an error such as ‘unexpected keyword argument 'as_frame'’, you can upgrade your scikit-learn installation using one of these commands:

  • !pip install scikit-learn==0.24 in a Jupyter notebook cell (note that there must be no spaces around ==)

or

  • pip install --upgrade scikit-learn in your terminal.
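Before upgrading, it may help to check which version is currently installed. The sketch below reads sklearn.__version__ and compares it against 0.23; the variable name supports_as_frame is purely illustrative, and the parsing assumes the version string starts with numeric major and minor parts.

```python
import sklearn

# Take the major and minor parts of a version string such as "0.24.2"
major, minor = (int(part) for part in sklearn.__version__.split(".")[:2])

# as_frame is available from version 0.23 onward
supports_as_frame = (major, minor) >= (0, 23)

print(sklearn.__version__)
print(supports_as_frame)
```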
