Loading Image Dataset
Learn how to load and process image and tabular data.
Loading image dataset
Let’s now see how we can load image data. We’ll use the Cats and Dogs images. We start by extracting the dataset from the zip file.
import zipfile

with zipfile.ZipFile('../train.zip', 'r') as zip_ref:
    zip_ref.extractall('.')
In the code above:
Line 1: We import the zipfile library.
Lines 3–4: We call ZipFile() from the zipfile module to open the zip file in read mode as zip_ref. The with statement closes the file automatically when the block finishes. We then call the extractall() method to extract the contents of the zip file into the current directory.
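Before extracting a large archive, it can help to see what it contains. The following is a minimal sketch of that idea, not part of the lesson's code: the helper name extract_archive, the temporary demo archive, and the member names such as train/cat.0.jpg are assumptions made for illustration. It uses only the standard zipfile calls namelist() and extractall().

```python
import os
import tempfile
import zipfile


def extract_archive(zip_path, target_dir):
    """Extract zip_path into target_dir and return the archive's member names."""
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        names = zip_ref.namelist()       # list the members before extracting
        zip_ref.extractall(target_dir)   # missing directories are created as needed
    return names


# Demo on a throwaway archive so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, 'demo.zip')
    with zipfile.ZipFile(demo_zip, 'w') as zf:
        zf.writestr('train/cat.0.jpg', b'')   # hypothetical file names
        zf.writestr('train/dog.0.jpg', b'')

    out_dir = os.path.join(tmp, 'out')
    members = extract_archive(demo_zip, out_dir)
    extracted = sorted(os.listdir(os.path.join(out_dir, 'train')))
    print(members)
    print(extracted)
```

Passing an explicit target directory instead of '.' keeps the extracted files separate from the rest of the working directory.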
Next, we create a pandas DataFrame containing the labels and paths to the images.
import os
import pandas as pd
base_dir = 'train'
filenames = os.listdir(base_dir)
categories = []
for filename in filenames:
    category = filename.split('.')[0]  # text before the first dot, e.g. 'dog'
    if category == 'dog':
        categories.append("dog")
    else:
        categories.append("cat")
df = pd.DataFrame({'filename': filenames, 'category': categories})
print(df)
In the code above:
Lines 1–2: We import the os module and the pandas library as pd.
Lines 3–4: We define the base directory base_dir that contains the images for training the model. We call the listdir() function of the os module to get the names of all files in base_dir.
Line 5: We define a list categories to store the category of each file. ...
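The labeling step can also be written as a small helper. The sketch below is one possible variant, not the lesson's method: unlike the code above, which labels every file that is not a dog as a cat, it skips files whose names do not start with cat or dog. The names label_from_filename and build_records, the extra path field, and the sample file names are assumptions made for illustration.

```python
import os


def label_from_filename(filename):
    """Return 'cat' or 'dog' from the text before the first dot, else None."""
    prefix = filename.split('.')[0]
    return prefix if prefix in ('cat', 'dog') else None


def build_records(base_dir, filenames):
    """Build one dict per image with its file name, full path, and category."""
    records = []
    for name in filenames:
        label = label_from_filename(name)
        if label is None:
            continue  # skip files that do not follow the assumed naming pattern
        records.append({
            'filename': name,
            'path': os.path.join(base_dir, name),
            'category': label,
        })
    return records


# Hypothetical file names; '.DS_Store' stands in for a stray non-image file.
records = build_records('train', ['cat.0.jpg', 'dog.1.jpg', '.DS_Store'])
print(records)
```

A list of dictionaries like this can be passed directly to pd.DataFrame(records) to get one row per image.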