Defining a tf.data.Dataset
Learn to create the TensorFlow data pipeline.
Helper functions
Now, let’s look at how we can create a tf.data.Dataset using the data. We’ll first write a few helper functions. Namely, we’ll define:

- parse_image() to load and process an image from a filepath
- generate_tokenizer() to generate a tokenizer trained on the data passed to the function
The parse_image() function

First, let’s discuss the parse_image() function. It takes three arguments:

- filepath: Location of the image
- resize_height: Height to resize the image to
- resize_width: Width to resize the image to
The function is defined as follows:
def parse_image(filepath, resize_height, resize_width):
    """ Reading an image from a given filepath """

    # Read the raw bytes of the image
    image = tf.io.read_file(filepath)
    # Decode the JPEG and make sure there are three channels in the output
    image = tf.io.decode_jpeg(image, channels=3)
    # Convert to float32 pixel values in [0, 1]
    image = tf.image.convert_image_dtype(image, tf.float32)
    # Resize the image to resize_height x resize_width
    image = tf.image.resize(image, [resize_height, resize_width])
    # Bring pixel values to [-1, 1]
    image = image * 2.0 - 1.0
    return image
Read image from the path of the file
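To see how this helper plugs into a pipeline, here is a minimal, self-contained sketch of mapping parse_image() over a tf.data.Dataset of filepaths. The dummy image written to disk and its filename are purely illustrative (they are not part of the lesson's data), and the helper is repeated so the snippet runs on its own:

```python
import tensorflow as tf

def parse_image(filepath, resize_height, resize_width):
    """ Reading an image from a given filepath """
    image = tf.io.read_file(filepath)
    image = tf.io.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [resize_height, resize_width])
    return image * 2.0 - 1.0

# Hypothetical setup: write a tiny random JPEG so the example is runnable
dummy = tf.cast(tf.random.uniform([8, 8, 3], maxval=255), tf.uint8)
tf.io.write_file("dummy.jpg", tf.io.encode_jpeg(dummy))

# Build a dataset of filepaths and map the helper over it
filepaths = ["dummy.jpg"]
ds = tf.data.Dataset.from_tensor_slices(filepaths)
ds = ds.map(lambda fp: parse_image(fp, 224, 224))

for image in ds:
    print(image.shape)  # (224, 224, 3)
```

Because parse_image() works on a single filepath tensor, Dataset.map() can apply it lazily to every element, which is what makes it a convenient building block for the full input pipeline.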
We are mostly relying on tf.io and tf.image functions to load and process the image. This function specifically:
Reads the image from the ...