What is an affine transformation?

Data is enormously valuable today. Many companies spend heavily on collecting suitable data because only a handful of open-source datasets are available. One way to address this shortage is to create new data artificially. For image data, we can do this by applying different transformations to existing images. This process of artificially generating new data is called data augmentation.

An affine transformation is a linear transformation (such as scaling, shear, or rotation) combined with a translation. It maps straight lines to straight lines, keeps parallel lines parallel, and preserves the ratio of distances between points that lie on the same line. Affine transformations are used in satellite image processing, data augmentation for images, and so on.

Each transformation is performed by multiplying the pixel coordinates by a matrix M. Different transformations use different matrices (called kernels below), and each kernel produces its transformation when it is multiplied by the coordinates of the image's pixels. An affine transformation can consist of the following transformations:

  • Scaling
  • Translation
  • Shear
  • Rotation

Note: A combination of these transformations is also an affine transformation.
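
As a rough illustration of this note, the sketch below composes a scaling and a translation by multiplying their 3×3 matrices, using the standard homogeneous-coordinate forms covered in the sections that follow. The helper names (matmul, scaling, translation) and the parameter values are hypothetical, chosen only for this example.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def scaling(sx, sy):
    """Standard homogeneous scaling matrix (illustrative helper)."""
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def translation(tx, ty):
    """Standard homogeneous translation matrix (illustrative helper)."""
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


# Scale first, then translate: the rightmost matrix is applied first.
combined = matmul(translation(10, 20), scaling(2, 3))
print(combined)  # [[2, 0, 10], [0, 3, 20], [0, 0, 1]]
```

The bottom row of the product is still [0, 0, 1], which is why the combined matrix is again an affine transformation.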

Mathematical background

As mentioned earlier, matrix multiplication plays a big role in affine transformations. We first take a point X with coordinates x and y from the image and represent it as a vector with a third component set to 1 (homogeneous coordinates). This third component is important because, without it, translation could not be written as a matrix multiplication: a 2×2 matrix can only express linear maps such as scaling, shear, and rotation.
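
To see what the extra component provides, the block below writes the point in this form and multiplies it by a general 3×3 matrix. The entries a, b, c, d, t_x, and t_y are placeholder symbols introduced here for illustration. The constant 1 in the vector is what lets the offsets t_x and t_y be added to the result.

```latex
\[
X = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix},
\qquad
\begin{bmatrix} a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
=
\begin{bmatrix} a x + b y + t_x \\ c x + d y + t_y \\ 1 \end{bmatrix}
\]
```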

Now, if we want to transform this point X into X', we multiply X by a matrix M.
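
As a minimal sketch of this multiplication, the pure-Python helper below applies a 3×3 matrix to a point written as [x, y, 1]. The function name apply_matrix and the example matrix shift are hypothetical and not part of any library.

```python
def apply_matrix(m, point):
    """Return the (x', y') part of M multiplied by [x, y, 1]."""
    x, y = point
    vec = (x, y, 1)  # homogeneous form of the point
    out = [sum(m[i][k] * vec[k] for k in range(3)) for i in range(3)]
    return out[0], out[1]


# Hypothetical example matrix: shift by 5 in x and by -2 in y.
shift = [[1, 0, 5],
         [0, 1, -2],
         [0, 0, 1]]

moved = apply_matrix(shift, (3, 4))
print(moved)  # (8, 2)
```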

Scaling

When we scale an image, we shrink or expand it. The matrix M for scaling is as follows:
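
In homogeneous coordinates, the standard textbook form of this matrix is sketched below. It is given here as an assumption, since individual libraries may apply the inverse mapping or order the axes differently.

```latex
\[
M_{\text{scale}} =
\begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
M_{\text{scale}} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
= \begin{bmatrix} s_x x \\ s_y y \\ 1 \end{bmatrix}
\]
```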

Here, s_x and s_y are the scaling parameters along the x and y axes, respectively.

Now we'll use tf.keras.preprocessing.image.apply_affine_transform to apply the transformations. We can find the details about the parameters in the official documentation.

Code implementation

Let's look at the code below:

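import cv2
import tensorflow as tf

# Read the image with OpenCV (BGR order) and convert it to RGB
image = cv2.imread("Detective.png")
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

transformation = tf.keras.preprocessing.image.apply_affine_transform(
    image,  # input image
    zx=0.5,  # zoom in the x direction
    zy=0.5,  # zoom in the y direction
    row_axis=0,
    col_axis=1,
    channel_axis=2
)

  • Lines 5 and 6: We use the OpenCV library to read the input image. OpenCV returns the image in BGR order, so we convert it to RGB.
  • Lines 10 and 11: zx and zy are the scaling parameters.
  • Lines 12 to 14: row_axis, col_axis, and channel_axis tell the function which array axes hold the rows, columns, and color channels. An image loaded with OpenCV has the shape (height, width, channels), so these are 0, 1, and 2. The values may differ if we use another library or array layout to load the image.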

The output of scaling is as follows:

The scaled image output

Translation

Translation means shifting the image to a new position without changing its size or orientation. The kernel for translation is as follows:
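
The standard homogeneous-coordinate form of this kernel is sketched below, as an assumption based on the textbook definition rather than on any particular library's internals. The offsets sit in the third column, which is why the point needs its extra component set to 1.

```latex
\[
M_{\text{translate}} =
\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix},
\qquad
M_{\text{translate}} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
= \begin{bmatrix} x + t_x \\ y + t_y \\ 1 \end{bmatrix}
\]
```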

Here, t_x and t_y are the translation parameters along the x and y axes, respectively.

Code implementation

Let's look at the code below:

transformation = tf.keras.preprocessing.image.apply_affine_transform(
    image,
    tx=-400,
    ty=400,
    row_axis=0,
    col_axis=1,
    channel_axis=2
)

  • Lines 3 and 4: tx and ty are the translation parameters.

The output of the translation transform is as follows:

The translated image

Shear

In this transformation, the image is slanted in the x or y direction.

The kernel for horizontal shear is:
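
A common textbook form of the horizontal shear kernel is sketched below. It is stated as an assumption, since sign and axis conventions vary between references. Each point moves along x by an amount proportional to its y coordinate.

```latex
\[
M_{\text{shear},h} =
\begin{bmatrix} 1 & s_h & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
x' = x + s_h\, y, \quad y' = y
\]
```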

And the kernel for vertical shear is:
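
Under the same textbook convention (again an assumption rather than a specific library's definition), the vertical shear kernel moves each point along y by an amount proportional to its x coordinate.

```latex
\[
M_{\text{shear},v} =
\begin{bmatrix} 1 & 0 & 0 \\ s_v & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
x' = x, \quad y' = y + s_v\, x
\]
```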

Here, s_h and s_v are the parameters for horizontal and vertical shearing, respectively.

Code implementation

Let's look at the code below:

transformation = tf.keras.preprocessing.image.apply_affine_transform(
    image,
    shear=30,
    row_axis=0,
    col_axis=1,
    channel_axis=2
)

  • Line 3: In the implementation provided by tf.keras.preprocessing.image.apply_affine_transform, we provide the shear as an angle in degrees.

The output of the shearing transform is as follows:

The shearing transform output

Rotation

In rotation, we rotate the image by an angle θ. The kernel for this transformation is as follows:
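
The standard rotation matrix about the origin, written in homogeneous coordinates, is sketched below. Treat the direction as an assumption: whether a positive θ appears clockwise or counterclockwise on screen depends on how the image axes are oriented and on the library's convention.

```latex
\[
M_{\text{rotate}} =
\begin{bmatrix}
\cos\theta & -\sin\theta & 0 \\
\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{bmatrix}
\]
```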

Code implementation

Let's look at the code below:

transformation = tf.keras.preprocessing.image.apply_affine_transform(
    image,
    theta=90,
    row_axis=0,
    col_axis=1,
    channel_axis=2
)

  • Line 3: theta is the angle, in degrees, by which we rotate the image.
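
To connect the theta parameter to the matrix form, the sketch below rotates a single point by 90 degrees using the standard rotation matrix. The helper rotate_point is hypothetical and only illustrates the arithmetic. It is not how the Keras function is implemented, and the library's rotation direction on an image may differ.

```python
import math


def rotate_point(x, y, theta_degrees):
    """Rotate (x, y) about the origin with the standard rotation matrix."""
    t = math.radians(theta_degrees)
    m = [[math.cos(t), -math.sin(t), 0.0],
         [math.sin(t), math.cos(t), 0.0],
         [0.0, 0.0, 1.0]]
    vec = (x, y, 1.0)  # homogeneous form of the point
    out = [sum(m[i][k] * vec[k] for k in range(3)) for i in range(3)]
    return out[0], out[1]


# The point (1, 0) on the x axis lands on the y axis, up to rounding error.
rx, ry = rotate_point(1.0, 0.0, 90)
print(round(rx, 6), round(ry, 6))
```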

The output of the rotation transform is as follows:

The output of the rotation transform

Copyright ©2024 Educative, Inc. All rights reserved