Coding Exercise on Masked Modeling

Coding exercise on masked image modeling.

Problem statement

So far, while implementing masked image modeling algorithms (SimMIM, MAEs, MSNs), we've used random patch-dropping as our masking strategy. In this exercise, we'll implement another masking strategy called focal masking. Focal masking selects a contiguous local block of an image, keeps it visible, and masks everything around it. The figure below illustrates the difference between random and focal masking.

Difference between random and focal masking
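
To make the contrast concrete, here is a minimal sketch of both strategies on a square grid of patches. It is only an illustration, not the exercise's FocalMaskGenerator class: the helper names focal_mask and random_mask, their parameters, and the convention that 1 marks a masked patch and 0 a visible one are all assumptions made for this example.

```python
import random


def focal_mask(grid_size, block_size, rng=None):
    """Return a grid_size x grid_size patch mask (1 = masked, 0 = visible).

    Hypothetical helper: keeps one contiguous block_size x block_size block
    visible and masks every patch around it.
    """
    if not 0 < block_size <= grid_size:
        raise ValueError("block_size must be between 1 and grid_size")
    rng = rng or random.Random()
    # Choose the top-left corner so the visible block fits inside the grid.
    top = rng.randint(0, grid_size - block_size)
    left = rng.randint(0, grid_size - block_size)
    # Start from a fully masked grid, then unmask the chosen block.
    mask = [[1] * grid_size for _ in range(grid_size)]
    for row in range(top, top + block_size):
        for col in range(left, left + block_size):
            mask[row][col] = 0
    return mask


def random_mask(grid_size, mask_ratio, rng=None):
    """Return a patch mask with a fixed fraction of patches masked at random."""
    rng = rng or random.Random()
    num_patches = grid_size * grid_size
    num_masked = round(num_patches * mask_ratio)
    # Sample distinct flat patch indices to mask.
    masked = set(rng.sample(range(num_patches), num_masked))
    return [
        [1 if row * grid_size + col in masked else 0 for col in range(grid_size)]
        for row in range(grid_size)
    ]
```

Printing each row of focal_mask(6, 3) shows a single 3 x 3 window of zeros surrounded by ones, while random_mask(6, 0.5) scatters the zeros across the grid.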

The following code snippet implements the FocalMaskGenerator class, which takes input_size, mask_patch_size, and block_size as parameters in its __init__ call and defines a variable, self.rand_size, which denotes the number of patches along one dimension, $P^{1/2}$ (where $P$ ...