Allocating a Matrix as a Contiguous Block

Learn to allocate a two-dimensional array on the heap as a contiguous block.

Introduction

Allocating matrices as an array of pointers to arrays is one way to dynamically allocate them. However, it has a few drawbacks.

Code complexity

The code is pretty straightforward after we understand what’s going on, but it contains quite a few steps. We also have to pay close attention to the memory deallocation part and how to treat allocation failure cases. These things can be tricky if we extend this approach to higher dimensional arrays.

Compare a straightforward allocation of a one-dimensional array:

int* arr = malloc(n * sizeof(int))

With the complex allocation of a two-dimensional array as an array of pointers to arrays. We already saw this code in the previous lesson:

int** allocateMatrix(int n, int m)
{
	int** matrix = malloc(n * sizeof(int*)); 
	if(matrix == NULL)
	{
		return NULL;
	}

	for(int i = 0; i < m; i++)
	{
		matrix[i] = malloc(m * sizeof(int)); 
		if(matrix[i] == NULL)
		{
			for(int j = 0; j < i; j++)
			{			 
				free(matrix[j]);
			}

			free(matrix);
			return NULL;
		}
	}

	return matrix;
}

It’s much more code to remember to get it right! It’ll get even more complicated for higher-dimension arrays.

This allocation method is also less efficient from a temporal complexity standpoint because it involves multiple for loops.

We should find another approach—one that’s easier to scale to higher dimensional arrays.

Cache locality

The following point is more complex. We’ll briefly describe the issue. Try to understand it at an intuitive level.

Recall that we discussed RAM, the memory where most of the data is stored. We also mentioned CPU registers, small memory locations placed directly in the CPU for fast memory access.

There are more types of memory, and one of them is the cache. The cache is a small size memory area, usually layered, on the CPU or very close to the CPU. The cache usually uses a faster memory technology than the one used for the RAM. We can do this because the cache is small, so it doesn’t cost a lot to produce. However, it would cost very much to make the entire RAM from the same memory technology as the cache.

The cache is usually in the order of kilobytes and megabytes, while the RAM is in the order of gigabytes.

To understand why the cache is needed, look at the flow of working with memory:

  • The CPU must request to read or write a memory block.
  • The RAM controller must search inside the RAM to find the correct address.
  • This takes a lot of time, as the memory is several orders of magnitude slower than the CPU. Memory accesses are slow, and they slow down our code, as the CPU has to wait for the memory operation and can’t work.

The cache aims to solve the problem of slow memory accesses, by keeping important data closer to the CPU. When the CPU makes a memory request, that address is first searched inside the cache.

  • If found, it’s called a cache hit. The data gets brought to the CPU, and the operation is fast.

  • If not found, it’s called a cache miss. The data is brought from the RAM to the CPU but also saved inside the cache for faster access in the future.

Get hands-on with 1300+ tech skills courses.