Pointer Arithmetic with Arrays
Expand your knowledge of pointer arithmetic.
Array memory drawing
We found out that the array’s name holds the same address as the address of the first element of the array.
Therefore, in most expressions, arr can be used as a pointer to the first element inside the array.
Assume the following array:
int arr[3];
Let’s draw the array, but this time instead of a stack view, let’s use the byte-level view.
We have an array that occupies 3 * sizeof(int) = 3 * 4 = 12 bytes (assuming a 4-byte int, which is common but not guaranteed).
Pointer arithmetic with arrays
Remember that arr is a pointer to the first element inside the array.
Let’s try to add 1 to arr. Since it’s a pointer to the first element inside the array (an int), it’s the same as adding 1 to an int *. The operation adds 1 * sizeof(int) = 4 to the address, so we skip the next 4 bytes and end up right at the start of arr[1].
Similarly, arr + 2 adds 2 * sizeof(int) = 8 to the address, skipping the next 8 bytes and landing exactly at the start of arr[2].
Therefore, arr is the address of the first element, arr + 1 is the address of the second element, and arr + 2 is the address of the third element.
Let’s confirm this in code by comparing the addresses of arr + i and &(arr[i]). Remember the following:
- arr is a pointer.
- arr[i] is an element. We must use the address-of operator (&) to obtain a pointer from it.
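    #include <stdio.h>

    int main()
    {
        int arr[3] = {1, 2, 3};

        // Cast to unsigned long so the addresses print in decimal.
        printf("arr + 0 = %lu | &arr[0] = %lu\n", (unsigned long)(arr + 0), (unsigned long)&(arr[0]));
        printf("arr + 1 = %lu | &arr[1] = %lu\n", (unsigned long)(arr + 1), (unsigned long)&(arr[1]));
        printf("arr + 2 = %lu | &arr[2] = %lu\n", (unsigned long)(arr + 2), (unsigned long)&(arr[2]));
        return 0;
    }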
A possible output is as follows:
arr + 0 = 1493920596 | &arr[0] = 1493920596
arr + 1 = 1493920600 | &arr[1] = 1493920600
arr + 2 = 1493920604 | &arr[2] = 1493920604
We see that the addresses are equal, which means that arr + i = &arr[i].
Note that &(arr[i]) = &arr[i]. Since [] has higher precedence than &, the parentheses are unnecessary.
We can also dereference arr + i to read the i-th element inside the array.
    #include <stdio.h>

    int main()
    {
        int arr[3] = {1, 2, 3};

        printf("*(arr + 0) = %d | arr[0] = %d\n", *(arr + 0), arr[0]);
        printf("*(arr + 1) = %d | arr[1] = %d\n", *(arr + 1), arr[1]);
        printf("*(arr + 2) = %d | arr[2] = %d\n", *(arr + 2), arr[2]);
        return 0;
    }
The output is:
*(arr + 0) = 1 | arr[0] = 1
*(arr + 1) = 2 | arr[1] = 2
*(arr + 2) = 3 | arr[2] = 3
We see that the values are equal, which means that *(arr + i) = arr[i].
Pointer and array notation equivalence
We found that:
arr + i = &arr[i]
*(arr + i) = arr[i]
Interestingly, this is how the compiler implements arr[i].
Note that arr[i] means “get the i-th element from the array,” which means that the compiler has to move i * sizeof(type) bytes to the right from the starting address (see the memory drawing).
We end up with *(arr + i), which reads as “read the integer starting at address arr + i.”
When we write &arr[i], it gets translated to &(*(arr + i)). Since * and & cancel each other, we get arr + i. So, &arr[i] = arr + i.
Dereference Equivalence Table
Array Notation | Pointer Notation |
arr[0] | *(arr + 0) |
arr[1] | *(arr + 1) |
... | ... |
arr[i] | *(arr + i) |
... | ... |
We can create the same table for the address-of operator:
Address-Of Equivalence Table
Array Notation | Pointer Notation |
&arr[0] | arr + 0 = arr |
&arr[1] | arr + 1 |
... | ... |
&arr[i] | arr + i |
... | ... |
Pointer arithmetic for arr and &arr
Recall that arr and &arr hold the same memory address but have different types.
Since pointer arithmetic changes the address in steps of the pointed-to type’s size, the different types yield different results.
Consider the same example array:
int arr[3];
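Since arr is a pointer to int, arr + 1 advances the address by sizeof(int) = 4 bytes.
But &arr has the type of pointer to an array of 3 integers, so &arr + 1 advances the address by sizeof(int[3]) = 4 * 3 = 12 bytes.
We write + 1 rather than ++ here because neither an array name nor the result of &arr can be modified, so arr++ and (&arr)++ do not compile.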
We can see this in the following code:
    #include <stdio.h>

    int main()
    {
        int arr[3];

        printf("arr = %lu\n", (unsigned long)arr);
        printf("&arr = %lu\n", (unsigned long)&arr);
        printf("arr + 1 = %lu\n", (unsigned long)(arr + 1));
        printf("&arr + 1 = %lu\n", (unsigned long)(&arr + 1));
        return 0;
    }
Here’s a possible output:
arr = 2902473476
&arr = 2902473476
arr + 1 = 2902473480
&arr + 1 = 2902473488
The difference between arr and arr + 1 is 4 bytes, the size of an integer.
The difference between &arr and &arr + 1 is 12 bytes, the size of an array of three integers.
Let’s take incrementing these pointers one step further and consider the following example. We create two arrays and we are interested in exploiting their stack placement with pointer arithmetic.
    #include <stdio.h>

    int main()
    {
        int arr1[3] = {1, 2, 3};
        int arr2[3] = {4, 5, 6};

        printf("&arr1 = %lu | &arr2 = %lu\n", (unsigned long)&arr1, (unsigned long)&arr2);

        // a pointer to arr2, has the type of int (*)[3]
        int (*pArr2)[3] = &arr2;
        printf("[Before increment] pArr2 = %lu\n", (unsigned long)pArr2);
        pArr2++;
        printf("[After increment] pArr2 = %lu\n", (unsigned long)pArr2);

        // a pointer to the first element inside arr2, has the type of int *
        int *pFirstElemArr2 = arr2;
        printf("[Before increment] pFirstElemArr2 = %lu\n", (unsigned long)pFirstElemArr2);
        pFirstElemArr2++;
        printf("[After increment] pFirstElemArr2 = %lu\n", (unsigned long)pFirstElemArr2);

        return 0;
    }
If we look at the output of the printf on line 8, we can conclude that the compiler placed arr2 first on the stack, with arr1 immediately after it.
Here’s a possible output:
&arr1 = 4257141892 | &arr2 = 4257141880
The arr2 array has the lower address (4257141880), and the distance between the two arrays is 12 bytes (the size of arr2). It means there are no gaps between arr2 and arr1 on the stack.
On line 11, we use &arr2 to create a pointer to the array. It has the type of pointer to an array of three integers.
We then increment the pointer and see the results.
On line 17, we use arr2 to create a pointer to the first element inside the array. It has the type of pointer to an integer.
We then increment the pointer and see the results.
Here’s a possible output:
[Before increment] pArr2 = 4257141880
[After increment] pArr2 = 4257141892
[Before increment] pFirstElemArr2 = 4257141880
[After increment] pFirstElemArr2 = 4257141884
Notice the results:
- pArr2 increased by the array’s size (12 bytes).
- pFirstElemArr2 increased by the size of one element of the array (4 bytes).
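After incrementing, pArr2 now holds the same address as arr1 (4257141892), and therefore pArr2 now points to where arr1 is stored.
Strictly speaking, C only guarantees that this one-past-the-end pointer can be computed and compared. Dereferencing it to reach arr1 is undefined behavior, so treat this as an illustration of the memory layout rather than a technique to rely on.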
The following drawing shows the pointers before and after incrementation:
Note: Now is a good time to remember that variable placement on the stack is up to the compiler. Different compilers, compiler versions, or optimization settings may produce different results. It just so happens that with this particular compiler, arr2 is placed before arr1.
Let’s practice
Let’s write code to print the values inside an array using the array and pointer notations.
First, let’s do it classically. Iterate over the array and print arr[i].
    #include <stdio.h>

    int main()
    {
        int arr[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

        for (int i = 0; i < 10; i++)
        {
            printf("arr[%d] = %d\n", i, arr[i]);
        }

        return 0;
    }
Next, let’s use pointers.
The code will be the same, except that instead of arr[i], we’ll exploit the following facts:
- arr is the base address of the array.
- arr + i gives an address that is i * 4 bytes to the right, which is exactly the start address of the i-th element.
- *(arr + i) dereferences that pointer and gives us the value of the i-th element.
    #include <stdio.h>

    int main()
    {
        int arr[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

        for (int i = 0; i < 10; i++)
        {
            printf("arr[%d] = %d\n", i, *(arr + i));
        }

        return 0;
    }
However, we can take this one step further and use pointer arithmetic all over the place. To fully understand the following code, we’ll create memory drawings and analyze the code step by step.
    #include <stdio.h>

    int main()
    {
        int arr[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

        int *startAddress = arr;
        int *endAddress = arr + 10;

        while (startAddress < endAddress)
        {
            printf("arr[%d] = %d\n", (int)(startAddress - arr), *startAddress);
            startAddress++;
        }

        return 0;
    }
Start and end addresses
Look at lines 7 and 8. We want to iterate over the entire memory block of the array. The start address is the address of the first element (the beginning of the block) on line 7.
The end address is the first address after the last element.
Elements 0, 1, …, 9 are valid indexes, so the next address after that is arr + 10 on line 8.
Loop condition
We can compare startAddress and endAddress since the pointers have the same type (pointer to int) and both refer to the same array (endAddress is one past its last element). While startAddress < endAddress, we are still inside the memory block of the array, and there are still elements to iterate over.
Once startAddress >= endAddress, we are outside the array and need to stop.
These two addresses define the starting and ending points of the array memory block.
Calculating the current position (former i)
To get the index of the current element (0, 1, 2, …, 9) we use (startAddress - arr).
Let’s draw this to understand:
- In the above image, startAddress is at arr + 8, which means that its address is 8 * 4 = 32 bytes past the address of arr.
- When we add to or increment a pointer, the value is multiplied by the size of the type. Similarly, when we subtract two pointers, the distance in bytes between the two addresses is divided by the size of the type.
- Therefore, startAddress - arr = (arr + 8 * 4 - arr) / 4 = 32 / 4 = 8 (with the addresses counted in bytes).
- 8 is the index at which startAddress is pointing (it is arr[8]).
Reading the value
To read the value of the current array element, we use *startAddress. We dereference startAddress, which points to the beginning of the element, and this gives us the value.
Moving to the next element
To go to the next array element, we use startAddress++. It increments startAddress by 1 * 4 = 4 bytes at each iteration, moving to the next element until we reach endAddress and stop.
Let’s go through the animation below that illustrates this concept:
This concludes our deep dive! At this point, we should be comfortable with the equivalence between pointer and array notations and be able to use pointer arithmetic to work with arrays.
However, if it’s still not clear, don’t worry. We’ll solve a number of problems in this chapter and there will be coding challenges too!