Pointers
Get an introduction to pointers and memory drawings. Learn about pointer usage, operators, and reading pointer declarations.
Introduction
While working with variables, we choose different types for them, depending on what kind of data we want to store.
For example:
- To store 5, we typically use an int variable.
- To store 3.5, we use a double variable.
- To store 'a', we use a char variable.
#include <stdio.h>

int main()
{
    int i = 5;
    double f = 3.5;
    char c = 'a';

    printf("i = %d, f = %f, c = %c\n", i, f, c);

    return 0;
}
Well, we can view pointers as a different data type. They don’t store a number or a character. Pointers hold a reference to another variable or the address of another variable.
The address of a variable is the point in memory where the operating system decided to place it or the address of the box where we put that variable.
To create a pointer, we can use type* ptr. In other words, add an asterisk * to the regular variable declaration.
To create a variable of type pointer to an integer (a pointer variable that can point to an integer), we can use the following syntax:
int* intPtr; //for int pointer
double* doublePtr; //for double pointer
char* charPtr; //for char pointer
Pointers are like regular variables and are stored inside a box in memory. The only difference is that they don’t hold a value in the traditional sense. Instead, they store a reference to another variable, that is, the address of that variable. Check the following image:
This sort of indirection may feel weird at first, but imagine we want to go to our new friend’s house:
- It’s our first visit so we’re not sure how to get there.
- We start asking people on the street, and they point us to our friend’s address.
- At our friend’s house address, we find our friend.
- Pointers work in the same way.
- They point (like the people on the street) to the memory location (house in our example), or address of a variable.
- We know that by following the pointer and reaching that address, we’ll be able to interact with the variable.
Why pointers?
One could ask: “Well, I already know the variable because I created it. Can I just use it directly and not deal with pointers?”. Good question!
Pointers solve a few issues, such as:
- Sharing data between multiple functions. We can do it by copying the data several times, but this can become inefficient. Pointers allow us to pass a reference to our data. No copy to perform!
- Creating data structures like lists, trees, matrices, graphs, etc. They have a linked internal structure, which pointers can easily represent.
- Writing shorter and modular code that is more resistant to change.
- Allowing us to implement genericity in C.
These points are more complex than they would seem at first glance, and we’ll explore them in depth during this course.
Address-of operator
Let’s look at how we can create a pointer and how to point it to another variable. We need to introduce the following operators:
- * (dereference operator)
- & (address-of operator)
The & operator is called the address-of operator. It is used to get the address of a variable, that is, the memory location of the box where that variable is stored.
The syntax is &variable.
Let’s use this operator to retrieve the addresses of two variables, i and j.
#include <stdio.h>

int main()
{
    int i = 15;
    int j = 22;

    //%p expects a void pointer, so we cast the addresses before printing
    printf("Address of i is %p\n", (void *)&i);
    printf("Address of j is %p\n", (void *)&j);

    return 0;
}
Wow! The output is scary. What even is that!?
The boxes or bytes inside the memory are indexed or numbered. Their address is a number, starting from 0 and going up depending on how much memory the system has. The operating system reserves some memory areas. While writing code, it’s unlikely we’ll get a small address like 0, 5, or 1000. But regardless, they are still numbers.
Check the memory snapshot below. It shows a few bytes and their addresses/indexes.
Now, look at the output for the address of variable i. Assume it is 0x7ffdee2428ec (it can be different each time we run the code). While scary, it’s just a number written in base 16, also called the hexadecimal system.
We learned that computers store data in base 2. However, the convention for displaying or working with memory addresses is to do it in base 16.
The %p specifier we used inside printf prints the address, typically in hex. We can convert the number to base 10 and obtain 0x7ffdee2428ec = 140728598800620.
It’s not mandatory to use hex, but this is the usual practice.
In base 16, we use the digits 0–9 and the letters A–F (see the next table). In other words, each digit represents a value less than 16, which is the base of the hexadecimal number system.
Hexadecimal system
HEX | Decimal |
A | 10 |
B | 11 |
C | 12 |
D | 13 |
E | 14 |
F | 15 |
We don’t need to know how to convert to or from base 16. What matters is that an address is a number that uniquely identifies a box, that is, the starting memory location of a variable.
Now that we know how to get the address of a variable, we can write a pointer to it.
#include <stdio.h>

int main()
{
    int i = 15;
    int j = 22;

    printf("i = %d\n", i);
    printf("j = %d\n", j);

    //%p expects a void pointer, so we cast before printing
    printf("Address of i is %p\n", (void *)&i);
    printf("Address of j is %p\n", (void *)&j);

    int *intPtr1 = &i; //pointer to i
    int *intPtr2 = &j; //pointer to j

    printf("intPtr1 = %p\n", (void *)intPtr1);
    printf("intPtr2 = %p\n", (void *)intPtr2);

    return 0;
}
Printing i and j gives us the values we put inside the variables.
Printing &i and &j gives us the memory addresses of the i and j variables.
Printing intPtr1 and intPtr2 gives us the addresses of the variables that the pointers point to. Notice that the address of i is the same as the value of intPtr1 because the pointer stores the address of the variable i.
Dereference
Now that we have a pointer to a variable, we may want to read or write that variable using it. To do that, we need to use the dereference operator * with the following syntax:
*pointer = something; //to write
something = *pointer; //to read
Let’s see an example:
#include <stdio.h>

int main()
{
    int i = 15;
    printf("i = %d\n", i);

    int *intPtr = &i;

    printf("i = %d\n", *intPtr); //reading from i through the pointer
    *intPtr = 100; //writing to i through the pointer
    printf("i = %d\n", i); //check if we really changed i? Yes, we did!

    return 0;
}
In the second printf call, we print the value of i using the pointer by dereferencing it.
In the statement *intPtr = 100; we change the value stored inside i using the pointer by dereferencing it.
Check the following memory drawing, where intPtr points to the i variable.
Other operations with pointers
We can reassign pointers and make them point to something else (if they are not constant). A constant pointer is declared with the const keyword, which has the same meaning as it does for regular variables.
int i = 5;
int j = 6;
int *intPtr = &i; //intPtr points to i
//later
intPtr = &j; //intPtr now points to j
Pointers support the assignment operation. Assigning a pointer variable to another makes both pointers point to the same variable. The variables are not affected in any way.
Consider the following code:
#include <stdio.h>

int main()
{
    int x = 3, y = 5;
    int *ptr1 = &x;
    int *ptr2 = &y;

    //ptr1 points to x, ptr2 to y
    //%p expects a void pointer, so we cast before printing
    printf("Before assignment: %p %p\n", (void *)ptr1, (void *)ptr2);

    //now ptr2 points to the same variable as ptr1, which is x
    ptr2 = ptr1;
    printf("After assignment: %p %p\n", (void *)ptr1, (void *)ptr2);

    return 0;
}
Notice that after the assignment both pointers hold the same memory address and refer to x (declared at the top of main). We reassigned ptr2 to point to the same thing as ptr1.
The code is equivalent to the following:
ptr2 = &x;
The NULL pointer
What happens when we create a pointer without initializing it?
int* intPtr;
Where does intPtr point?
It isn’t initialized at all. It holds garbage (whatever random value was previously in the memory location of intPtr) and does not point anywhere valid.
Attempting to work with such a pointer leads to undefined behavior (it may work one time, and crash another time, depending on what was in the memory) and hard-to-track bugs!
To test whether we set the pointer to something, we can use a constant value called NULL (usually defined as 0). Here, NULL means the pointer points nowhere. The advantage is that we can check against NULL and avoid dereferencing an invalid pointer.
Let’s see an example. In the code below, we create three pointers: intPtr, anotherPtr, and lastPtr.
Then, we explore the consequences of leaving them uninitialized, setting them to NULL, and setting them to a valid memory address.
We’ll see that intPtr is very dangerous, and we have no way of knowing it is uninitialized.
The anotherPtr pointer is invalid, but it is NULL. We can guard against dereferencing it with an if check.
The lastPtr pointer contains a valid memory address and everything works fine.
Please follow the comments inside the code to find out the differences:
#include <stdio.h>

int main()
{
    int *intPtr;
    //Do not use intPtr, the pointer is not valid.
    //Trying to dereference(*) it results in undefined behavior and could crash the program.
    //Later in the code, we have no way of knowing if intPtr is set or not.

    int *anotherPtr = NULL;
    //anotherPtr is set to NULL, meaning it does not point anywhere
    //we can check that with an if and avoid dereferencing it
    if (anotherPtr != NULL)
    {
        printf("anotherPtr is valid, we can dereference it!\n");
    }
    else
    {
        //this gets executed
        printf("anotherPtr is invalid, assign it to a variable before dereferencing it!\n");
    }

    int x = 5;
    int *lastPtr = &x;
    if (lastPtr != NULL)
    {
        //this gets executed
        printf("lastPtr is valid, we can dereference it!\n");
        *lastPtr = 12;
    }
    else
    {
        printf("lastPtr is invalid, assign it to a variable before dereferencing it!\n");
    }

    return 0;
}
Note: Always set pointers to NULL when not assigning a valid address immediately.
The intPtr pointer is a wild pointer: it doesn’t contain a valid address, and it’s not NULL either.
Reading pointer declarations
A basic pointer declaration is pretty straightforward, but what happens when we introduce more keywords?
char* ptr1;
const int* ptr2;
int* const ptr3;
const int* const ptr4;
What are ptr1, ptr2, ptr3, and ptr4? Is the pointer constant? Or the variables? Or both? Is ptr3 a const pointer, or does it point to a const variable?
A nice trick is to read pointer declarations from right to left.
- ptr1 is a pointer (*) to char.
- ptr2 is a pointer (*) to int const. The variable that ptr2 points to is constant and cannot be changed through ptr2. We can change ptr2 itself and point it somewhere else.
- ptr3 is a constant pointer (const *) to an int. The ptr3 variable itself is constant and can be assigned only once. The variable that ptr3 points to isn’t constant and can be changed.
- ptr4 is a constant pointer (const *) to a constant (const int). The ptr4 variable itself is constant and can’t be changed to point to a different variable. The variable that ptr4 points to is also constant and can’t be changed.
Check the animation below:
The code below goes through the examples and contains both correct and incorrect assignments.
#include <stdio.h>

int main()
{
    int x = 5;
    const int y = 6;

    int *ptr1 = &x;
    const int *ptr2 = &y;
    int *const ptr3 = &x;
    const int *const ptr4 = &y;

    //ptr1 is a pointer(*) to int
    //it can point to x, but should not point to y because y is not int, it is const int
    //(pointing ptr1 at y would discard the const qualifier)
    //ptr1 can point to a different variable
    //the variable ptr1 points to can be changed

    //ptr2 is a pointer(*) to const int
    //it can point to y; it may also point to x, but then x can not be changed through ptr2
    //the line below will result in a compilation error, because ptr2 points to a
    //constant variable and can not be changed
    //*ptr2 = 7;
    //however, ptr2 itself can be made to point to another variable
    //the lines below will compile fine
    //const int p = 44;
    //ptr2 = &p;

    //ptr3 is a constant pointer(const *) to int
    //it can not point to another variable, because it is constant
    //the lines below will result in a compilation error (we try to make ptr3 point to z)
    //int z = 11;
    //ptr3 = &z;
    //however, we can change x through ptr3. the line below will set x to 122
    //*ptr3 = 122;

    //ptr4 is a constant pointer(const *) to a constant int(const int)
    //ptr4 can not be made to point to another variable because it is constant
    //the variable that ptr4 points to cannot be changed, as it is constant
    //the lines below result in a compilation error
    //const int w = 20;
    //ptr4 = &w; //make ptr4 point to something else
    //*ptr4 = 999; //change the variable ptr4 points to

    return 0;
}
Feel free to play with the code, comment, and uncomment lines and see what happens! Make sure to read the declarations from right to left to understand them.