
Decoding C Declarations

Aug 11, 2023

The C programming language is notorious for its type declarations. The language was designed more than 50 years ago, and its designers apparently did not prioritize making declarations easy to read. Consider the following declaration.

int *p[4];

How should we read it? Is the above statement declaring p to be an array of four elements with each element pointing to an integer, or is it a pointer to an array of four elements each of which is an integer?

Two possible graphical representations of int *p[4];

The above example is simple. We know that p is an array of four elements, each of which is a pointer to an integer. Therefore, in the figure above, the graphical representation on the left is correct. Once we learn to decode C declarations, we will write the declaration for the graphical representation on the right.

The following declarations are more complicated.

char *(*(*a)())[10];
int *(* const *b[8]) (void);
char * const * (*c)(void);
char *(*(*p[4])(char *))[];
void (*s(int, void (*)(int)))(int);
void *(*f(int))(int);
struct IMAGE *(*(*(*fp)[5]))(const char *, int);
char ** const * volatile x;
char *(*(**f[][4])())[];

In this blog, we will learn how to read C declarations and apply that knowledge to convert the above declarations into simple English. We will first define some terminology and then outline the rules which will enable us to convert any declaration into a simple English sentence.

Declarator

A declarator is a simple identifier (also called a variable name), an array identifier (also called an array variable name), a function name, or a pointer to any of the above, optionally followed by an equal sign and an initial value or values. For example, first = 8, second[4] = {1, 1, 2, 3}, third(), *fourth, *fifth[4], and *sixth() are all valid declarators in the following declarations.

int first = 8;
int second[4] = {1, 1, 2, 3};
int third();
int *fourth;
int *fifth[4];
int *sixth();

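There may be any number of pointers, such as ***seventh, and any number of array dimensions, such as eighth[4][5][6], but only one pair of function parentheses may directly follow a name. The declarator ninth()() is invalid because a function cannot return a function. The declarators (*p)()[] and (*p)[]() are also invalid: a function cannot return an array, and an array cannot have functions as its elements.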

An identifier, an identifier with array square brackets, or an identifier with function parentheses is also called a direct declarator. In the above examples, first, second[4], third(), fourth, fifth[4], and sixth() are direct declarators.
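The limits on declarators can be checked with a compiler. The following is a minimal sketch: the names seventh, eighth, and ninth come from the text, while tenth and the helper plane_elems are hypothetical names added for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

int ***seventh;          /* any number of pointer levels is allowed */
int eighth[4][5][6];     /* any number of array dimensions is allowed */
int (*tenth(void))[4];   /* a function may return a POINTER to an array */

/* A conforming compiler rejects each of the following:
 *   int ninth()();      a function cannot return a function
 *   int (*p)()[];       a function cannot return an array
 *   int (*p)[]();       an array cannot hold functions
 */

/* Compile-time check: eighth really holds 4 * 5 * 6 ints. */
static_assert(sizeof eighth == 4 * 5 * 6 * sizeof(int), "three dimensions");

/* Hypothetical helper: number of ints in one eighth[i] plane (5 * 6). */
size_t plane_elems(void)
{
    return sizeof eighth[0] / sizeof(int);
}
```

Uncommenting any of the three rejected declarations should turn the sketch into a compile error, which is a quick way to confirm the rule on a given compiler.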

Type Specifier

Type specifiers include void, char, short, int, long, float, double, signed, unsigned, enum, struct, and union, as well as names introduced by typedef. The keywords enum, struct, and union are usually followed by what is called a tag. The keywords struct and union declare compound types built from other types.

Storage Class

The storage class of a variable tells the compiler how to allocate memory for that variable. There are five classic storage-class specifiers: auto, extern, register, static, and typedef (C11 adds _Thread_local). The typedef specifier is grouped with the others only for syntactic convenience. It tells the compiler nothing about memory allocation and only defines a new name for a data type.
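As a rough illustration of the storage classes, the sketch below uses each specifier once. All names in it (counter_t, shared_total, next_id, sum_to) are hypothetical and chosen only for this example.

```c
#include <assert.h>   /* static_assert (C11) */

typedef unsigned long counter_t;   /* typedef: a new type name, no storage */
static_assert(sizeof(counter_t) == sizeof(unsigned long),
              "typedef only renames the type");

extern int shared_total;   /* extern: a declaration; the definition may live elsewhere */
int shared_total = 0;      /* in this sketch the definition is in the same file */

/* static local: a single object that keeps its value across calls */
counter_t next_id(void)
{
    static counter_t id = 0;
    return ++id;
}

/* register is only a hint, and the address of i may not be taken;
 * total has automatic storage, which is the default for locals */
int sum_to(int n)
{
    register int i;
    int total = 0;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Successive calls to next_id() return 1, 2, 3, and so on, because the static local is initialized once rather than on every call.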

Type Qualifier

As of this writing, there are four type qualifiers: const, restrict, volatile, and _Atomic. The type qualifiers restrict and _Atomic were introduced in the C99 and C11 standards, respectively. _Atomic is not only a type qualifier; it also acts as a type specifier when it is followed by a parenthesized type name. For example, _Atomic(int) is a type specifier and not a type qualifier. We will discuss _Atomic in detail in another blog.

Type Qualifier Rule

If one or more type qualifiers appear next to a type specifier (int, char, float, double, etc.), they apply to that type specifier. Otherwise, they apply to the pointer asterisk to their immediate left. The type qualifier restrict only applies to pointers.

We will apply this rule several times in decoding C declarations so that it becomes clear.

Explanation of Type Qualifier Rule

Consider the following declarations.

int const *p;
const int *q;

The const keyword is next to a type specifier ( int ) in both declarations, therefore it applies to the type and not to the pointer asterisk. In the following declaration, the const keyword is not next to the type specifier, hence it applies to the pointer asterisk to its immediate left.

char * const r;
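The effect of the rule can be seen by checking which assignments a compiler accepts. The sketch below reuses q and r from the declarations above; the function qualifier_demo and the local variables a, b, and text are hypothetical names added for illustration.

```c
#include <assert.h>   /* static_assert (C11) */

/* "int const *" and "const int *" name the same type. */
static_assert(_Generic((const int *)0, int const *: 1, default: 0),
              "const may appear on either side of int");

int qualifier_demo(void)
{
    int a = 1, b = 2;
    char text[] = "abc";

    const int *q = &a;       /* const applies to int: *q is read-only */
    q = &b;                  /* OK: the pointer itself may change */
    /* *q = 5; */            /* rejected: assignment to a read-only location */

    char * const r = text;   /* const applies to the asterisk: r is read-only */
    *r = 'X';                /* OK: the pointed-to char may change */
    /* r = text + 1; */      /* rejected: assignment to a read-only variable */

    return *q + (text[0] == 'X');   /* 2 + 1 */
}
```

Removing the comment markers around either rejected assignment should produce a compile-time diagnostic, which confirms where each const applies.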

Rules for Understanding C Declarations

We locate the first identifier reading from the left (the name being declared, not a keyword, type name, or tag) and then follow the precedence rules.

Precedence Rules

Rule 1. Read the postfix operators (square brackets indicating an array and parentheses indicating a function) from left to right, till the semicolon or the closing unmatched parenthesis is reached.
Rule 2. Read the prefix asterisk operators indicating a pointer, till the beginning of the declaration or the opening parenthesis, corresponding to the closing parenthesis of Rule 1, is reached.
Rule 3. If one or more type qualifiers appear next to a type specifier (int, char, float, double, etc.), they apply to that type specifier. Otherwise, they apply to the pointer asterisk to their immediate left. The type qualifier restrict only applies to pointers.

Application of Rules

Let's apply the above rules to understand the very first declaration we talked about, i.e. int *p[4];.

In the following illustrations, the red arrow indicates starting position and the green arrow indicates the ending position. The rule under consideration is applicable to the text between the two arrows. The purple color is used to indicate the text that has already been processed.

The first identifier in the above declaration is p.

  • We apply Rule 1 and read the postfix operator (in this case square brackets indicating an array) till we reach the semicolon, " p is an array of 4 . . .".

Application of Rule 1.
  • Since a semicolon marks the end of a declaration, we stop the application of Rule 1 and apply Rule 2. We read the prefix operator (asterisk indicating a pointer), preceded by the type specifier int , till we reach the beginning of the declaration, " p is an array of 4 pointers to integers."

Application of Rule 2.

Before moving on to other complex declarations, let's see which C declaration would correspond to the graphical representation shown below.

A pointer pointing to a block of four integer values.

It shows p to be a pointer to an array of 4 integers. Since the pointer asterisk has lower precedence than array brackets and function parentheses, we have to enclose *p within parentheses to elevate its precedence as shown below.

int (*p)[4];

Let's start with p and read the declaration.

  • We attempt to apply Rule 1 but do not find any postfix operators. We find an unmatched closing parenthesis.

Rule 1 is not applicable because of closing parenthesis.
  • We apply Rule 2 to the prefix asterisk operator (pointer) till we reach the opening parenthesis and read, " p is a pointer to . . .".

Application of Rule 2.
  • We have read whatever we found inside the parentheses and apply Rule 1 to the part of the declaration outside the parentheses. We find a postfix operator (in this case the square brackets indicating an array), followed by the semicolon indicating the end of the declaration, and read, " p is a pointer to an array of 4 . . .".

Application of Rule 1.
  • We have reached the end of the declaration but still have a part of the declaration to read. We apply Rule 2 but do not see any prefix operators. Instead, we find the type specifier int before reaching the beginning of the declaration line, and read, " p is a pointer to an array of 4 integers."

Rule 2 is not applicable because of type specifier.
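The difference between the two readings can be confirmed with sizeof. The following is a minimal sketch: arr_of_ptr, ptr_to_arr, and the function demo_pointer_to_array are hypothetical names used here in place of p.

```c
#include <assert.h>   /* static_assert (C11) */

int *arr_of_ptr[4];     /* array of 4 pointers to int */
int (*ptr_to_arr)[4];   /* pointer to an array of 4 ints */

/* The first object is four pointers; the second is one pointer
 * whose target is a block of four ints. */
static_assert(sizeof arr_of_ptr == 4 * sizeof(int *), "four pointers");
static_assert(sizeof *ptr_to_arr == 4 * sizeof(int), "target is four ints");

int demo_pointer_to_array(void)
{
    int row[4] = {10, 20, 30, 40};
    int *ap[4] = {&row[0], &row[1], &row[2], &row[3]};   /* int *p[4] */
    int (*pa)[4] = &row;                                 /* int (*p)[4] */

    /* ap[3] is a pointer to dereference; pa must be dereferenced first,
     * then indexed. */
    return *ap[3] + (*pa)[2];   /* 40 + 30 */
}
```

Note the mirrored syntax at the point of use: *ap[3] indexes first and then dereferences, while (*pa)[2] dereferences first and then indexes, exactly as the declarations read.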

More Examples

  1. Let us apply what we have learned to convert the following declaration into simple English.
    char *(*(*a)())[10];

  • The first identifier in the declaration is a. This is where we start.

Starting point is the first identifier found while scanning from left.
  • We find an unmatched closing parenthesis to the right of a. We cannot apply Rule 1 as there are no postfix operators.

Rule 1 is not applicable because of the closing parenthesis after the identifier.
  • We apply Rule 2 to the part up to the opening parenthesis, which includes a prefix asterisk operator indicating a pointer, and read, "a is a pointer to . . .".

First application of Rule 2.
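  • We have taken care of the innermost parentheses. We apply Rule 1 to the part of the declaration up to the next unmatched closing parenthesis. The postfix operator we find is a pair of parentheses indicating a function. We read, "a is a pointer to a function that has no parameters . . .". (Strictly speaking, before C23 an empty pair of parentheses leaves the parameters unspecified; writing (void) states "no parameters" explicitly.)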

First application of Rule 1.
  • We apply Rule 2 to the part of the declaration up to the opening parenthesis as shown, and read, "a is a pointer to a function that has no parameters and returns a pointer to . . .".

Second application of Rule 2.
  • We have read the part of the declaration in the outermost pair of parentheses. We apply Rule 1 to the remaining part of the declaration. We find a postfix operator (square brackets in this case) indicating an array, followed by the semicolon. We read, "a is a pointer to a function that has no parameters and returns a pointer to an array of 10 . . .".

Second application of Rule 1.
  • We reached the end of the declaration while applying Rule 1. We now apply Rule 2 to the remaining part of the declaration. We find a prefix asterisk operator indicating a pointer, preceded by the type specifier char. This takes us to the beginning of the declaration. We read, "a is a pointer to a function that has no parameters and returns a pointer to an array of 10 pointers to characters."

Third application of Rule 2.

This is a complicated function pointer declaration. The following program shows how this declaration could be used in a C program.

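#include <stdio.h>
#include <stdlib.h>

/* myfunc: function returning a pointer to an array of 10 pointers to char */
char *(*(myfunc)())[10]
{
    char *(*p)[10];
    p = malloc(sizeof *p);   /* room for one array of 10 char pointers */
    /* process the allocated block (80 bytes with 8-byte pointers) as required */
    return p;
}

int main(void)
{
    char *(*q)[10];
    char *(*(*a)())[10] = myfunc;

    printf("Size of pointer on this machine: %zu bytes\n", sizeof(char *));
    q = a();
    if (q == NULL)
        return 1;
    /* %p expects void *; q + 1 skips one whole array of 10 pointers */
    printf("q: %p\t q+1: %p\n", (void *)q, (void *)(q + 1));
    free(q);
    return 0;
}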

On typical 64-bit machines, all object pointers (char *, char **, char ***, and so on) are 8 bytes long. Compiling and executing the above program produces output like that shown below.

Size of pointer on this machine: 8 bytes
q: 0x600001e90d20 q+1: 0x600001e90d70

We observe that even though q is an 8-byte pointer, advancing it by 1 changes the address by 0x50 or 80 bytes. This confirms that q is indeed a pointer to an array of 10 pointers to characters, exactly as we found by decoding it.

  2. Let us decode one more complex C declaration, which will require applying Rule 3. Here is the declaration we want to convert to simple English.
    char ** const * volatile x;

  • We find the first identifier in the declaration, which is x.

First identifier from left.
  • There is a semicolon to the immediate right of x, hence we cannot apply Rule 1 to this declaration. To the immediate left of x is the type qualifier volatile, which means we have to apply Rule 3. In this case, according to Rule 3, the type qualifier volatile applies to the asterisk (pointer) to its immediate left.

First application of Rule 3.
  • Since the type qualifier applies to the asterisk to its immediate left, we stop here temporarily and read till this point. " x is a volatile pointer to . . .".

  • We find const to the left of the volatile pointer. According to Rule 3, it applies to the asterisk (pointer) to its immediate left. We read, " x is a volatile pointer to a constant pointer . . .".

Second application of Rule 3.
  • We apply Rule 2 to the remaining part of the declaration. We have a prefix asterisk operator (pointer) preceded by the type specifier char. We read, " x is a volatile pointer to a constant pointer to a pointer to a character."

Application of Rule 2.
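To see what this reading means in practice, the sketch below builds the three levels of indirection and tries to modify each one. Only the declaration of x comes from the text; the function qualified_levels_demo and the variables word, inner, middle, and other are hypothetical.

```c
#include <assert.h>

int qualified_levels_demo(void)
{
    char word[] = "hi";
    char *inner = word;                     /* pointer to char */
    char ** const middle = &inner;          /* const pointer to pointer to char */
    char ** const other = &inner;           /* a second const pointer */
    char ** const * volatile x = &middle;   /* the declaration from the text */

    x = &other;          /* OK: x is volatile, not const */
    /* *x = &inner; */   /* rejected: *x is a const pointer */
    **x = word + 1;      /* OK: the innermost pointer is unqualified */
    ***x = 'I';          /* OK: the character is unqualified */

    assert(inner == word + 1);   /* the write through **x changed inner */
    return word[1] == 'I';
}
```

Only the middle level refuses assignment, which matches the decoded sentence: the const belongs to the second pointer, and volatile belongs to x itself.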
  3. A variable may be initialized in the declaration. Let us consider such a declaration.
    int ( * cmp ) ( const void *, const void * ) = ascending ;

  • The identifier cmp is our starting point.

First identifier from left is `cmp`.
  • We try to apply Rule 1, looking for postfix operators. We find an unmatched closing parenthesis to the right of cmp. We read, " cmp is . . .".

Unable to apply Rule 1 because of the closing parenthesis.
  • We apply Rule 2, looking for prefix operators, till we reach an opening parenthesis or the beginning of the declaration. We find the asterisk operator to the immediate left of cmp, preceded by the opening parenthesis. We read, " cmp is a pointer to . . .".

Application of Rule 2.
  • We apply Rule 1, looking for postfix operators. We find an opening parenthesis to the right of (*cmp) indicating a function. We continue till we reach the corresponding closing parenthesis, and read, " cmp is a pointer to a function (which has two parameters, both of which are pointers to constant void)".

Application of Rule 1.
  • We still haven't reached a semicolon or an unmatched parenthesis, so we continue applying Rule 1. We find an equal sign ( = ) indicating an initializer. Let's handle it at the end.

  • We apply Rule 2, looking for prefix operators. We find none; instead, we find the type specifier int to the immediate left of (*cmp). We read, " cmp is a pointer to a function (which has two parameters, both of which are pointers to constant void) that returns an integer."

Unable to apply Rule 2 because of type specifier `int`.
  • The initialization part of the declaration stores the address of the function ascending (which must have the type given in the declaration) in the pointer cmp.
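A declaration of this shape is what the standard library function qsort expects for its comparison argument. The sketch below shows one possible use. The article names ascending but does not define it, so the definition here is an assumption, and smallest_after_sort is a hypothetical helper.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdlib.h>   /* qsort */

/* Assumed comparison function: orders ints from small to large. */
int ascending(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* negative, zero, or positive */
}

/* ascending has exactly the function type that cmp points to. */
static_assert(_Generic(&ascending,
                       int (*)(const void *, const void *): 1,
                       default: 0),
              "ascending matches the type of cmp");

/* Hypothetical helper: sorts a small array through cmp and
 * returns the smallest element. */
int smallest_after_sort(void)
{
    int ( * cmp ) ( const void *, const void * ) = ascending ;
    int values[] = {42, 7, 19, 3, 25};

    qsort(values, sizeof values / sizeof values[0], sizeof values[0], cmp);
    return values[0];
}
```

Swapping in a different comparison function with the same signature changes the sort order without touching the call to qsort, which is the usual reason for holding the function in a pointer like cmp.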

Finally, let's look at the most complicated declaration in the list given at the beginning.

struct IMAGE *(*(*(*fp)[5]))(const char *, int);

On applying the rules, we obtain the following simple English representation:

" fp is a pointer to an array of 5 pointers to pointers to functions (whose first parameter is a pointer to a constant character and whose second parameter is an integer) that return a pointer to struct IMAGE."

The figure below shows the sequence in which this complex declaration is handled, by numbering its various parts.

Traversal sequence to decode the declaration.
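One way to double-check a reading like this is to rebuild the type step by step with typedef and let the compiler compare the two. The sketch below does that. The typedef names, the one-field layout of struct IMAGE, and the functions load and call_through_fp are all hypothetical; the article gives only the declaration of fp.

```c
#include <assert.h>   /* static_assert (C11) */

struct IMAGE { int width; };   /* hypothetical layout, for illustration only */

/* Build the type from the inside of the English sentence outwards. */
typedef struct IMAGE *loader_fn(const char *, int);   /* function returning pointer to struct IMAGE */
typedef loader_fn **loader_pp;                        /* pointer to pointer to such a function */
typedef loader_pp loader_table[5];                    /* array of 5 of those */

static struct IMAGE blank;

/* Hypothetical loader: records the requested width and returns the image. */
static struct IMAGE *load(const char *path, int width)
{
    (void)path;
    blank.width = width;
    return &blank;
}

int call_through_fp(void)
{
    loader_fn *f = load;
    loader_table slots = {&f, &f, &f, &f, &f};
    struct IMAGE *(*(*(*fp)[5]))(const char *, int) = &slots;

    /* fp has the same type as "pointer to loader_table". */
    static_assert(_Generic(fp, loader_table *: 1, default: 0), "same type");

    /* Undo each level in the order the declaration was read. */
    return (**(*fp)[2])("x.png", 7)->width;
}
```

The typedef chain reads in the same order as the decoded sentence, which is often the more maintainable way to write such a declaration in real code.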

Summary

Every C declaration begins with a storage-class specifier, a type specifier such as char, int, or double, or a type qualifier such as const or volatile. The type qualifier restrict cannot begin a declaration, as it applies to pointers only. A type specifier can be one keyword, such as int, or several keywords, such as unsigned long int or long double. A type specifier may also be introduced by the struct, union, or enum keyword.

We start with the first identifier from left, applying Rule 1 (postfix operators) till we encounter an unmatched closing parenthesis or a semicolon indicating the end of the declaration. Then we apply Rule 2 (prefix operators) till we encounter an opening parenthesis or reach the beginning of the declaration.

We alternate between Rule 1 and Rule 2 (moving right, then left, then right again, starting with the first identifier from the left) till the entire declaration has been read. We apply Rule 3 when we encounter any type qualifier along the way.

With this knowledge, we can decode any valid complex C declaration into simple English.
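Rules 1 and 2 are mechanical enough to automate. The following is a simplified sketch of a decoder, loosely modeled on the dcl program in Kernighan and Ritchie's The C Programming Language. It is not a full parser: it assumes the type specifier is a single word, ignores type qualifiers (Rule 3), and skips over parameter lists. All function names in it are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

static const char *src;   /* cursor into the declaration being decoded */
static char out[256];     /* English description built so far */

static void skip_spaces(void) { while (*src == ' ') src++; }

static void say(const char *s) { strncat(out, s, sizeof out - strlen(out) - 1); }

/* Copy one identifier (letters, digits, underscores) from the cursor. */
static void take_word(char *buf, size_t size)
{
    size_t n = 0;
    skip_spaces();
    while ((isalnum((unsigned char)*src) || *src == '_') && n + 1 < size)
        buf[n++] = *src++;
    buf[n] = '\0';
}

static void declarator(void);

/* Rule 1: start at the name (or a parenthesized declarator), then read
 * the postfix [] and () operators from left to right. */
static void direct_declarator(void)
{
    char word[64];
    skip_spaces();
    if (*src == '(') {              /* grouping parentheses */
        src++;
        declarator();
        skip_spaces();
        if (*src == ')') src++;
    } else {                        /* the identifier: our starting point */
        take_word(word, sizeof word);
        say(word);
        say(" is");
    }
    for (skip_spaces(); *src == '[' || *src == '('; skip_spaces()) {
        if (*src == '[') {          /* copy "[...]" verbatim */
            size_t n = 0;
            while (*src != '\0' && n + 2 < sizeof word) {
                word[n++] = *src;
                if (*src++ == ']') break;
            }
            word[n] = '\0';
            say(" array"); say(word); say(" of");
        } else {                    /* skip the parameter list, nesting included */
            int depth = 0;
            do {
                if (*src == '(') depth++;
                else if (*src == ')') depth--;
                src++;
            } while (depth > 0 && *src != '\0');
            say(" function returning");
        }
    }
}

/* Rule 2: prefix asterisks are reported only after the postfix operators. */
static void declarator(void)
{
    int stars = 0;
    for (skip_spaces(); *src == '*'; skip_spaces()) { src++; stars++; }
    direct_declarator();
    while (stars-- > 0) say(" pointer to");
}

/* Decode a declaration whose type specifier is a single word. */
const char *decode(const char *decl)
{
    char base[64];
    src = decl;
    out[0] = '\0';
    take_word(base, sizeof base);
    declarator();
    say(" ");
    say(base);
    return out;
}
```

For instance, decode("int (*p)[4];") yields "p is pointer to array[4] of int". The recursion mirrors the alternation described above: direct_declarator handles everything to the right of the name, and declarator emits the pointers to its left only once the right side is exhausted.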

Exercises

Please decode the following declarations for more practice. Answers are provided to verify your work.

int *(* const *b[8]) (void);
char * const * (*c)(void);
char *(*(*p[4])(char *))[];
void (*s(int, void (*)(int)))(int);
void *(*f(int))(int);
char ** const * volatile x;
char *(*(**f[][4])())[];

Answers

  1. b is an array of 8 pointers to const pointers to functions that take no parameters and return a pointer to int

  2. c is a pointer to a function that takes no parameters and returns a pointer to a const pointer to char

  3. p is an array of 4 pointers to functions that take a pointer to char and return a pointer to an array of pointers to char

  4. s is a function that takes two parameters (an int, and a pointer to a function that takes an int and returns void) and returns a pointer to a function that takes an int and returns void

  5. f is a function that takes an int parameter and returns a pointer to a function that takes an int parameter and returns a pointer to void

  6. x is a volatile pointer to a const pointer to a pointer to char

  7. f is a two-dimensional array (second dimension is 4) of pointers to pointers to functions that take no parameters and return a pointer to an array of pointers to char
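The declaration in exercise 4 has the same shape as the standard library's signal function, and a typedef makes it far easier to read. The sketch below declares s both ways; the compiler accepts the second declaration only if both describe the same type. The typedef handler_fn, the functions on_event and call_returned_handler, and the body given to s are hypothetical.

```c
#include <assert.h>   /* static_assert (C11) */

typedef void handler_fn(int);   /* function taking an int and returning void */

void (*s(int, void (*)(int)))(int);   /* exercise 4, as written */
handler_fn *s(int, handler_fn *);     /* the same declaration via the typedef */

static_assert(_Generic(&s, handler_fn *(*)(int, handler_fn *): 1, default: 0),
              "both spellings name one type");

static int last_code;

static void on_event(int code) { last_code = code; }

/* Hypothetical definition: simply hands back the handler it was given. */
handler_fn *s(int code, handler_fn *h)
{
    (void)code;
    return h;
}

int call_returned_handler(void)
{
    s(0, on_event)(9);   /* call s, then call the function it returns */
    return last_code;
}
```

The typedef version reads almost like the English answer: s takes an int and a pointer to a handler, and returns a pointer to a handler.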

