The fork-exec idiom in Linux

Whenever we want to run some program, be it on our local computer/mobile phone or in the cloud, an entity called a process is created to fulfill the need. A process is an operating system's abstraction that provides two primary components:

Processor virtualization: An operating system provides virtual processor(s) to a user program, giving it the illusion that it has a dedicated processor on which to run its code.
Fault-tolerant execution: The operating system ensures that multiple processes, possibly of different users, can coexist amicably. A crash of one process should not impact other independent processes or the operating system.

In an operating system based on the Linux kernel, a process is created with the fork system call, and a new program is loaded and executed with an exec system call. Combining the two, known as fork-exec, is an elegant way to run user code. In this blog, we will learn about the fork-exec idiom.

Process creation is a fundamental concept, and questions related to the fork and exec system calls often come up in phone screening interviews. Typical questions run along the following lines:

How is an exec call different from a fork call?
Can we call exec first and then a fork call?

You will be able to easily answer these questions by the end of this blog.

High-level idea#

In Linux, the root of the process hierarchy is the initial or “root” process. This process is often assigned the process ID (PID) 1 and is traditionally named init. In modern Linux distributions that use systemd as the init system, the initial process may be named systemd or init depending on the distribution.

The root process, whether named init or systemd, serves as the ancestor of all other processes on the system. It uses fork-exec to create all the other processes. It’s responsible for initializing the system, launching various system services and daemons, and ensuring that essential components of the operating system are started.

The root process is fundamental to the overall operation of the Linux system and serves as the parent or ancestor of all other user and system processes. The following illustration shows the process tree of a Linux-based container on Educative’s platform. The process tree was generated using a program called htop. We can see that the htop program with process ID 346 is the child of the bash shell with process ID 301, which in turn is the child of the first process with process ID 1. Here, the first process is neither init nor systemd; rather, it is a shell process initiated by the container sub-system.

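The fork system call makes a copy of the parent process. The child gets its own copy of the parent’s memory (on Linux this is implemented with copy-on-write, so the two processes physically share pages until one of them writes to a page), and it inherits copies of the parent’s open file descriptors, which refer to the same open files. A newly spawned child process has two options: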

It can continue using the same code and other resources provided by the parent process. In this context, a child process is a way to generate parallelism where potentially, the parent and child processes are running concurrently.
The child uses the exec system call to load a new binary, effectively providing it its own resources and severing the sharing with its parent’s resources.

The following illustration shows a typical flow of parent and child processes. A process calls the fork() system call. If the call is successful, a new child process is created. At this point, two processes exist: the parent and the child. They can go about doing their own business. Once a child process is done with its work, it terminates itself using the exit call and a final status code. By convention, a status code of 0 means success, and any nonzero value indicates an error. The parent can retrieve a child process’s final status by using the wait system call.

The fork system call#

Arguments

The fork system call does not accept any arguments. It is called without any parameters.

Return value

In the parent process, the fork system call returns the process ID (PID) of the newly created child process.
In the child process, the fork system call returns 0.
If an error occurs during the fork operation, it returns -1.

Explanation

The fork system call is used to create a new process by duplicating the existing process. Here’s a step-by-step explanation of how it works:

When the fork system call is invoked, the operating system creates a new child process by duplicating the current (parent) process. This includes duplicating the entire process context, which includes the program counter, CPU registers, open-file descriptors, environment variables, and other relevant process attributes.
The new child process is essentially a copy of the parent process at the time of the fork call. Both the parent and child processes start executing code from the same point (the return from the fork call).
In the parent process, the fork system call returns the PID of the newly created child process. This PID is unique to the child process, and it is different from the parent’s PID.
In the child process, the fork system call returns 0. This allows the child process to identify itself as the child and perform different actions if needed.
If an error occurs during the fork operation (e.g., due to resource limitations or system constraints), it returns -1, indicating that the child process was not created.

Programmers can use the return value of the fork system call to differentiate between the parent and child processes and execute different code paths in each. This is a fundamental mechanism for creating new processes and enabling concurrent execution in Unix-like operating systems.

Let’s discuss a basic example of using the fork system call in C. The following program demonstrates the basic use of the fork system call to create a child process and print the PIDs of both the parent and child processes.

The exec family of system calls#

Here’s an explanation of the arguments and return value of the execve system call.

Arguments

pathname (const char *): This is a pointer to a string that specifies the path to the executable file we want to execute. The path can be absolute (e.g., “/bin/ls”) or relative to the current working directory. Unlike execvp and execlp, execve does not search the directories listed in the PATH environment variable, so a bare name such as “ls” works only if a file with that name exists in the current directory.
argv (char *const[]): This is an array of strings representing the command-line arguments passed to the new program. The last element of the array must be NULL to indicate the end of the argument list. Each string in the array represents an argument. The first element is typically the name of the program itself.
envp (char *const[]): This is an array of strings representing the environment variables to be passed to the new program. Each string has the format “NAME=VALUE” (e.g., “PATH=/bin:/usr/bin”), and the last element of the array must be NULL. To let the new program inherit the environment of the calling process, pass the global variable environ. Passing NULL for envp itself is not portable; on Linux, it is treated as an empty environment rather than as a request to inherit the current one.

Return value

The execve system call does not return if it is successful. Instead, it replaces the current process image with the specified program. If the execve system call encounters an error, it returns -1, and we can use perror or other error-handling mechanisms to diagnose the issue.

In summary, the execve system call is used to execute a new program, and it takes the program path, command-line arguments, and environment as its arguments. If successful, it replaces the current process with the new program and doesn’t return. If it fails, it returns -1 to indicate an error, and we can use error-handling techniques to handle the failure.

Explanation

The execve system call is used to replace the current process image with a new program. It allows us to specify the path to the executable program, command-line arguments, and the environment explicitly. Here’s a simple example in C that demonstrates how to use the execve system call:

C

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void) {
    // Path to the program you want to execute
    const char *programPath = "/bin/ls";
    // Command-line arguments; the list must end with NULL
    char *const programArgs[] = {"/bin/ls", "-l", "/", NULL};
    // Environment variables; this list is empty, so ls starts with no environment
    char *const programEnv[] = {NULL};
    // Use execve to replace the current process with the specified program
    execve(programPath, programArgs, programEnv);
    // If execve returns, an error has occurred
    perror("execve");
    return 1;
}

In this example:

The programPath is a string containing the path to the program we want to execute. In this case, it’s set to /bin/ls.
The programArgs is an array of strings that represents the command-line arguments to the program. The last element of the array must be NULL to signal the end of the argument list. In this example, we run the ls command with the -l and / arguments.
The programEnv is an array of strings that represents the environment variables. In this example, it contains only the terminating NULL, so the new program starts with an empty environment. To pass the current environment along instead, we would pass the global variable environ.
The execve function is called with the program path, command-line arguments, and the environment. If the execve function is successful, it replaces the current process with the specified program, and the code after the execve function is not executed.
If the execve function encounters an error and returns, perror is used to print an error message to the standard error output.

Note: Try running another program, let’s say cat. Make sure to replace the programPath with the path to the program you want to run and modify the programArgs and programEnv arrays as needed for your specific use case.

The execve system call is a powerful way to execute other programs from our code, and it’s commonly used in process management and system administration tasks.

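Getting the status of a child process#

A parent collects the termination status of a child with the wait system call, whose prototype is pid_t wait(int *status).

Arguments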

status: A pointer to an integer where information about the terminated child process is stored. This information includes the exit status and other termination details.

Return value

pid_t: The return type of wait is the PID (process ID) of the terminated child process. If an error occurs, it returns -1.
When the caller has at least one child, the wait system call returns the PID of a terminated child. If no child has terminated yet, it blocks until one does.
When the caller has no children to wait for (for example, a freshly forked child that calls wait without having forked anything itself), the wait system call returns -1 immediately and sets errno to ECHILD.

Behavior

If there are multiple child processes that have terminated, the wait system call returns the PID of one of them, but not necessarily in any specific order. We can use the waitpid function to wait for a specific child process by specifying its PID.
The status pointer is used to retrieve information about the terminated child process, including its exit status, termination reason, and other details. We can use macros like WIFEXITED and WEXITSTATUS to extract information from the status value.
In the context of the wait family of system calls in Unix-like operating systems, WIFEXITED and WEXITSTATUS are macros that are typically used to check and retrieve information about the termination status of a child process. Here’s what they mean:
- WIFEXITED:
  - This macro is used to check if a child process has terminated normally (i.e., it has exited).
- WEXITSTATUS:
  - This macro is used to retrieve the exit status of a child process that has terminated normally (when WIFEXITED is true). The exit status is a value that the child process returned when it exited.

Here’s a simple example of how the wait system call can be used in a C program:

C

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int main() {
    pid_t child_pid = fork();
    if (child_pid < 0) {
        perror("fork");
        exit(1);
    } else if (child_pid == 0) {
        // This code is executed in the child process.
        // Perform child process tasks here.
        exit(42);  // Exit with a status code (e.g., 42).
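    } else {
        // This code is executed in the parent process.
        int status;
        pid_t terminated_pid = wait(&status);  // Blocks until a child terminates.
        if (terminated_pid < 0) {
            perror("wait");
            exit(1);
        }
        if (WIFEXITED(status)) {
            printf("Child process %d terminated with exit status: %d\n",
                   (int)terminated_pid, WEXITSTATUS(status));
        }
        // Parent process continues here.
    }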
    return 0;
}
