Arrays Boundary, Capacity and Size, CString, and Filing
Learn about what arrays are, why we need them, and how we can and cannot use arrays.
Accessing arrays
In this lesson, we will learn what happens if we access an array outside its boundaries. We will also learn about character arrays and how they are handled differently. Lastly, we will discuss reading data from files instead of cumbersomely initializing it ourselves.
Accessing arrays out of bound
Let’s start by asking ourselves a few questions.
What if we try to access an array index that is out of bounds for our array? What would happen if we mistakenly modify the memory indexed outside the array size?
Note that a valid array index can’t be negative: for an array of size n, the valid indexes are 0 through n-1.
See the program below as an example.
#include <iostream>
using namespace std;

int main()
{
    int A[5]={}; // array of size 5
    for(int i=0; i<10000; i++) // accessing invalid array indexes (after index 4)
    {
        /* logical error since we are trying to access
           indexes beyond the array boundaries */
        cout<<A+i<<": "<<"A["<<i<<"]: "<< A[i]<<"\n";
    }
    return 0;
}
After running the code above, we saw that, though we were accessing indexes outside of the array (after index 4), the code did not give us any error.
However, this is a logical error.
The code above printed the five addresses and the 0s stored there, but then it kept going and printed the addresses that follow and the values stored at those addresses, even though these memory locations are past the end of our array!
We are not supposed to access or modify the memory that is not part of our array.
C++ doesn’t necessarily warn us against such errors, which is why these errors can be very dangerous.
Some programming languages, like Python and Java, protect against out-of-bounds access and throw an error when it happens. C++ offers no such protection for plain arrays, so we must always be careful about array boundaries ourselves.
See the animation below to understand this undefined behavior. The red memory addresses represent the addresses that are out of bounds.
In the code widget above, we attempted to access memory outside of its permitted range. Let us try the same code as above on a different compiler. The code widget below uses a different compiler (and it has a strict policy of not allowing illegal memory access beyond the allocated space). Run the code and see the result.
#include <iostream>
using namespace std;

int main()
{
    int A[5]={}; // array of size 5
    for(int i=0; i<10000; i++) // accessing invalid array indexes (after index 4)
    {
        /* logical error since we are trying to access
           indexes beyond the array boundaries */
        cout<<A+i<<": "<<"A["<<i<<"]: "<< A[i]<<"\n";
    }
    return 0;
}
This indicates that printing elements from out-of-bounds memory locations in an array can result in unpredictable behavior. It may cause the program to crash immediately with a segmentation fault, or it may permit additional memory accesses before eventually leading to a crash.
Let us see how the GDB debugger behaves if we access the invalid indexes and modify the values.
Run and observe the program below as an example.
#include <iostream>
using namespace std;

int main() // logical error
{
    int A[5]={};
    for(int i=0; i<100; i++) // error because we are trying to access beyond boundaries
        A[i] = 0; // "*** stack smashing detected ***: terminated"
    return 0;
}
When we access out-of-bounds memory and modify the content stored there, whatever was already stored in those locations is overwritten.
As long as the writes land in memory that belongs to the program's stack, the contents are overwritten silently. Once the writes go far enough, the damage is detected (or the program touches memory it does not own), and our program crashes with the message *** stack smashing detected ***: terminated. In short, when modifying a large number of out-of-bounds indexes, our program will most likely crash; when modifying only a few invalid indexes, it may keep running while the values of our other variables are silently changed. Either way, the behavior is undefined.
This is why accidentally overwriting out-of-bounds memory is more dangerous than only reading it.
See the animation below to understand this undefined behavior.
We have a question for you. Solve the quiz below.
Given the code below, what will be its output?
#include <iostream>
using namespace std;
int main()
{
int A[5]={1,2,3,4,5,6};
for(int i=0; i<5; i++)
cout <<A+i<<": "<< A[i]<<"\n";
return 0;
}
The program will print the five addresses and values 1, 2, 3, 4, 5.
The program will not compile because there is a syntax error.
Setting array size and capacity
Suppose we ask the user for the size of the array, store it in the size variable, and allocate an array of that variable size:
int size;
cout<< "The size of your array: "<< endl;
cin>>size;
int A[size]; // logical error
This is an error because standard C++ does not let us create a variable-sized array: the size must be a constant known at compile time.
If we run the code above with a strict compiler, we get an error. Some compilers (GCC and Clang, for example) accept it as a non-standard extension and may not even warn, but that does not make the code valid C++.
Click “Run” to see what we mean.
#include <iostream>
using namespace std;

int main()
{
    int size;
    cout<< "The size of your array: "<< endl;
    cin>>size;
    int A[size]; // logical error
    return 0;
}
While it may be convenient to create an array of exactly the size that we need, C++ does not allow variable-size arrays. The full reason is beyond the scope of this course. In short, before execution, the compiler should know exactly how much memory each block of code requires, including functions and all the control structures like if, if-else, for, while, and so on; hence, an array whose size is only known at runtime is not possible.
Still, if we need an array whose size varies from run to run, one precaution is to make the array larger than any size we expect to need. Say the user needs an array of size 50 or 75, but never more than 100; then we can give the array a fixed, larger capacity (say 100) and use only part of it.
For example, look at the code below. In line 7, we have created a variable capacity and set it to 100. Note that we used the keyword const with the variable capacity. The const keyword tells the compiler that the variable will not be modified after initialization, which prevents accidental changes to a value that should stay fixed. It also makes capacity a compile-time constant, which is what allows us to use it as an array size.
In line 8, we then declare an array of size capacity.
Run and observe the program below.
#include <iostream>
using namespace std;

int main()
{
    // larger capacity of our array
    const int capacity=100;
    int A[capacity]={};
    // expected size of the array
    int size;
    cin>>size; // Use only size many entries...
               // the rest will be wasted
    for(int i=0; i<size; i++)
    {
        A[i] = i*10; // do whatever
    }
    for(int i=0; i<size; i++)
    {
        cout << A+i << ": A["<<i<<"]: "<<" "<<A[i] << "\n";
    }
    return 0;
}
Instruction: Add size in the input stream (below 100). Execute the code and see what the array is after initialization.
Dynamic allocation is a technique that lets us use a variable within the square brackets. It specifies the size of an array at runtime by allocating memory during execution through particular functions or keywords, which vary by programming language (in C++, the new and delete keywords).
Character arrays
C++ handles character arrays in a special way. Before we talk about this specialty, let us see how to create and initialize character arrays.
Declaration and initialization
Look at the following codes and go through each one by one.
#include <iostream>
using namespace std;

int main()
{
    char A[100]; // A contains garbage values
    for(int i=0; i<5; i++) // will print 5 arbitrary characters
        cout << A[i] << endl; // garbage values
    cout <<endl;
    return 0;
}
There is nothing new inside the code of the “cStringBasics” tab. A character array is declared, and it initially holds garbage characters. Just like an integer array, a character array can be displayed element by element with a loop.
The second example, inside the “cStringBasicsSpecialPrint” tab, is a bit different. An array is initialized with only three characters. The remaining two characters are automatically set to zero (which is the ASCII value of '\0'). Whenever the address of a character is sent to a character stream, cout does not print the address; instead, it prints character by character, starting at that address, until it reaches the null character '\0'. Hence, line 14 will print “ant”, line 18 will print “nt”, line 21 will print “t”, and line 24 will print “ant”.
In the third example, inside the “sizeOfCString” tab, a character array of size 4 is declared, where A[3] = '\0' is added automatically. If we apply sizeof(A), it counts the null character in the allocated size of the array too. The double quotes in line 6 of the code indicate that the value being assigned to the character array A is a string literal: a sequence of characters enclosed in double quotes and stored in memory as an array of characters with a null terminator at the end. Here, the string "ant" is being assigned to the character array A.
These null-terminated character arrays are called cstrings.
In C++, a cstring is a null-terminated character array, while a string is a general term that refers to any sequence of characters. We will look further into the details and solve several applications in the next chapter.
Reading data from the file
To read data from a file, we need to use #include<fstream>. After that, we create an ifstream variable (usually called an object) and initialize it with the file name (as a cstring), written like a function call.
ifstream Reader("fileName.ext");
Reader (you can use any name) acts like cin, but instead of reading input from the console, it reads from a file. It functions as a file stream, which takes data from the file and reads it character by character into the code’s variables, like cin. An example of file reading is shown here.
#include <iostream>
#include <fstream>
using namespace std;

int main()
{
    const int capacity=100;
    int size;
    int A[capacity]={};
    char B[capacity], asymbol, C[capacity];

    ifstream Rdr("Data.txt");
    Rdr>>size; // this will read 10 into size
    for(int i=0; i<size; i++) // loop will run 10 times
        Rdr>>A[i]; // it will read 10 20 30 40 50 60 70 80 90 100
                   // note that it ignores new line

    Rdr>>B; /* this will read "Hello": again, character arrays are read directly without
               any loop; reading stops when the next character in the stream is
               ' ' or '\n' or '\t' */

    Rdr>>asymbol; // it will ignore space ' ' and read 'T'
    Rdr>>C; // it will read "uring!" (everything up to the next whitespace)

    cout << "size: "<<size << endl;
    for(int i=0; i<size; i++)
        cout << "A["<<i<<"]="<< A[i] << " ";
    cout << endl;
    cout << "B[] = \""<<B<<"\""<<endl;
    cout << "asymbol: '"<<asymbol<<"'"<<endl;
    cout << "C[] = \""<<C<<"\""<<endl;
    return 0;
}
- In line 7, const int capacity=100 sets a constant integer variable capacity to a value of 100.
- In line 9, int A[capacity]={}; declares an integer array A with a capacity of 100 and initializes all elements to 0.
- In line 12, an input file stream object Rdr is created that is associated with the file “Data.txt”.
  - Inside the “Data.txt” file, the first value 10 represents the number of numeric values that follow, which are 10 20 30 40 50 60 70 80 90 100. This is then followed by the text Hello Turing! Thank you for everything!!!. The variable size controls the loop that reads the remaining numbers, so the program needs to know how many values to store in the array A before it reads them. That is why the first line in the data file is the size of the array A.
- In line 13, the first value 10 in the file is read and stored inside the variable size. The operator >> in the code Rdr>>size; extracts the next value from the input file stream Rdr and stores it in the variable size. For an integer, the operator >> skips leading whitespace characters and then reads digits until it encounters a non-numeric character. Here, the first value in the file is 10, so reading stops at the first non-numeric character after it, which is a newline. That is why only one value is read.
- In line 15, in each iteration of the for loop, which runs 10 times, the next value in the file is read and stored in the current element of the array A. The newline character left behind by the previous read counts as whitespace, so the operator >> skips it before reading the next number.
- In line 18, the next string in the file is read and stored in the character array B.
- In line 22, the next character in the file is read and stored in the character variable asymbol.
- In line 23, the next string in the file is read and stored in the character array C.
- In lines 26–28, the values in the array A are printed to the console.
Instruction: Open and look at the “Data.txt” file and then at the code and how the data is read instruction by instruction. There is no difference between the cin and ifstream readers, except for the file stream initialization process (because it needs to be initialized with the file name).
Summary
We hope you’ve learned how arrays can be declared, initialized, and easily traversed to find any of their elements. We also looked at character arrays and how C++ treats null-terminated cstrings specially.
In the upcoming lessons, we will solve several challenging problems related to arrays.