Unique and Distinct Values Using Arrays
Learn to create a program that searches for unique and distinct elements within an array.
In this lesson, we’ll learn about two special kinds of elements.
The terms “unique” and “distinct” are used to describe the elements of a set or collection.
A “unique” element is an element that appears only once in a set or collection. If an element appears more than once in a set, it is not considered to be unique.
A “distinct” element is a value that is counted once, no matter how many times it appears in a set or collection. Listing the distinct elements means listing every different value exactly once, so a value that appears five times still contributes a single distinct element.
In other words, all unique elements are distinct, but not all distinct elements are unique.
For example, consider the set [1, 2, 3, 4]. Elements 1, 2, 3, and 4 are all unique and distinct elements of the set. If we add element 4 again, the set becomes [1, 2, 3, 4, 4]. In this set, the elements 1, 2, 3, and 4 are still distinct, but 4 is no longer unique because it appears twice in the set.
Finding unique/distinct elements in data
In this lesson, we're given some data and need to find all of its unique and distinct elements. Along with the distinct elements, we need to compute each element's frequency and display a histogram in which each element's frequency is shown as a row of asterisks.
Sample program
The data:
D: = {
1 2 3 4 3 5 6 1 2 2
9 9 2 7 7 8 2 3 1 2
3 1 5 6 9 1 2 3 1 5
6 1 1 2 9 9 2 7 7 8
2 3 1 2 3 5 5 6 9 1
2 3 8 5 6 1 1 2 9 9
2 7 7 8 3 3 1 2 3 6
5 6 9 1 2 3 2 5 6 1
1 2 9 9 2 7 7 8 3 3
1 2 3 1 5 6 9
}
Unique:
Us: = {
4
}
Distinct:
Ds: = {
1 2 3 4 5 6 9 7 8
}
Frequencies:
Fs: = {
18 20 15 1 9 9 12 8 5
}
__________________________________________
The plot:
1 ****************** 18
2 ******************** 20
3 *************** 15
4 * 1
5 ********* 9
6 ********* 9
9 ************ 12
7 ******** 8
8 ***** 5
Instruction: Write the code in the following playground. The main() function for testing is already provided.
#include <iostream>
#include <math.h>
using namespace std;

void printSpecial(const char Msg[], int D[], int size);
void findAllUniques(int D[], int N, int Us[], int &USize);
void findAllDistincts(int D[], int N, int Ds[], int &DSize);
void calculateDistinctsFrequency(int D[], int N, int Ds[], int &DSize, int Fs[]);
void displayDistinctsFrequencyGraph(int Ds[], int &DSize, int Fs[]);

int main()
{
    const int capacity = 100;
    int D[] = { 1,2,3,4,3,5,6,1,2,2,
                9,9,2,7,7,8,2,3,1,2,
                3,1,5,6,9,1,2,3,1,5,
                6,1,1,2,9,9,2,7,7,8,
                2,3,1,2,3,5,5,6,9,1,
                2,3,8,5,6,1,1,2,9,9,
                2,7,7,8,3,3,1,2,3,6,
                5,6,9,1,2,3,2,5,6,1,
                1,2,9,9,2,7,7,8,3,3,
                1,2,3,1,5,6,9 };
    // The sizes start at 0 so the program is safe to run before the
    // functions below are implemented.
    int Us[capacity], USize = 0, Fs[capacity], Ds[capacity], DSize = 0;
    int N = sizeof(D) / sizeof(int); // computed instead of counting the elements by hand

    cout << "The data:\n";
    printSpecial("D:\t", D, N);
    cout << endl << endl;

    findAllUniques(D, N, Us, USize);
    cout << "Unique:\n";
    printSpecial("Us:\t", Us, USize);

    findAllDistincts(D, N, Ds, DSize);
    cout << "Distinct:\n";
    printSpecial("Ds:\t", Ds, DSize);

    calculateDistinctsFrequency(D, N, Ds, DSize, Fs);
    cout << "Frequencies:\n";
    printSpecial("Fs:\t", Fs, DSize);

    cout << "__________________________________________\n";
    cout << "The plot:\n";
    displayDistinctsFrequencyGraph(Ds, DSize, Fs);
    return 0;
}

// Returns how many times T occurs in the first N elements of D.
int frequency(int D[], int N, int T)
{
    int f = 0;
    for (int di = 0; di < N; di++)
        if (D[di] == T)
            f++;
    return f;
}

void findAllUniques(int D[], int N, int Us[], int &USize)
{
    // Write code here.
}

void findAllDistincts(int D[], int N, int Ds[], int &DSize)
{
    // Write code here.
}

void printSpecial(const char Msg[], int D[], int size)
{
    cout << Msg << " = { ";
    for (int i = 0; i < size; i++)
    {
        cout << D[i] << " ";
    }
    cout << " }" << endl;
}

void printASymbolKTimes(char sym, int k)
{
    for (int i = 1; i <= k; i++)
        cout << sym;
}

void calculateDistinctsFrequency(int D[], int N, int Ds[], int &DSize, int Fs[])
{
    for (int di = 0; di < DSize; di++)
    {
        Fs[di] = frequency(D, N, Ds[di]);
    }
}

void displayDistinctsFrequencyGraph(int Ds[], int &DSize, int Fs[])
{
    for (int di = 0; di < DSize; di++)
    {
        cout << Ds[di] << "\t";
        printASymbolKTimes('*', Fs[di]);
        cout << "\t" << Fs[di] << "\t" << endl;
    }
}
Finding the unique values in an array
In the given data, the elements with a frequency of only one are the unique values.
To find all the unique values, let's write the following function:
void findAllUniques(int D[ ], int N, int Us[ ], int &USize)
This function iterates through the array D[] of size N, stores every unique value from the data in the array Us[], and sets USize to the number of values stored in Us[].
Implementation for finding uniques in the data
The idea is that for each element D[di] in the data, we look up its frequency in the entire data (by calling the frequency function), and if it occurs only once, we save it in the unique set.
Here’s its implementation:
void findAllUniques(int D[], int N, int Us[], int &USize)
{
    USize = 0;
    for (int di = 0, ui = 0; di < N; di++)
    {
        if (frequency(D, N, D[di]) == 1) // D[di] is unique
        {
            Us[ui] = D[di], ui++; // Short form: Us[ui++] = D[di];
            USize++;
            // Because ui always equals USize here, the two statements above
            // can be replaced by the single statement Us[USize++] = D[di];
        }
    }
}
Finding the distinct values in an array
In the data, the distinct values are the different values that occur at least once. In the list of distinct values, each value appears exactly once, regardless of how many times it occurs in the data.
Implementation of finding all distinct values in the data
To compute the distinct elements, we'll build an array of distinct values, Ds[]. For every element D[i] of the data, we first check that D[i] is not already present in Ds[] (by checking that the frequency of D[i] inside Ds[] is zero). In that case, we append D[i] to the end of Ds[], at index DSize, and increase DSize by one.
Here’s the implementation:
void findAllDistincts(int D[], int N, int Ds[], int &DSize)
{
    DSize = 0;
    for (int di = 0; di < N; di++)
    {
        // If D[di] is not yet present in the Ds array
        if (frequency(Ds, DSize, D[di]) == 0)
        {
            // add it to the distincts array Ds
            Ds[DSize] = D[di], DSize++;
        }
    }
}
Exercise: Finding distinct values frequencies
To find the frequency of each distinct element in the data, we should make the following function:
void calculateDistinctsFrequency(int D[], int N, int Ds[ ], int &DSize, int Fs[ ]);
int D[] is the data in which we search for the frequency of each element.
int N is the size of the data.
int Ds[] is the array of distinct values already computed by findAllDistincts().
int DSize is the size of Ds[].
int Fs[] is the array we need to compute (of size DSize) such that each Fs[i] holds the frequency of Ds[i] in the data D[].
To compute Fs[i], the frequency of the element Ds[i] in the data, we should call the frequency function.
Instruction: Write the implementation of this function in the above playground.
Exercise: Displaying distinct elements/frequency graph
Now, to display the frequency of distinct elements in terms of asterisks, we are required to make the following function:
void displayDistinctsFrequencyGraph(int Ds[ ], int &DSize, int Fs[ ]);
int Ds[] is passed because the graph displays each distinct element next to its frequency.
int DSize is the number of distinct elements in the data (that is, the size of both Ds[] and Fs[]).
int Fs[] holds the frequency of each distinct element in the data. Each pair Ds[i], Fs[i] gives a distinct element and its frequency; the two are kept in sync by index but stored in separate arrays.