Solution Review: Find the Highest Achiever

In this review, we give a detailed analysis of the solution to the problem of finding the highest achiever.

Solution: Merging Data Frames

findTopper <- function() 
{
    findIndexWithMaxNum <- function(myVector) # helper function that returns the index 
                                        # of the element that has the highest value
    {
        maxNumber = -Inf # We want this to be the lowest possible value for comparison
        maxIndex = 0
        index = 1
        for(i in myVector)
        {
            if(maxNumber < i)
            {
                maxNumber = i # set the max element
                maxIndex = index # set the max element's index
            }
            index = index + 1
        }
        return(maxIndex) # return the index of the max element
    }
    # MAIN FILE HANDLING CODE
    mathData = read.csv("math.csv") # fetch data from math.csv
    englishData = read.csv("english.csv") # fetch data from english.csv
    scienceData = read.csv("science.csv") # fetch data from science.csv
    tempData <- merge(mathData, englishData) # we use the merge function on data frames
    finalData <- merge(tempData, scienceData) # another merge function to merge the remaining data frame
    print(finalData)
    result <- vector("numeric", 0) # vector to store the total marks of each student
    for(student in 1:nrow(finalData)) # loop over all the rows/students (length() on a data frame would count columns)
    {
        temp <- 0.0 # temporarily stores the total marks of the current student
        for(i in 2:ncol(finalData)){ # loop over all the columns (math, english, science)
        # We iterate from 2 to ncol(finalData) because the 1st column is just names of students
            temp <- temp + as.double(finalData[student, i]) # fetch respective student's marks
        }
        result <- c(result, temp) # store the total marks of the current student
    }
    return(findIndexWithMaxNum(result)) # return the index of the highest scoring student
}
# Driver Code
findTopper()

Explanation

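Execution begins at the driver code on the last line of the listing, where findTopper() is called.

Inside the function, the real work begins at the comment # MAIN FILE HANDLING CODE. The nested helper findIndexWithMaxNum() above it is only defined at that point; it is called at the very end.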

Steps Performed:

Reading the files (the three read.csv() calls): Each subject file is read into its own variable: mathData, englishData, and scienceData.

Remember, read.csv() returns a data frame, so these three variables are already data frames.
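
As a quick check of the point above, the following sketch parses CSV content and inspects what read.csv() returns. The contents of math.csv are not shown in this lesson, so the column names and values here are hypothetical, and the text argument is used only so that the example runs without a file on disk.

```r
# Hypothetical CSV content; the real math.csv is not shown in this lesson.
csvText <- "name,math
Alice,90
Bob,75"

# The `text` argument makes read.csv() parse a string instead of a file.
mathData <- read.csv(text = csvText)

print(class(mathData))          # the class of the returned object
print(is.data.frame(mathData))  # TRUE: no conversion step is needed
str(mathData)                   # shows the columns and their types
```

Because the result is already a data frame, it can be passed straight to functions such as merge(), nrow(), and ncol().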

Merging (the two merge() calls): Merge the three data frames into one. Because merge() combines two data frames at a time, the merge is done in two steps. First, mathData and englishData are merged and the result is saved in tempData. Then tempData and scienceData are merged and the result is saved in finalData.

Now that all the data is compiled in the single data frame finalData, we can begin our analysis.
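
The two-step merge can be tried on small in-memory data frames. This is a minimal sketch: the student names and marks are made up, and the shared column is assumed to be called name. The Reduce() call at the end is an alternative way to chain the merges, not the approach used in the solution.

```r
# Hypothetical sample data standing in for math.csv, english.csv and science.csv.
mathData    <- data.frame(name = c("Alice", "Bob", "Carol"), math    = c(90, 75, 82))
englishData <- data.frame(name = c("Alice", "Bob", "Carol"), english = c(85, 80, 91))
scienceData <- data.frame(name = c("Alice", "Bob", "Carol"), science = c(88, 70, 95))

# Two-step merge, as in the solution. With no `by` argument, merge() joins on
# the columns that both data frames have in common (here: name).
tempData  <- merge(mathData, englishData)
finalData <- merge(tempData, scienceData)

# Alternative: fold merge() over a list, which performs the same two calls.
finalData2 <- Reduce(merge, list(mathData, englishData, scienceData))

print(finalData)
print(identical(finalData, finalData2))
```

Note that merge() performs an inner join by default, so a student who is missing from any one of the files would not appear in finalData.
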
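Totaling the marks (the nested for loops): We use a nested for loop to iterate over the whole data frame. The outer loop:

for(student in 1:nrow(finalData))

keeps track of the rows/students, so its bound is the number of rows. Note that length(finalData) returns the number of columns of a data frame, not the number of rows. In this data set both values are $4$, so either bound runs the loop from student $1$ to student $4$.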
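
The bound of the outer loop deserves a closer look, because length() and nrow() measure different things on a data frame. The sketch below uses a hypothetical data frame with two students and four columns so that the two values differ.

```r
# Hypothetical data frame: 2 students, 4 columns (name + three subjects).
scores <- data.frame(
    name    = c("Alice", "Bob"),
    math    = c(90, 75),
    english = c(85, 80),
    science = c(88, 70)
)

print(length(scores))  # 4: a data frame is a list of columns
print(ncol(scores))    # 4: same value as length()
print(nrow(scores))    # 2: the number of students

# Looping over rows: seq_len() also behaves correctly for an empty data frame.
for (student in seq_len(nrow(scores))) {
    print(scores[student, "name"])
}
```

A loop bounded by length() visits the right rows only when the number of students happens to equal the number of columns.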

The inner loop:

for(i in 2:ncol(finalData))

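iterates over all the columns that hold marks (math, english, science). We start at column $2$ and go up to ncol(finalData) because the first column holds only the students' names. Inside the inner loop, the marks for each subject are added to temp, which gives the total for the current student; that total is then appended to result. Finally, result is passed to findIndexWithMaxNum(), which returns the index of the student with the highest total.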
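
To see what the nested loop computes, the sketch below runs it on hypothetical data and compares the result with a vectorized alternative built from rowSums() and which.max(). The alternative is shown only for comparison; it is not the method used in the solution.

```r
# Hypothetical merged data: one name column followed by three subject columns.
scores <- data.frame(
    name    = c("Alice", "Bob", "Carol"),
    math    = c(90, 75, 82),
    english = c(85, 80, 91),
    science = c(88, 70, 95)
)

# Loop version: for each student (row), add up columns 2..ncol.
totals <- numeric(nrow(scores))
for (student in seq_len(nrow(scores))) {
    for (i in 2:ncol(scores)) {
        totals[student] <- totals[student] + as.double(scores[student, i])
    }
}

# Vectorized alternative: sum every row after dropping the name column.
vectorTotals <- unname(rowSums(scores[, -1]))

# which.max() returns the index of the first maximum.
topIndex <- which.max(vectorTotals)

print(totals)
print(vectorTotals)
print(scores$name[topIndex])
```

Like the helper in the solution, which.max() returns the first index when several students share the highest total.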
