
Morale Function and Model Error

Learn about the true (morale) function and the total (model) error.

Let’s use interp1d from scipy.interpolate to create an interpolation function that fills in values between the week and morale points. We pass kind='cubic', which selects cubic spline interpolation and gives a smoother curve than the default linear interpolation.

True function

In the code example below, week_points and morale_points hold the known values. interp1d uses them to build morale_func, which we then evaluate on days to get the true values in morale_true.

from scipy.interpolate import interp1d

# Build the interpolating function from the available data points
morale_func = interp1d(x=week_points, y=morale_points, kind='cubic')
# try kind='linear' (the default) and see the difference!
morale_true = morale_func(days)  # 84 days (defined above)
print("Number of data points in 'morale_true': {} for {} days".format(len(morale_true), len(days)))
# print(morale_true)
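
To see the effect of the kind parameter in isolation, here is a minimal, self-contained sketch. The values in week_points, morale_points, and days below are hypothetical stand-ins, not the lesson's actual data. It builds both a cubic and a linear interpolant and checks that each one passes through the original points.

```python
import numpy as np
from scipy.interpolate import interp1d

# Hypothetical stand-in data (NOT the lesson's values):
# one morale measurement every 7 days, from day 0 to day 84.
week_points = np.arange(0, 85, 7)  # 13 points: 0, 7, ..., 84
morale_points = np.array([6.0, 6.5, 7.2, 6.8, 5.9, 5.1, 4.6,
                          4.9, 5.7, 6.4, 7.0, 7.3, 7.1])
days = np.arange(84)  # 84 daily points, all inside [0, 84]

# Two interpolants built from the same data
cubic_func = interp1d(week_points, morale_points, kind='cubic')
linear_func = interp1d(week_points, morale_points, kind='linear')

morale_cubic = cubic_func(days)
morale_linear = linear_func(days)

# Both interpolants reproduce the original measurements exactly;
# they only differ in how they fill the gaps between them.
print(np.allclose(cubic_func(week_points), morale_points))
print(np.allclose(linear_func(week_points), morale_points))
print("Largest gap between cubic and linear:",
      np.max(np.abs(morale_cubic - morale_linear)))
```

Note that interp1d only evaluates inside the range of the input x values by default, which is why days stays within the first and last entries of week_points.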

Let’s plot the data points that we passed to interp1d on the left and the interpolated values of our true function on the right, with morale plotted against days. The trend should be the same in both plots, but the interpolated plot should be smoother because it has a morale value for every day.

import matplotlib.pyplot as plt

# Try this: the trend is the same as in the interpolated plot, but with fewer points.
# plt.plot(week_points, morale_points);
# By this stage, the code below should be self-explanatory!
fig1, (ax1, ax2) = plt.subplots(ncols=2, figsize=(16, 6), sharey=True)
# Available data
ax1.plot(week_points, morale_points, lw=5.0, c='r', alpha=0.7, label='true function')
ax1.scatter(week_points, morale_points, s = 100)
# Interpolated data
ax2.plot(days, morale_true, lw=5.0, c='r', alpha=0.7, label='true function')
ax2.scatter(days, morale_true, s = 100)
# Setting title, labels ...... etc!
ax1.set_title('\nMorale over time (available data)\n')
ax2.set_title('\nMorale over time (interpolated data)\n')
ax1.set_xlabel('days\n')
ax2.set_xlabel('days\n')
ax1.set_ylabel('morale\n')
ax1.legend(loc=2)  # 'upper left'
ax2.legend(loc=2)  # 'upper left'
plt.tight_layout();

Our true function for morale can have the following interpretations.

  • With no measurement error:

    • Every student has the same morale at each time point (day), and the function contains no measurement error. This is the situation where our measurement tool or survey is perfect and records the same morale for every student at each time point.

    • What if unavoidable issues in the instrument randomly added some noise to each observation?

    • What if some external factor (such as the weather) affected the measurement tool on a given day and added an unavoidable, irreducible error?

  • With no individual variance:

    • We can interpret our true function as the baseline morale at each time point, with all students varying around this function to some degree (±). A student’s morale at any given time point is baseline ± deviation. However, there is no variance within an individual student: for a particular student, the morale line is just the baseline shifted by a fixed offset. This might mean that all the variation is anchored to the baseline.

    • Just a heads-up: while generating the data for an individual student, we’ll add some random noise to the true function to create some individual variance.

  • Average or mean across infinite students:

    • Our measurements of morale vary at each time point for an individual student. Still, if we had an infinite number of students and averaged their morale measurements at each time point, we would recover the true function of morale. With a finite number of students, we may have to account for high variance, or we may not be able to quantify the relationship accurately.
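
The averaging interpretation can be checked numerically. The sketch below is illustrative only: it assumes a hypothetical sine-shaped morale_true curve and Gaussian noise with a made-up standard deviation, neither of which comes from the lesson's data. It shows that the mean curve over more and more simulated students moves closer to the true function.

```python
import numpy as np

rng = np.random.default_rng(0)
days = np.arange(84)

# Hypothetical smooth "true" morale curve (an assumption for illustration)
morale_true = 5 + 2 * np.sin(2 * np.pi * days / 84)

def simulate_students(n_students, noise_sd=1.0):
    """Return an (n_students, n_days) array: true function plus random noise."""
    noise = rng.normal(0.0, noise_sd, size=(n_students, days.size))
    return morale_true + noise

# Largest distance between the averaged curve and the true function
gaps = {}
for n in (1, 100, 10_000):
    mean_curve = simulate_students(n).mean(axis=0)
    gaps[n] = np.max(np.abs(mean_curve - morale_true))
    print(f"{n:>6} students: max |mean - true| = {gaps[n]:.3f}")
```

With a single student, the curve is dominated by noise. With many students, the noise averages out and the mean curve approaches morale_true.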

In the situations above, we are trying to interpret morale as a function of time with no error. However, each situation points to a different source of error.

  1. Irreducible error: Occurs because we can’t measure morale perfectly, for unavoidable reasons.

  2. Bias error: Occurs when the model captures the relationship between time and morale imperfectly.

  3. Variance error: Occurs when there isn’t enough good data to correctly quantify the relationship(s).

These sources of errors combine, resulting in the final error in our trained model.

Note: We always have errors in our models. What matters is how large the total error is and what proportion comes from each type. We can trade bias against variance to find the sweet spot, but we can’t do anything about the irreducible error.
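
For squared error, the standard bias-variance decomposition says that the expected MSE equals bias² + variance + irreducible noise. The sketch below checks this numerically. It is a simulation under stated assumptions, not this lesson's data or model: the sine-shaped f_true, the noise level sigma, and the use of polynomial fits are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0, 1, 30)
f_true = np.sin(2 * np.pi * x)  # hypothetical true function
sigma = 0.3                     # assumed noise std (irreducible error = sigma**2)
n_trials = 2000                 # number of independently drawn training sets

def decompose(degree):
    """Estimate bias^2, variance, and expected MSE for a polynomial model."""
    preds = np.empty((n_trials, x.size))
    for t in range(n_trials):
        y_train = f_true + rng.normal(0.0, sigma, x.size)  # fresh noisy sample
        coefs = np.polyfit(x, y_train, degree)
        preds[t] = np.polyval(coefs, x)
    bias_sq = np.mean((preds.mean(axis=0) - f_true) ** 2)
    variance = np.mean(preds.var(axis=0))
    # Error measured against new noisy observations at the same x values
    y_new = f_true + rng.normal(0.0, sigma, size=preds.shape)
    mse = np.mean((y_new - preds) ** 2)
    return bias_sq, variance, mse

results = {d: decompose(d) for d in (1, 3, 5)}
for d, (b, v, m) in results.items():
    print(f"degree {d}: bias^2={b:.4f} variance={v:.4f} "
          f"sum+noise={b + v + sigma**2:.4f} mse={m:.4f}")
```

In this setup, the low-degree model has high bias and low variance, the higher-degree model has the opposite profile, and in each case the three terms add up to roughly the measured MSE.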

Total error

The total error in a model combines the three sources listed above: irreducible error, bias error, and variance error.

We try to pin down where these different contributions to our model’s error come from, and we look at the average error (MSE) that we expect to observe across all samples and data points given to a particular model for training. Want to know a little more about the relationship above? Here is the typical formula, ...
