In psychology and the social sciences, Cronbach’s alpha is the most widely used indicator of scale reliability. Its value usually falls between 0 and 1, with higher values indicating greater reliability. Popular data science libraries such as Sklearn, Pandas, and NumPy do not offer a built-in Cronbach’s alpha measurement.
Learning how our clients feel about products can be beneficial in a business setting. Let’s say a company manager wants to assess overall customer satisfaction, so he sends a survey to 10 customers and asks them to score the company on a scale of 1 to 3 in several areas. We’ll take the survey responses, load them into a data frame, and calculate Cronbach’s alpha to check how reliably the survey captures customers’ attitudes towards the product.
Internal consistency refers to how closely the items in a survey, poll, or test agree with one another, that is, how well they measure the same underlying thing. The higher the internal consistency, the more confident we can be that our survey is reliable.
| Cronbach's Alpha | Internal Consistency |
| --- | --- |
| 0.9 ≤ α | Excellent |
| 0.8 ≤ α < 0.9 | Good |
| 0.7 ≤ α < 0.8 | Acceptable |
| 0.6 ≤ α < 0.7 | Questionable |
| 0.5 ≤ α < 0.6 | Poor |
| α < 0.5 | Unacceptable |
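The thresholds in the table above can be expressed as a small lookup. The sketch below is only an illustration: the function name `interpret_alpha` is hypothetical and not part of any library.

```python
def interpret_alpha(alpha: float) -> str:
    """Map a Cronbach's alpha value to the rule-of-thumb label from the table."""
    # Lower bounds are checked from highest to lowest, so the first match wins.
    thresholds = [
        (0.9, "Excellent"),
        (0.8, "Good"),
        (0.7, "Acceptable"),
        (0.6, "Questionable"),
        (0.5, "Poor"),
    ]
    for lower_bound, label in thresholds:
        if alpha >= lower_bound:
            return label
    # Anything below 0.5 falls through to the last row of the table.
    return "Unacceptable"


print(interpret_alpha(0.75))  # Acceptable
```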
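The formula to calculate Cronbach’s alpha is as follows:
$$\alpha = \frac{N \cdot \bar{r}}{1 + (N - 1) \cdot \bar{r}}$$
where $N$ is the number of questions and $\bar{r}$ is the mean correlation between pairs of questions.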
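To get a feel for the formula, it can help to plug in a few numbers. The sketch below assumes the correlation-based form α = N·r / (1 + (N − 1)·r), which is the expression used by the from-scratch function later in this article. The helper name `standardized_alpha` and the mean correlation of 0.5 are made up for illustration.

```python
def standardized_alpha(n_items: int, mean_r: float) -> float:
    """Cronbach's alpha from the number of items and their mean correlation."""
    return (n_items * mean_r) / (1 + (n_items - 1) * mean_r)


# With the mean correlation held at 0.5, alpha rises as questions are added.
for n_items in (3, 5, 10):
    print(n_items, round(standardized_alpha(n_items, 0.5), 3))
# 3 0.75
# 5 0.833
# 10 0.909
```

This shows why alpha alone should be read with care: a longer survey can score higher even when its questions are no more strongly correlated with each other.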
We can implement Cronbach’s alpha using the pingouin library or by writing our own function from scratch, without the library.
Using the pingouin library
We can calculate Cronbach’s alpha using a library named pingouin. For that, we have to install it first. We can use the following command to install it:
pip install pingouin
Let's look at the code below:
```python
# Importing libraries
import pandas as pd
import pingouin as pg

# Enter survey responses of a product as a DataFrame
data = pd.DataFrame({
    'P1': [1, 2, 2, 3, 1, 2, 3, 3, 2, 3],
    'P2': [1, 1, 1, 2, 1, 3, 2, 3, 3, 3],
    'P3': [1, 1, 2, 3, 1, 3, 3, 3, 2, 3],
})

# View the above DataFrame
print(data)

# Calling cronbach_alpha to calculate reliability
# (printed explicitly so the result also shows when run as a script)
print(pg.cronbach_alpha(data=data))
```
We use the Pandas library to build the data frame.
Note: The function returns the alpha value together with an array. The output array represents the confidence interval’s lower and upper bounds (pingouin uses a 95% interval by default). If we repeat our test, we can expect the estimate to fall between these numbers with a reasonable level of certainty. A confidence interval is formed by the mean of our estimate plus and minus the range of that estimate.
Implementing the function from scratch
Let's look at the code below:
```python
# Importing libraries
import pandas as pd
import numpy as np

def cronbach_alpha(data):
    # Transform the data frame into a correlation matrix
    df_corr = data.corr()

    # Calculate N
    # The number of variables is equal to the number of columns in the dataframe
    N = data.shape[1]

    # Calculate r
    # For this, we'll loop through all the columns and append every
    # relevant correlation to an array called 'rs'. Then, we'll
    # calculate the mean of 'rs'.
    rs = np.array([])
    for i, col in enumerate(df_corr.columns):
        # Entries below the diagonal, so each pair of columns is counted once
        below_diagonal = df_corr[col].iloc[i + 1:].values
        rs = np.append(rs, below_diagonal)
    mean_r = np.mean(rs)

    # Use the formula to calculate Cronbach's Alpha
    alpha = (N * mean_r) / (1 + (N - 1) * mean_r)
    return alpha

# Calling the function to calculate the value of Cronbach's alpha
# (`data` is the survey DataFrame created in the pingouin example)
print(cronbach_alpha(data))
```
We use NumPy to operate on arrays and Pandas to manipulate tabular data.
The value of Cronbach’s alpha on our survey is 0.8960, so we can say that the internal consistency of this survey is “Good.”
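The from-scratch function above works from correlations, which gives what is often called the standardized alpha. Another common form works from variances instead: α = k/(k − 1) · (1 − Σ item variances / variance of the total score). The sketch below is an assumption-laden illustration of that variant, and the name `raw_cronbach_alpha` is hypothetical. To my understanding this variance-based form is the one pingouin computes, which would explain a small difference between the two results.

```python
import pandas as pd


def raw_cronbach_alpha(items: pd.DataFrame) -> float:
    """Variance-based Cronbach's alpha for a respondents-by-items DataFrame."""
    k = items.shape[1]
    # Sample variance of each question (column)
    item_variances = items.var(axis=0, ddof=1)
    # Sample variance of each respondent's total score
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)


# Same survey responses as in the examples above
data = pd.DataFrame({
    'P1': [1, 2, 2, 3, 1, 2, 3, 3, 2, 3],
    'P2': [1, 1, 1, 2, 1, 3, 2, 3, 3, 3],
    'P3': [1, 1, 2, 3, 1, 3, 3, 3, 2, 3],
})

print(round(raw_cronbach_alpha(data), 4))  # 0.8931
```

Both values land in the same “Good” band for this survey, so the interpretation does not change here.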