What is the difference between micro and macro averaging?

Introduction

Micro averaging works at the level of individual predictions. The true positives, false positives, and false negatives of every class are summed, and the metric is computed once from those pooled totals.

Macro averaging works at the level of classes. The metric is computed separately for each class, and the per-class values are then averaged without any weighting.

The difference between macro and micro averaging is that macro averaging gives equal weight to each class, while micro averaging gives equal weight to each sample. If every class has the same number of samples, micro and macro recall are identical. Precision and F1 can still differ, because they also depend on how many predictions each class receives.
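
The effect of the two weightings is easiest to see on imbalanced data. The following is a minimal sketch in plain Python, assuming single-label predictions; the function name `micro_macro_recall` and the toy labels are made up for illustration and are not part of any library.

```python
from collections import Counter

def micro_macro_recall(y_true, y_pred):
    """Return (micro_recall, macro_recall) for single-label predictions."""
    # Number of samples that truly belong to each class.
    support = Counter(y_true)
    # Number of correctly predicted samples (true positives) per class.
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Micro: pool all samples, so every sample has the same weight.
    micro = sum(hits.values()) / len(y_true)
    # Macro: average the per-class recalls, so every class has the same weight.
    macro = sum(hits[c] / support[c] for c in support) / len(support)
    return micro, macro

# Hypothetical imbalanced data: 8 samples of class "a", 2 of class "b".
# The classifier predicts "a" for everything.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 10

micro, macro = micro_macro_recall(y_true, y_pred)
print(f"micro recall = {micro:.2f}, macro recall = {macro:.2f}")
# prints: micro recall = 0.80, macro recall = 0.50
```

The micro score stays high because the large class dominates the pooled counts, while the macro score drops because the completely missed class "b" counts as much as class "a".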

Calculation of micro and macro averaging

Let's suppose we have a multi-class classification system with three imbalanced classes. The worked examples below use the following per-class counts: true positives TP = 10, 8, 9; false positives FP = 2, 1, 2; false negatives FN = 3, 2, 3.

PrecisionMicroAvg $= \frac{TP_1 + TP_2 + \dots + TP_n}{TP_1 + TP_2 + \dots + TP_n + FP_1 + FP_2 + \dots + FP_n}$

PrecisionMicroAvg $= \frac{10 + 8 + 9}{10 + 8 + 9 + 2 + 1 + 2} = \frac{27}{32} \approx 0.84$

RecallMicroAvg $= \frac{TP_1 + TP_2 + \dots + TP_n}{TP_1 + TP_2 + \dots + TP_n + FN_1 + FN_2 + \dots + FN_n}$

RecallMicroAvg $= \frac{10 + 8 + 9}{10 + 8 + 9 + 3 + 2 + 3} = \frac{27}{35} \approx 0.77$
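
The micro-average formulas above can be checked with a few lines of Python. This is a minimal sketch: the counts are the ones substituted into the formulas (TP = 10, 8, 9; FP = 2, 1, 2; FN = 3, 2, 3), and the function names are illustrative rather than taken from a library.

```python
# Per-class counts for classes 1..3, as substituted into the formulas.
tp = [10, 8, 9]   # true positives
fp = [2, 1, 2]    # false positives
fn = [3, 2, 3]    # false negatives

def micro_precision(tp, fp):
    """Pooled true positives divided by pooled predicted positives."""
    return sum(tp) / (sum(tp) + sum(fp))

def micro_recall(tp, fn):
    """Pooled true positives divided by pooled actual positives."""
    return sum(tp) / (sum(tp) + sum(fn))

p = micro_precision(tp, fp)
r = micro_recall(tp, fn)
print(f"micro precision = {p:.3f}")
print(f"micro recall    = {r:.3f}")
```

Because the counts are summed before dividing, a class with many samples contributes more to the result than a class with few.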

Micro averaging

Micro average statistics evaluate models trained for multi-class classification problems.
We use micro averaging when we need an equal weight for each instance or prediction.
The micro average precision is the sum of true positives over all classes divided by the sum of predicted positives (true positives plus false positives) over all classes.
The micro average recall is the sum of true positives over all classes divided by the sum of actual positives (true positives plus false negatives) over all classes.

Macro averaging

Macro average statistics evaluate models trained for multi-class classification problems.
We use macro averaging when the classes are imbalanced (different numbers of instances are associated with different class labels) and every class should count equally regardless of its size.
The macro average is the arithmetic mean of the per-class precision, recall, or F1 scores.
We use macro average scores when all classes must be treated equally, so that the classifier's performance on rare classes counts as much as its performance on the most common class labels.
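
The macro average can be sketched in the same way. The example below reuses the per-class counts from the micro-average calculation earlier in this article (TP = 10, 8, 9; FP = 2, 1, 2; FN = 3, 2, 3). It assumes that macro F1 means the mean of the per-class F1 scores, and the helper `mean` is defined here only for illustration.

```python
# Per-class counts for classes 1..3 (same values as the micro-average example).
tp = [10, 8, 9]   # true positives
fp = [2, 1, 2]    # false positives
fn = [3, 2, 3]    # false negatives

def mean(values):
    """Unweighted arithmetic mean."""
    values = list(values)
    return sum(values) / len(values)

# Step 1: compute each metric separately for every class.
precisions = [t / (t + f) for t, f in zip(tp, fp)]
recalls = [t / (t + f) for t, f in zip(tp, fn)]
f1s = [2 * p * r / (p + r) for p, r in zip(precisions, recalls)]

# Step 2: average the per-class values, giving each class the same weight.
macro_precision = mean(precisions)
macro_recall = mean(recalls)
macro_f1 = mean(f1s)

print(f"macro precision = {macro_precision:.3f}")
print(f"macro recall    = {macro_recall:.3f}")
print(f"macro F1        = {macro_f1:.3f}")
```

Each class contributes exactly one third of every macro score here, no matter how many samples it has.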