What is the Bootstrap method in data science?

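Bootstrapping is a resampling technique for estimating a statistic and its variability. The statistic is computed on many samples drawn with replacement from the original data, and the resulting estimates are then summarized, for example by averaging them.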

Method

The bootstrap method involves repeatedly resampling a dataset with replacement. Instead of estimating a statistic only once on the complete data, we estimate it many times, each time on a resample drawn with replacement from the original sample (typically of the same size as the original). Repeating this yields a vector of estimates, from which we can compute the variance, expected value, empirical distribution, and other relevant statistics.
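
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `bootstrap_estimates`, the toy `sample` values, the choice of 1,000 resamples, and the fixed seed are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_estimates(data, statistic, n_resamples=1000, seed=0):
    """Return one estimate of `statistic` per bootstrap resample of `data`."""
    rng = random.Random(seed)  # fixed seed (assumption) so the run is repeatable
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = rng.choices(data, k=n)
        estimates.append(statistic(resample))
    return estimates


# Hypothetical toy sample, used only for illustration.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]

estimates = bootstrap_estimates(sample, statistics.mean)
boot_mean = statistics.mean(estimates)          # expected value of the estimates
boot_variance = statistics.variance(estimates)  # spread of the estimates

print(f"bootstrap mean: {boot_mean:.3f}")
print(f"bootstrap variance: {boot_variance:.4f}")
```

Any function that maps a list of observations to a number can be passed as `statistic`, for example `statistics.median` in place of `statistics.mean`.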

Uses

Bootstrapping allows statistical inferences about a population to be made from a single, often small, sample. It helps account for distortions caused by unrepresentative observations in the sample, which could otherwise give a misleading picture of the overall data.
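
One common form of such inference is a confidence interval read off the bootstrap estimates. The sketch below uses the percentile method, which is only one of several ways to build a bootstrap interval. The function name `percentile_interval`, the toy data, the 95% level, and the number of resamples are assumptions made for illustration.

```python
import random
import statistics


def percentile_interval(data, statistic, confidence=0.95, n_resamples=2000, seed=0):
    """Approximate a confidence interval for `statistic` via the percentile bootstrap."""
    rng = random.Random(seed)  # fixed seed (assumption) so the run is repeatable
    n = len(data)
    # Compute the statistic on each resample, then sort the estimates.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    alpha = (1 - confidence) / 2
    # Take the estimates at the lower and upper tail positions.
    lower = estimates[int(alpha * n_resamples)]
    upper = estimates[min(int((1 - alpha) * n_resamples), n_resamples - 1)]
    return lower, upper


# Hypothetical small sample, used only for illustration.
small_sample = [12, 15, 9, 21, 14, 18, 11, 16, 13, 40]

low, high = percentile_interval(small_sample, statistics.median)
print(f"approximate 95% interval for the median: [{low}, {high}]")
```

The width of the interval gives a rough sense of how much the estimate depends on the particular observations that ended up in the sample.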


Copyright ©2024 Educative, Inc. All rights reserved