Statistics
The statistics questions and answers in this lesson will help you understand the types of statistics questions you can expect in data science interviews.
What is the difference between overfitting and underfitting?
Model fitting is often thought of as a machine-learning concept, but it is an older statistical idea that machine learning relies on heavily. To build a model, we typically split the data set into two parts: a training set and a test set. The test set stands in for new, unseen data, so you will often see the two terms used interchangeably. We fit the model on the training set and evaluate it on the test set.
An overfitted model is tailored too closely to the training set: it captures noise along with the underlying pattern, so it achieves high accuracy on the training set but performs poorly on the test set. An underfitted model is too simple to capture the underlying pattern, so its accuracy is low on the training set and on the test set alike.
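To make the contrast concrete, here is a minimal sketch that fits polynomials of increasing degree to noisy samples and compares training and test error. The sine-shaped target, the noise level, the sample sizes, and the degrees 1, 3, and 12 are all assumptions chosen for illustration, not part of the lesson.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical data: a sine curve plus Gaussian noise.
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=n)
    return x, y

x_train, y_train = make_data(15)    # small training set
x_test, y_test = make_data(200)     # stands in for new, unseen data

def mse(model, x, y):
    # Mean squared error of the model's predictions.
    return float(np.mean((model(x) - y) ** 2))

results = {}
for degree in (1, 3, 12):
    # Least-squares polynomial fit on the training set only.
    model = Polynomial.fit(x_train, y_train, deg=degree)
    results[degree] = (mse(model, x_train, y_train),
                       mse(model, x_test, y_test))
    print(f"degree {degree:2d}: train MSE = {results[degree][0]:.4f}, "
          f"test MSE = {results[degree][1]:.4f}")
```

Under these assumptions, the degree-1 model should show a comparatively high error on both sets (underfitting), while the degree-12 model should show a very low training error and a much higher test error (overfitting). The exact numbers depend on the random seed.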
Which would you select for a linear regression model: R-square or adjusted R-square?
R-square and adjusted R-square are two goodness-of-fit measures for linear regression. R-square never decreases when we add more independent variables to the model, even if those variables are irrelevant. Adjusted R-square penalizes the number of independent variables, so it increases only when a newly added variable improves the fit by more than would be expected by chance. ...
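The behavior of the two measures can be checked numerically. The sketch below computes both for an ordinary least-squares fit, using the standard definition adjusted R-square = 1 - (1 - R-square)(n - 1)/(n - p - 1), where n is the number of observations and p the number of independent variables. The helper name `r2_scores` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def r2_scores(X, y):
    """Return (R-square, adjusted R-square) for an OLS fit with intercept."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])        # add intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)            # unexplained variation
    ss_tot = float(np.sum((y - y.mean()) ** 2))      # total variation
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)   # penalize extra predictors
    return r2, adj

rng = np.random.default_rng(42)
n = 50
x1 = rng.normal(size=n)                 # a genuinely useful predictor
y = 2.0 * x1 + rng.normal(size=n)       # hypothetical response
noise_feature = rng.normal(size=n)      # unrelated to y by construction

r2_base, adj_base = r2_scores(x1.reshape(-1, 1), y)
r2_more, adj_more = r2_scores(np.column_stack([x1, noise_feature]), y)

print(f"one predictor : R-square = {r2_base:.4f}, adjusted = {adj_base:.4f}")
print(f"plus noise    : R-square = {r2_more:.4f}, adjusted = {adj_more:.4f}")
```

Adding the noise column cannot lower R-square, whereas adjusted R-square will typically stay flat or drop, because the small gain in fit does not offset the penalty for the extra variable.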