Gain insights into data science with easy-to-follow, hands-on explanations. Explore essential concepts quickly and efficiently, even without prior statistics knowledge, for a career boost.


Master the skills that can get you a $100K+ salary even if you bunked your statistics classes. 

No need to waste hours browsing from one article to the next, piecing together the information you need to grasp important topics. No need to get overwhelmed by information overload. Find easy-to-follow, hands-on, and fun explanations of all the essential topics in one place so you can quickly and efficiently learn what you need to thrive as a data scientist.

"Is this course right for me?" Continue to read to decide for yourself!

- "I want to understand this data science concept. Let me Google it." Then, after hours of surfing, reading random articles, and invoking the heavens, you are more confused than before.
- "Data science is the sexiest and highest-paying job of the 21st century. I want to become a data scientist too."
- "I have a basic knowledge of Python, willingness to learn, and commitment to become a great data scientist."

Is that you? If yes, you are at the right place.

Grokking Data Science

### Introduction

We have learned that probability gives us the percent chance of an event occurring. Now, what if we want an understanding of the probabilities of all the possible values in our experiment? This is where probability distributions come into play. 

A probability distribution is a function that represents the probabilities of all possible values. This is a very important concept in data science. By specifying the relative chance of all possible outcomes, probability distributions allow us to understand the underlying trends in our data. For example, if we have some missing values in our dataset, we can use a probability distribution to understand how our data is distributed and then replace the missing values with the most likely values.
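The missing-value idea above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended imputation method: the `shirt_sizes` data is made up for the example, and "most likely value" is taken to mean the most frequent observed value (the mode).

```python
from collections import Counter

# Hypothetical column of data; None marks a missing value.
shirt_sizes = ["M", "L", "M", None, "S", "M", "L", None, "M", "L"]

# Keep only the values that were actually observed.
observed = [s for s in shirt_sizes if s is not None]

# Empirical probability distribution: relative frequency of each observed value.
counts = Counter(observed)
distribution = {value: count / len(observed) for value, count in counts.items()}

# The most likely value is the one with the highest probability (the mode).
most_likely = max(distribution, key=distribution.get)

# Replace each missing entry with the most likely value.
filled = [s if s is not None else most_likely for s in shirt_sizes]

print(distribution)
print(most_likely)
print(filled)
```

Because the probabilities are relative frequencies of the observed values, they always add up to 1.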
 
### Random Variables
For the next couple of lessons, we are going to look at some of the most important probability distributions. But before we dive into probability distributions, we need to understand the different types of data we can encounter. 

A **Random Variable** is a variable whose value is the numerical outcome of a random experiment. Random variables can be either discrete or continuous:

- **Discrete Data** (a.k.a. discrete variables) can only take specified values. For example, when we roll a die, the possible outcomes are 1, 2, 3, 4, 5, or 6 and not 1.5 or 2.45.
- **Continuous Data** (a.k.a. continuous variables) can take any value within a range. This range can be finite or infinite. Continuous variables are measurements like height, weight, and temperature. 
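To make the distinction concrete, here is a small sketch using Python's standard `random` module. The 150 to 200 cm height range is an assumption chosen only for illustration.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same draws

# Discrete: a die roll can only be one of six specific values.
die_rolls = [random.randint(1, 6) for _ in range(5)]

# Continuous: a height (in cm) can be any real number within a range.
# The 150-200 cm range is an illustrative assumption.
heights_cm = [random.uniform(150.0, 200.0) for _ in range(5)]

print(die_rolls)   # whole numbers between 1 and 6
print(heights_cm)  # arbitrary decimal values between 150 and 200
```

The die rolls are always integers from 1 to 6, while the simulated heights can land anywhere in the range.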


### Types of Probability Distributions
Since probability distributions describe the distribution of the values of a random variable, the kind of variable determines the type of probability distribution we are dealing with. This means that probability distributions can be divided into the following two types:

- Discrete probability distributions for discrete variables, described by a probability mass function
- Continuous probability distributions for continuous variables, described by a probability density function
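The two types can be sketched with Python's standard library. The fair die and the normally distributed heights (mean 170 cm, standard deviation 10 cm) are assumptions made for this example, not values from the lesson.

```python
from fractions import Fraction
from statistics import NormalDist

# Discrete case: each face of a fair six-sided die has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
total_probability = sum(die_pmf.values())  # the probabilities sum to exactly 1

# Continuous case: heights assumed to follow a normal distribution
# with mean 170 cm and standard deviation 10 cm.
heights = NormalDist(mu=170, sigma=10)

# The density at a single point is not a probability by itself.
density_at_170 = heights.pdf(170)

# Probabilities come from the area under the density curve over an interval.
prob_165_to_175 = heights.cdf(175) - heights.cdf(165)

print(total_probability)
print(density_at_170)
print(prob_165_to_175)
```

For a discrete variable we can ask for the probability of one exact value, such as rolling a 3. For a continuous variable we ask for the probability of falling within an interval, such as a height between 165 and 175 cm.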





