What are datasets in ML?

Popular datasets

Some of the popular datasets used in applications of machine learning, deep learning, and data science are listed below:

MNIST dataset

This dataset contains 70,000 grayscale images of handwritten digits (0 to 9), split into 60,000 training examples and 10,000 test examples. Each image is 28×28 pixels. It is a common starting point for learning image classification and simple pattern recognition.

The dataset can be found here: http://yann.lecun.com/exdb/mnist/

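
To make the file layout concrete, here is a minimal sketch of reading MNIST's IDX binary format using only the Python standard library. It assumes the usual layout (a 4-byte header holding a type code and the number of dimensions, followed by one big-endian 32-bit size per dimension, then the raw bytes). The function name `parse_idx` and the tiny synthetic sample are illustrative, not part of the dataset. The real files are distributed gzip-compressed, so in practice the bytes would likely come from something like `gzip.open(path, "rb").read()`.

```python
import struct


def parse_idx(data: bytes):
    """Parse an IDX byte string into (shape, flat list of unsigned-byte values)."""
    # Header: two zero bytes, a type code (0x08 = unsigned byte), number of dimensions.
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    shape = struct.unpack(f">{ndim}I", data[4:4 + 4 * ndim])
    values = list(data[4 + 4 * ndim:])
    # Sanity check: the payload must contain exactly prod(shape) bytes.
    expected = 1
    for size in shape:
        expected *= size
    if len(values) != expected:
        raise ValueError("payload length does not match header")
    return shape, values


# Synthetic stand-in for a label file: one dimension, three labels (7, 2, 1).
sample = struct.pack(">HBBI", 0, 0x08, 1, 3) + bytes([7, 2, 1])
shape, labels = parse_idx(sample)
print(shape, labels)
```

For an image file the shape would have three dimensions (number of images, rows, columns), and each group of rows × columns values would form one digit image.
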
Sentiment140

This dataset contains 1,600,000 tweets, each with six fields, labeled by sentiment polarity. We can use it for sentiment analysis and other natural language processing tasks.

The dataset can be found here: https://www.kaggle.com/datasets/kazanova/sentiment140

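
As a rough illustration of working with the six fields, the sketch below reads a headerless CSV and maps the polarity code to a readable label. The column names and the 0/2/4 polarity coding are assumptions based on the dataset's Kaggle description, and the two sample rows are made up. The real file is reportedly not UTF-8, so it may need to be opened with `encoding="latin-1"`.

```python
import csv
import io
from collections import Counter

# Assumed column order; the CSV file itself has no header row.
COLUMNS = ["target", "ids", "date", "flag", "user", "text"]
# Assumed polarity coding: 0 = negative, 2 = neutral, 4 = positive.
POLARITY = {"0": "negative", "2": "neutral", "4": "positive"}


def read_tweets(handle):
    """Yield one dict per tweet from an open, headerless CSV handle."""
    for row in csv.reader(handle):
        record = dict(zip(COLUMNS, row))
        record["sentiment"] = POLARITY.get(record["target"], "unknown")
        yield record


# Two made-up rows standing in for the real file.
sample = io.StringIO(
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","so tired today"\n'
    '"4","2","Mon Apr 06 22:19:49 PDT 2009","NO_QUERY","user_b","great, sunny morning"\n'
)
tweets = list(read_tweets(sample))
counts = Counter(t["sentiment"] for t in tweets)
print(counts)
```

Using `csv.reader` rather than splitting on commas matters here, because tweet text often contains commas inside the quoted field.
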
Credit card fraud detection

This dataset contains 284,807 credit card transactions, each labeled as fraudulent or legitimate. We can use this dataset to build a model for detecting fraudulent activity.

The dataset can be found here: https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

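
Before building a fraud model, it usually helps to check how many transactions carry the positive label, since fraud data tends to be heavily imbalanced. The sketch below counts labels in a CSV with a header row. The label column name `Class` (1 = fraud, 0 = legitimate) is an assumption based on the Kaggle file, and the four sample rows are invented for illustration.

```python
import csv
import io


def fraud_rate(handle, label_column="Class"):
    """Return (fraud_count, total, rate) for a CSV whose label column is 0/1."""
    total = 0
    frauds = 0
    for row in csv.DictReader(handle):
        total += 1
        if int(row[label_column]) == 1:
            frauds += 1
    rate = frauds / total if total else 0.0
    return frauds, total, rate


# Made-up rows; the real file has many more feature columns.
sample = io.StringIO(
    "Time,V1,Amount,Class\n"
    "0,-1.36,149.62,0\n"
    "1,1.19,2.69,0\n"
    "2,-0.97,378.66,0\n"
    "3,-2.31,0.00,1\n"
)
frauds, total, rate = fraud_rate(sample)
print(f"{frauds} of {total} transactions are labeled fraud ({rate:.1%})")
```

When the positive class is rare, plain accuracy can look high even for a model that never predicts fraud, so metrics such as precision and recall are generally more informative.
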
IRIS dataset

This dataset contains sepal and petal length and width measurements of iris flowers. It includes three classes (species) with 50 entries each. We use this dataset for learning classification and pattern recognition.

The dataset can be found here: https://archive.ics.uci.edu/ml/datasets/Iris
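
As a small sketch of how the Iris data might be loaded and inspected, the code below parses rows of four numeric measurements followed by a class name and counts the entries per class. It assumes a headerless, comma-separated layout like the UCI `iris.data` file; the function names are illustrative, and only a handful of rows are included here rather than all 150.

```python
import csv
import io
from collections import defaultdict


def load_iris_rows(handle):
    """Return a list of (measurements, species) pairs from a headerless CSV."""
    samples = []
    for row in csv.reader(handle):
        if not row:  # skip blank lines, e.g. at the end of the file
            continue
        *values, species = row
        samples.append((tuple(float(v) for v in values), species))
    return samples


def count_by_class(samples):
    """Count how many samples belong to each species."""
    counts = defaultdict(int)
    for _, species in samples:
        counts[species] += 1
    return dict(counts)


# A few rows in the assumed file format: sepal length, sepal width,
# petal length, petal width, species.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "\n"
)
samples = load_iris_rows(sample)
counts = count_by_class(samples)
print(counts)
```

On the full file, each of the three classes would be expected to show a count of 50.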
