Learn Topic Modeling with LDA and LSA Models

PROJECT


In this project, we’ll learn how to turn text documents into insights through topic modeling with LDA and LSA models.

You will learn to:

Exercise the basics of NLP.

Clean the text data for NLP models.

Generate a dataset from PDF files.

Train, evaluate, and fine-tune NLP models.

Implement topic modeling using LDA and LSA models.

Skills

Natural Language Processing

Topic Modeling

Data Analysis

Data Visualization

Machine Learning

Prerequisites

Basic Python programming skills

Familiarity with NLP concepts

Basic understanding of machine learning

Technologies

Python

Gensim

Matplotlib

Project Description

Valuable insights can come from data in many forms, including unlabeled text. In practice, text data often arrives in massive volumes that are unlabeled, messy, and difficult to decipher. In natural language processing, topic modeling is an unsupervised statistical method for extracting abstract topics from a collection of documents.

In this project, we will follow step-by-step tutorials and explanations on data collection, data conversion, data cleaning, and topic model implementation. We will use the Federal Open Market Committee resource materials as our dataset to demonstrate how real-world data can be much more difficult to work with than curated datasets. We will learn how to implement topic models such as latent Dirichlet allocation (LDA) and latent semantic analysis (LSA) using Python libraries such as Gensim. This project will equip us with the basic knowledge and skills to collect, clean, and tackle almost any text data of interest.

Project Tasks

1. Introduction

Task 0: Get Started

Task 1: Import the Libraries

2. Data Preprocessing

Task 2: Examine the Data

Task 3: Store PDFs as Text Files

Task 4: Process the Text Data

3. Model Implementation

Task 5: Analyze the Common Words

Task 6: Process the Data

Task 7: Fit the Model

4. Model Tuning

Task 8: Analyze the Model Output

Task 9: Compute Coherence Scores

Task 10: Analyze Coherence Scores and Tune the Models

5. Model Application

Task 11: Apply the LDA Model to Dataset

Task 12: Execute Time Series EDA

Relevant Courses

Use the following content to review prerequisites or explore specific concepts in detail.