You will learn to:
Apply the basics of NLP.
Clean the text data for NLP models.
Generate a dataset from PDF files.
Train, evaluate, and fine-tune NLP models.
Implement topic modeling using LDA and LSA models.
Skills
Natural Language Processing
Topic Modeling
Data Analysis
Data Visualization
Machine Learning
Prerequisites
Basic Python programming skills
Familiarity with NLP concepts
Basic understanding of machine learning
Technologies
Python
Gensim
Matplotlib
Project Description
Data with valuable insights can come in many forms, including unlabeled text data. In practice, text data often arrives in large volumes that are unlabeled, messy, and difficult to interpret. In natural language processing, topic modeling is an unsupervised statistical method for extracting abstract topics from a collection of documents.
In this project, we will follow step-by-step tutorials and explanations covering data collection, data conversion, data cleaning, and topic model implementation. We will use the Federal Open Market Committee resource materials as our dataset to demonstrate how real-world data can be much harder to work with than curated datasets. We will learn how to implement topic models such as latent Dirichlet allocation (LDA) and latent semantic analysis (LSA) using Python libraries like Gensim. This project will equip us with the basic knowledge and skills to collect, clean, and model text data from other sources of interest.
Project Tasks
1. Introduction
Task 0: Get Started
Task 1: Import the Libraries
2. Data Preprocessing
Task 2: Examine the Data
Task 3: Store PDFs as Text Files
Task 4: Process the Text Data
3. Model Implementation
Task 5: Analyze the Common Words
Task 6: Process the Data
Task 7: Fit the Model
4. Model Tuning
Task 8: Analyze the Model Output
Task 9: Compute Coherence Scores
Task 10: Analyze Coherence Scores and Tune the Models
5. Model Application
Task 11: Apply the LDA Model to the Dataset
Task 12: Execute Time Series EDA
Congratulations!