PROJECT
Web Scraping Using Selenium in Python
In this project, we’ll scrape Wikipedia using the tools provided by the Selenium library in Python. We’ll practice fetching data as HTML elements using several Selenium commands. Lastly, we’ll learn to automate events on a web page.
You will learn to:
Understand the fundamentals of Selenium methods.
Automate events on a web page using Selenium.
Use regex for text cleaning.
Create Python dictionaries from scraped data.
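As a rough illustration of the last two objectives, the sketch below turns a piece of text into a Python dictionary of word counts. The `word_counts` helper, the tiny `STOP_WORDS` set, and the sample sentence are all assumptions made for this example and are not taken from the project itself.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (an assumption); a real project would
# likely use a fuller list.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def word_counts(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words, count the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return dict(Counter(t for t in tokens if t not in STOP_WORDS))


# Hypothetical sample text standing in for scraped page content.
sample = "The encyclopedia is free, and the encyclopedia is online."
print(word_counts(sample))  # {'encyclopedia': 2, 'free': 1, 'online': 1}
```

The same function could be applied to the text extracted from any scraped element.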
Skills
Web Scraping
Python Programming
HTML Elements
Prerequisites
Basic understanding of the Python language
Basic understanding of the Selenium library
Basic understanding of the Python regex library
Basic understanding of Python dictionaries
Technologies
CSS
HTML
Python
Selenium
Project Description
In this project, you’ll use the Selenium library in Python to scrape data from a website. You’ll scrape the data from Wikipedia, the free online encyclopedia.
Throughout this project, you’ll use multiple Selenium commands to fetch HTML elements. You’ll locate elements using the following strategies:
- CSS class names
- CSS IDs
- HTML tag names
- Link texts
- Texts
- Nested CSS selectors
- Attributes
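The list above can be sketched as a table of Selenium locators, one per strategy. This is a minimal sketch, not the project’s solution: the selector values, the `count_matches` and `run_against_wikipedia` helpers, and the example URL are assumptions, and Wikipedia’s markup may differ from what is shown. The strategy strings are the values of Selenium’s `By` constants (for example, `By.ID` is `"id"`).

```python
# Strategy name -> (locator strategy, example value).
# The strategy strings equal Selenium's By constants, e.g. By.CLASS_NAME == "class name".
# All selector values are hypothetical and may not match Wikipedia's current markup.
LOCATORS = {
    "CSS class name": ("class name", "mw-page-title-main"),
    "CSS ID": ("id", "firstHeading"),
    "HTML tag name": ("tag name", "p"),
    "Link text": ("link text", "Contents"),
    "Text": ("xpath", "//*[contains(text(), 'encyclopedia')]"),
    "Nested CSS selector": ("css selector", "div#bodyContent p > a"),
    "Attribute": ("css selector", "a[title='Main Page']"),
}


def count_matches(driver, locators=LOCATORS):
    """Return {strategy name: number of matching elements} for the current page."""
    return {
        name: len(driver.find_elements(by, value))
        for name, (by, value) in locators.items()
    }


def run_against_wikipedia(url="https://en.wikipedia.org/wiki/Web_scraping"):
    """Open a real browser; assumes Selenium 4 and a local Chrome install."""
    # Imported lazily so the locator table can be inspected without Selenium.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return count_matches(driver)
    finally:
        driver.quit()
```

Calling `run_against_wikipedia()` would launch Chrome and report how many elements each locator finds. A count of zero usually means the assumed selector needs adjusting for the page.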
Furthermore, you’ll use multiple Selenium events to automate the processes on this website. Finally, you’ll use the regex library to clean the text data.
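To give a sense of the text-cleaning step, here is a minimal sketch using Python’s built-in `re` module. It assumes the scraped text contains bracketed reference markers and stray whitespace, which is common for encyclopedia articles. The `clean_text` helper and the sample string are hypothetical.

```python
import re


def clean_text(raw):
    """Strip bracketed reference markers and collapse runs of whitespace."""
    # Remove markers such as [1], [ 23 ], or [citation needed].
    without_refs = re.sub(r"\[\s*(?:\d+|citation needed)\s*\]", "", raw)
    # Collapse newlines, tabs, and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", without_refs).strip()


# Hypothetical sample standing in for text extracted from an element.
print(clean_text("Web scraping[1] extracts data\n from websites.[citation needed]"))
# Web scraping extracts data from websites.
```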
Project Tasks
1. Initial Setup
Task 1: Get Started
Task 2: Navigate to the Web Page
2. Scrape the Data
Task 3: Fetch an Element Using ID
Task 4: Fetch an Element Using Its Class Name
Task 5: Switch to the New Window
Task 6: Fetch an Element Using Link Text
Task 7: Fetch Elements By a Tag Name
Task 8: Extract the Text from Elements
Task 9: Remove Stop Words from the Text
Task 10: Fetch Nested Elements
Congratulations!
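As a hedged illustration of Task 5, the sketch below moves the driver’s focus to a window that was not open before some action. It relies only on the `window_handles` property and `switch_to.window()` from Selenium 4. The `switch_to_new_window` helper is a hypothetical name, not part of Selenium or of the project’s solution.

```python
def switch_to_new_window(driver, handles_before):
    """Switch focus to the first window handle not present in handles_before.

    Returns the handle that was switched to, or raises RuntimeError if no
    new window has appeared.
    """
    for handle in driver.window_handles:
        if handle not in handles_before:
            driver.switch_to.window(handle)
            return handle
    raise RuntimeError("no new window was opened")


# Typical use with a real Selenium 4 driver (not run here):
#   before = set(driver.window_handles)
#   link.click()  # any action that opens a new window or tab
#   switch_to_new_window(driver, before)
```

Recording the handles before the click avoids guessing which entry in `window_handles` is the new one.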