Scraping Wikipedia Using Selenium in Python

PROJECT


In this unguided project, we’ll scrape Wikipedia using the tools provided by the Selenium library in Python. We’ll practice fetching HTML data with several Selenium commands. Finally, we’ll learn to automate events on a web page.

You will learn to:

Understand the fundamentals of Selenium methods.

Automate the events on a web page using Selenium.

Use regex for text cleaning.

Create Python dictionaries from scraped data.

Skills

Web Scraping

Python Programming

HTML Elements

Prerequisites

Basic understanding of the Python language

Basic understanding of the Selenium library

Basic understanding of the Python regex library

Basic understanding of Python dictionaries

Technologies

CSS

HTML

Python

Selenium

Project Description

In this unguided project, we’ll use the Selenium library in Python to scrape data from Wikipedia, a free online encyclopedia. Throughout the project, we’ll use multiple Selenium commands to fetch HTML elements by the following attributes:

  • CSS class names
  • CSS IDs
  • HTML tag names
  • Link text
  • Text content
  • Nested CSS selectors
  • Attributes
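
As a rough illustration of the attributes listed above, here is a minimal sketch with one locator per attribute. It is not the project's solution. The selector values (mw-page-title-main, firstHeading, the "Talk" link, and so on) are assumptions about Wikipedia's current markup and may need adjusting, and the names LOCATORS, fetch_texts, and run_live are hypothetical helpers. The strategy strings are the values of Selenium 4's By constants, written out so the table can be inspected without a browser.

```python
# Locator strategies as plain strings. Each string equals the matching
# selenium.webdriver.common.by.By constant (e.g. By.CSS_SELECTOR == "css selector").
# The selector values are assumptions about Wikipedia's markup.
LOCATORS = {
    "css class name": ("class name", "mw-page-title-main"),
    "css id": ("id", "firstHeading"),
    "html tag name": ("tag name", "p"),
    "link text": ("link text", "Talk"),
    "text": ("xpath", "//a[text()='Talk']"),
    "nested css selector": ("css selector", "#mw-content-text p a"),
    "attribute": ("css selector", "input[name='search']"),
}


def fetch_texts(driver, locators=LOCATORS):
    """Return {label: [text of each matching element]} for every locator."""
    results = {}
    for label, (strategy, value) in locators.items():
        elements = driver.find_elements(strategy, value)
        results[label] = [element.text for element in elements]
    return results


def run_live(url="https://en.wikipedia.org/wiki/Web_scraping"):
    """Open a real browser; assumes Selenium 4 and a local Chrome install."""
    from selenium import webdriver  # imported lazily so the rest runs without Selenium

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return fetch_texts(driver)
    finally:
        driver.quit()  # always close the browser, even if a lookup fails
```

Calling run_live() opens the page and returns a dictionary keyed by the labels above. find_elements returns an empty list when nothing matches, so a stale selector shows up as an empty entry rather than an exception.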

Furthermore, we’ll use multiple Selenium events to automate interactions with the website. Finally, we’ll use Python’s regex library (re) to clean the text data.
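
The regex cleaning step might look something like the sketch below, which also stores the cleaned text in a Python dictionary. The sample strings are invented stand-ins for scraped text, and clean_text and build_sections are hypothetical helper names. The pattern assumes the noise to remove is bracketed markers such as [edit], [1], or [citation needed].

```python
import re


def clean_text(raw):
    """Remove bracketed markers like [1] or [edit], then collapse whitespace."""
    without_markers = re.sub(r"\[[^\]]*\]", "", raw)
    return re.sub(r"\s+", " ", without_markers).strip()


def build_sections(pairs):
    """Map each cleaned heading to its cleaned body text."""
    return {clean_text(heading): clean_text(body) for heading, body in pairs}


# Invented sample data standing in for (heading, paragraph) text from Selenium.
scraped = [
    ("History[edit]", "Example paragraph text.[1]\n   Second sentence.[citation needed]"),
]

sections = build_sections(scraped)
print(sections)  # {'History': 'Example paragraph text. Second sentence.'}
```

Note that this pattern removes any bracketed span, including legitimate bracketed text, so a stricter pattern may be preferable for some pages.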

Project Tasks

1. Initial Setup

Task 1: Get Started

Task 2: Navigate to the Web Page

2. Scrape the Data

Task 3: Perform the Search Operation

Task 4: Fetch an Element Using Link Text

Task 5: Fetch Elements Using the Tag Name

Task 6: Fetch Nested Elements
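
As a hedged sketch of how Tasks 3 and 4 might fit together, the function below types a query into the search box, submits it, and then clicks a link by its text. The function name search_and_open is hypothetical, and the search box being named "search" is an assumption about Wikipedia's markup. The strategy strings "name" and "link text" are the values of Selenium 4's By.NAME and By.LINK_TEXT.

```python
def search_and_open(driver, query, link_text):
    """Type a query into the search box, submit it, then click a link by its text."""
    # Assumes the page's search input has name="search".
    search_box = driver.find_element("name", "search")
    search_box.clear()
    search_box.send_keys(query)
    search_box.submit()  # submits the form that contains the search box

    # Fetch the link whose visible text matches exactly, then click it.
    driver.find_element("link text", link_text).click()
    return driver.title
```

With a real driver created by webdriver.Chrome() and pointed at https://en.wikipedia.org, a call such as search_and_open(driver, "Selenium", "History") would run the sequence. On a live page, an explicit wait before the second lookup may be needed so that the results have time to load.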

