Project 2: Parsing data from an HTML file with Python and REGEX

Explore how to use Python's regular expressions to parse and extract tabular information from HTML files. This lesson helps you understand the basics of locating and retrieving specific data patterns, preparing you for real-world data scraping projects.

In this project, we want to extract tabular information from an HTML file (see below). Our goal is to extract the information between the <td> and </td> tags, except for the first numerical index (1..6).
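As a rough sketch of that goal, the snippet below pulls the cell text out of each table row with Python's re module and drops the leading index cell. The sample_html string, its column names, and the extract_rows helper are hypothetical stand-ins for illustration; the real data.html may be laid out differently (for example, tags with attributes), in which case the patterns would need adjusting.

```python
import re

# Hypothetical sample table; NOT the lesson's data.html, just a stand-in.
sample_html = """
<table>
  <tr><th>No</th><th>Name</th><th>City</th></tr>
  <tr><td>1</td><td>Alice</td><td>Paris</td></tr>
  <tr><td>2</td><td>Bob</td><td>Berlin</td></tr>
</table>
"""


def extract_rows(html):
    """Return the <td> contents of each row, without the first (index) cell.

    Assumes plain <tr> and <td> tags with no attributes.
    """
    records = []
    # Non-greedy match so each <tr>...</tr> pair is captured separately.
    for row in re.findall(r"<tr>(.*?)</tr>", html, flags=re.DOTALL):
        cells = re.findall(r"<td>(.*?)</td>", row, flags=re.DOTALL)
        if cells:  # header rows use <th>, so they yield no <td> cells
            records.append(cells[1:])  # skip the numerical index column
    return records


records = extract_rows(sample_html)
print(records)  # [['Alice', 'Paris'], ['Bob', 'Berlin']]
```

Working row by row, rather than filtering out every numeric cell, keeps numeric data in the other columns intact and removes only the index position.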

Consider the data.html file below:

<html>
<head>
<style>
table, th, td {
   
...