Narrowing in on the Data
Understand how to filter and extract meaningful data from HTML tables during web scraping in Python. This lesson helps you identify relevant table rows, handle blank cells and unusual characters, and organize the scraped data into structured lists for further analysis.
We'll cover the following...
Getting exact rows
You are homing in on the information that you want. Let’s take a look at where you are:
The <tr> tag (which means table row) appears first, so the rows are where you can start filtering out the blanks. The next step is to print the length of each tr:
table = soup.find('table', class_="large tablesorter")
for row in table:
    for tr in row:
        print(len(tr), tr)
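If you want to try the length check without the lesson's page, here is a minimal, self-contained sketch. The HTML in SAMPLE_HTML, the helper row_lengths, and the cutoff n > 0 are all made up for illustration; the real table will have different rows and may need a different cutoff. It relies on the fact that calling len() on a Beautiful Soup tag counts that tag's direct children, so an empty <tr> reports 0.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the lesson's page: a header row,
# one blank spacer row, and two data rows.
SAMPLE_HTML = """
<table class="large tablesorter">
<tr><th>Rank</th><th>Name</th></tr>
<tr></tr>
<tr><td>1</td><td>Alice</td></tr>
<tr><td>2</td><td>Bob</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", class_="large tablesorter")


def row_lengths(table):
    """Return (child count, cell texts) for every <tr> in the table."""
    result = []
    for tr in table.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        # len(tr) is the number of direct children of the <tr> tag.
        result.append((len(tr), cells))
    return result


lengths = row_lengths(table)
for n, cells in lengths:
    print(n, cells)

# Assumed cutoff for this sample only: a row with no children is blank.
data_rows = [cells for n, cells in lengths if n > 0]
```

In this sample the blank row is the only one with a length of 0, so a simple comparison is enough to drop it. On a real page, whitespace between tags also counts as a child, so print the lengths first and pick the cutoff from what you see.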
Again, looking at the ...