Narrowing in on the Data

Drill down into the table and turn table rows into lists.

Getting exact rows

You are homing in on the information that you want. Let’s take a look at where you are:

import requests
from bs4 import BeautifulSoup


def scrape_website(address: str) -> str:
    """
    Scrape the properties website and return the response text
    :param address: URL of website to scrape
    :return: str as response.text
    """
    # Send a browser-like user agent with the request
    headers = {'user-agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:74.0) Gecko/20100101 Firefox/74.0"}
    r = requests.get(address, headers=headers)
    return r.text


url = "https://www.engineeringtoolbox.com/properties-aluminum-pipe-d_1340.html"
website_text = scrape_website(url)
soup = BeautifulSoup(website_text, 'lxml')

# Walk the children of the table and print each item with its length
table = soup.find('table', class_="large tablesorter")
for row in table:
    for index, tr in enumerate(row):
        print(len(tr), tr)

Since the <tr> tag (which means table row) shows up first in the output, you can start homing in on how to filter out the blanks. The next step is to print the length of each tr:

table = soup.find('table', class_="large tablesorter")
for row in table:
    for tr in row:
        print(len(tr), tr)
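
To see why the lengths help separate blanks from real rows, here is a minimal offline sketch. It assumes a tiny made-up table (sample_html is hypothetical, not the real page) and uses the built-in "html.parser" instead of lxml so it runs without extra installs. The names children, blanks, and rows are illustrative only.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical stand-in for the scraped page, so the sketch runs offline.
sample_html = """
<table class="large tablesorter">
<tr><th>Size</th><th>OD</th></tr>
<tr><td>1/2</td><td>0.840</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, "html.parser")
table = soup.find("table", class_="large tablesorter")

# Iterating over a tag yields its direct children: tags and strings alike.
children = list(table.children)
for child in children:
    # len() of a string is its character count; len() of a tag is its child count.
    print(type(child).__name__, len(child), repr(child))

# Whitespace-only strings between the rows are the "blanks".
blanks = [c for c in children if isinstance(c, NavigableString) and not c.strip()]

# The actual table rows are the <tr> tags.
rows = [c for c in children if isinstance(c, Tag) and c.name == "tr"]

# Each row can then become a plain list of its cell text.
row_lists = [[cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])] for tr in rows]
print(row_lists)
```

In this sketch the blanks are the newline strings that sit between the tags, while each tr reports a length equal to the number of cells it holds. Checking the type with isinstance is one way to tell the two apart. The real page may nest its rows differently, so treat this as an illustration rather than a drop-in filter.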

Again, ...