Solution Review: Scrape the Web Page Using Beautiful Soup
Review the solution for the book information scraping task.
Solution
We start by inspecting the web page and finding the elements we want.
import requests
from requests.compat import urljoin
from bs4 import BeautifulSoup

base_url = "https://books.toscrape.com/"

titles = []
images = []
rates = []
prices = []

# Solution
response = requests.get(base_url)
soup = BeautifulSoup(response.content, 'html.parser')

articles = soup.find_all("article", {"class":"product_pod"})
for article in articles:
    image = urljoin(base_url, article.find("div", {"class":"image_container"}).a.img['src'])
    rate = article.find("p", {"class":"star-rating"})['class'][1]
    title = article.find("h3").a['title']
    price = article.find("div", {"class":"product_price"}).p.string
    titles.append(title)
    images.append(image)
    rates.append(rate)
    prices.append(price)

print("Length of scraped titles: ", len(titles))
print("Length of scraped images: ", len(images))
print("Length of scraped rates: ", len(rates))
print("Length of scraped prices: ", len(prices))
print(titles)
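One detail in the loop above is the call to urljoin(): it suggests the src attributes on the page are relative paths that have to be resolved against base_url. The sketch below shows how that resolution behaves. It imports urljoin from the standard library (requests.compat re-exports the same function), and the image paths are hypothetical placeholders, not paths taken from the site.

```python
from urllib.parse import urljoin  # requests.compat.urljoin is this same function

base_url = "https://books.toscrape.com/"

# Hypothetical relative path, standing in for an <img src="..."> value.
relative_src = "media/cache/example.jpg"
absolute_src = urljoin(base_url, relative_src)
print(absolute_src)

# A relative path with ".." is resolved against the directory of the base URL.
nested_src = urljoin(base_url + "catalogue/page-2.html", "../media/cache/example.jpg")
print(nested_src)

# A value that is already absolute is returned unchanged.
external_src = urljoin(base_url, "https://example.com/cover.jpg")
print(external_src)
```

Using urljoin() rather than string concatenation keeps the result correct when the page being scraped sits in a subdirectory.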
Code explanation
Lines 13–14: We request the site URL using requests.get() and pass response.content to BeautifulSoup().
...
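To try the parsing step without a network request, here is a minimal sketch that feeds BeautifulSoup() a hand-written bytes fragment in place of response.content. The helper name parse_books and the sample card (title, price, image path) are assumptions made for illustration. Only the tag and class names are taken from the selectors used in the solution.

```python
from urllib.parse import urljoin  # same function as requests.compat.urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://books.toscrape.com/"

# Stand-in for response.content: one simplified product card.
# The book title, price, and paths are made-up placeholders.
SAMPLE_HTML = b"""
<html><body>
<article class="product_pod">
  <div class="image_container">
    <a href="catalogue/sample-book_1/index.html">
      <img src="media/cache/sample.jpg" alt="Sample Book">
    </a>
  </div>
  <p class="star-rating Three"></p>
  <h3><a href="catalogue/sample-book_1/index.html" title="Sample Book">Sample ...</a></h3>
  <div class="product_price">
    <p class="price_color">&pound;10.00</p>
  </div>
</article>
</body></html>
"""


def parse_books(content, base_url=BASE_URL):
    """Hypothetical helper: return one dict per product card in the HTML."""
    soup = BeautifulSoup(content, "html.parser")
    books = []
    for article in soup.find_all("article", {"class": "product_pod"}):
        image_src = article.find("div", {"class": "image_container"}).a.img["src"]
        books.append({
            "title": article.find("h3").a["title"],
            "image": urljoin(base_url, image_src),
            # 'class' is a list such as ["star-rating", "Three"]
            "rate": article.find("p", {"class": "star-rating"})["class"][1],
            "price": str(article.find("div", {"class": "product_price"}).p.string),
        })
    return books


books = parse_books(SAMPLE_HTML)
print(books)
```

Against the live page, the same helper could be called as parse_books(response.content) after requests.get(BASE_URL). Calling response.raise_for_status() first would surface HTTP errors before parsing.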