Extract Book Categories

Learn how to scrape the book categories from our target “Books to Scrape” website.

Categories are listed in the side panel of the home page. To scrape these categories, we first need to understand how they are structured in the HTML.

Investigating the HTML structure

The easiest way to locate elements in the HTML structure is to use the browser's built-in developer tools. Let's use them to find the category list quickly, as shown below:

[Figure: Right click and select inspect]

Now we know that the category list is created as an unordered list (ul) in the DOM tree. If we select this ul element, we get access to all its children, which are li elements containing the category names. So, our next step is to define a selector for it. We can either write one ourselves, using what we know about the structure, or let the browser generate one.
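To make the ul and li relationship concrete, here is a minimal sketch that walks such a list using only Python's standard library. The markup in SAMPLE_HTML is a simplified, hypothetical stand-in for the side panel (the class name, links, and category names are assumptions, not copied from the site), and CategoryParser and extract_categories are illustrative names rather than part of any library.

```python
from html.parser import HTMLParser

# Hypothetical, simplified markup: an unordered list whose li children
# hold the category names. The real page may differ.
SAMPLE_HTML = """
<div class="side_categories">
  <ul>
    <li><a href="travel.html">Travel</a></li>
    <li><a href="mystery.html">Mystery</a></li>
    <li><a href="poetry.html">Poetry</a></li>
  </ul>
</div>
"""


class CategoryParser(HTMLParser):
    """Collect the text of every <a> found inside an <li> of a <ul>."""

    def __init__(self):
        super().__init__()
        self.ul_depth = 0     # number of <ul> elements we are currently inside
        self.li_depth = 0     # number of <li> elements we are currently inside
        self.in_link = False  # True while inside an <a> within an <li>
        self.categories = []

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.ul_depth += 1
        elif tag == "li" and self.ul_depth > 0:
            self.li_depth += 1
        elif tag == "a" and self.li_depth > 0:
            self.in_link = True
            self.categories.append("")  # start a new category name

    def handle_endtag(self, tag):
        if tag == "ul" and self.ul_depth > 0:
            self.ul_depth -= 1
        elif tag == "li" and self.li_depth > 0:
            self.li_depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        # Only keep text that appears inside a link within a list item.
        if self.in_link:
            self.categories[-1] += data


def extract_categories(html):
    parser = CategoryParser()
    parser.feed(html)
    return [name.strip() for name in parser.categories if name.strip()]


print(extract_categories(SAMPLE_HTML))
```

The parser only tracks whether it is currently inside a ul, an li, and a link, which mirrors the parent and child relationship we saw in the dev tools. A selector expresses the same relationship in a single line, which is why we define one next.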

Define selector

Let’s apply the knowledge we gathered to develop a selector ourselves. The diagram below shows how the elements are structured in the DOM ...