Preparing to Scrape
Let's see how we can scrape a webpage.
We'll cover the following
Figure out the purpose of scraping
Before we can start scraping, we need to figure out what we want to do.
We will be using my blog for this example. To download the HTML that we need to parse, we can use Python's urllib2 module (urllib.request in Python 3) or the requests library. For this example, we'll be using requests.
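As a minimal sketch of the download step with requests, something like the following could work. The lesson does not give the blog's address, so BLOG_URL is a placeholder, and fetch_html is a hypothetical helper name used only for illustration.

```python
import requests

# Placeholder address (an assumption): replace it with the blog's real URL.
BLOG_URL = "https://example.com/"


def fetch_html(url, timeout=10):
    """Download a page and return its HTML as a string (hypothetical helper)."""
    # A timeout keeps the script from hanging if the server never answers.
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx responses so we never parse an error page.
    response.raise_for_status()
    return response.text
```

Calling fetch_html(BLOG_URL) should return the page's HTML as a single string, which is the input the parsing step needs.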
Most websites nowadays have pretty complex HTML. Fortunately, most browsers provide tools for inspecting a page's elements. For example, if we open the blog in Chrome, we can right-click on any of the article titles and click the Inspect menu option (see below):