urllib.robotparser
Let's explore urllib.robotparser and its use.
Overview
The robotparser module is made up of a single class, RobotFileParser. This class answers questions about whether a specific user agent can fetch a URL on a site that has published a robots.txt file. The robots.txt file tells a web scraper or robot which parts of the server should not be accessed. Let’s take a look at a simple example using ArsTechnica’s website:
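The lesson's own example is not included here, so what follows is only a minimal sketch of how RobotFileParser is typically used. The robots.txt rules in SAMPLE_ROBOTS_TXT are invented for illustration and are not the file ArsTechnica actually publishes. The helper names build_parser and fetch_parser are also hypothetical. The sketch parses the sample text locally so that it runs without network access; fetch_parser shows the calls you would likely use against a live site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for illustration only.
# It is NOT the real file served by arstechnica.com.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Return a parser loaded from robots.txt text (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def fetch_parser(robots_url: str) -> RobotFileParser:
    """Return a parser loaded from a live robots.txt URL (needs network access)."""
    parser = RobotFileParser()
    parser.set_url(robots_url)
    parser.read()  # downloads and parses the file at robots_url
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Ask whether any user agent ("*") may fetch each URL under the sample rules.
allowed_home = parser.can_fetch("*", "https://arstechnica.com/")
allowed_private = parser.can_fetch("*", "https://arstechnica.com/private/page.html")

print(allowed_home, allowed_private)
```

To check the real site instead, you could call fetch_parser("https://arstechnica.com/robots.txt") and pass the returned parser's can_fetch method the user agent and URL you care about; the answers will depend on whatever rules the site publishes at that moment.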