Scraping Domains
In this lesson, you will build an automated scraper that stores domain name data in a CSV file.
The website namebio.com will be scraped in this lesson. This site contains domain name information such as when a domain was sold, for what price, and where it is parked. In free mode, we can look at one hundred data points across four columns: the domain name, the sale price, the sale date, and the site where the domain is parked. The data is displayed in the form of a table, as in the following image.
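Since the scraped data will end up in a CSV file with these four columns, it helps to see that step in isolation first. The sketch below uses Python's standard csv module; the sample row values are hypothetical, not real namebio.com data:

```python
import csv

# The four column names used throughout this lesson
col = ['Name', 'Price', 'Date', 'Parked_At']

# Hypothetical example row -- real values come from the scraped table
row = ['example.com', '1,500', '2021-05-01', 'GoDaddy']

# newline='' prevents blank lines between rows on Windows
with open('Domain.csv', 'w', newline='') as csvFile:
    writer = csv.writer(csvFile)
    writer.writerow(col)
    writer.writerow(row)
```

Opening the file once for the header and later in append mode (`'a'`) for each scraped row, as the full scraper below does, keeps the header from being rewritten on every pass.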
Writing the scraper
As can be seen in the above image, to get to the next table of domains, the next-page button needs to be clicked, and this is where the real power of Selenium comes into play. Any button on the webpage can be clicked simply by finding it with its XPath and calling the click() method on it. Let's see this in action.
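Before clicking anything, each row's text has to be turned into the four values we store. Selenium returns a row as a single string, which the scraper splits on spaces and then trims by dropping the currency token. That step can be isolated as a plain function; the sample row text below is hypothetical, shaped like a namebio.com table row:

```python
def parse_row(text):
    """Split a row's text into [name, price, date, parked_at].

    Assumes the row text is space-separated with a currency token
    (e.g. 'USD') in the third position, as in namebio.com's table.
    """
    parts = text.split(' ')
    del parts[2:3]  # drop the currency token
    return parts

# Hypothetical row text for illustration
print(parse_row('example.com 1,500 USD 2021-05-01 GoDaddy'))
# → ['example.com', '1,500', '2021-05-01', 'GoDaddy']
```

Keeping the parsing in one place like this also makes it easy to adjust if the site ever changes its row format.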
```python
# Import the required packages
from selenium import webdriver
import time
import csv

# Initialize the column names list
col = ['Name', 'Price', 'Date', 'Parked_At']

# Write the column name list to the CSV file
with open('Domain.csv', 'w', newline='') as csvFile:
    writer = csv.writer(csvFile)
    writer.writerow(col)

chrome_options = webdriver.ChromeOptions()
# chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-extensions")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--no-sandbox")

# Initialize the browser object
driver = webdriver.Chrome('<Provide the path of chromedriver>', options=chrome_options)

# URL to go to
url = "https://namebio.com/"

# Maximize the browser window
driver.maximize_window()

# Open the URL in the browser
driver.get(url)

# Counter to traverse the 10 tables
i = 0
while i < 10:
    # Get all elements with the table row tag using the xpath method
    trs = driver.find_elements_by_xpath('//*[@id="search-results"]/tbody/tr')
    # Traverse the rows
    for tr in trs:
        # Split each row's text and drop the currency token to get four values
        tr = str(tr.text).split(' ')
        del tr[2:3]
        # Append the list of values to the CSV file
        with open('Domain.csv', 'a', newline='') as csvFile:
            writer = csv.writer(csvFile)
            writer.writerow(tr)
    # When all rows are read, click the next arrow button to move to the next table
    driver.find_element_by_xpath('//*[@id="search-results_next"]/a').click()
    # Give a 2-second delay for the table to load
    time.sleep(2)
    # Increment the counter
    i += 1

# Close the browser
driver.close()
```
The above code automates the ...
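One detail worth noting: the fixed `time.sleep(2)` delay wastes time when the table loads faster and can fail when it loads slower. A common refinement is to poll for a condition with a timeout instead; below is a minimal pure-Python sketch of that idea (Selenium's own `WebDriverWait` applies the same pattern to page elements). The `table_loaded` stand-in is hypothetical:

```python
import time

def wait_until(condition, timeout=10.0, poll=0.5):
    """Repeatedly call condition() until it returns a truthy value
    or the timeout expires; return the value, or None on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(poll)
    return None

# Hypothetical stand-in for "the next table has finished loading":
# here the condition becomes true on the third check.
state = {'calls': 0}
def table_loaded():
    state['calls'] += 1
    return state['calls'] >= 3

print(wait_until(table_loaded, timeout=5.0, poll=0.01))  # → True
```

In the scraper, the condition would check for the presence of the new table rows rather than a counter, so the loop proceeds as soon as the page is ready.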