...
Scraping Yahoo Finance with Selenium
Learn how to scrape financial data from Yahoo using Selenium.
Now that we know how Selenium works, let's use it to extract financial news data from Yahoo Finance. Yahoo Finance renders much of its content with JavaScript, so traditional scraping techniques that only fetch and parse the static HTML are ineffective.
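To make the problem concrete, here is a minimal sketch of what a static parser sees on a JavaScript-rendered page. The HTML string is a hypothetical stand-in, not Yahoo's actual markup, and `ListItemCounter` is an illustrative helper built on the standard library's `html.parser`.

```python
from html.parser import HTMLParser

# Hypothetical page source (an assumption for illustration): the list stays
# empty until a browser executes the script.
RAW_HTML = """
<html>
  <body>
    <ul id="stream"></ul>
    <script>
      const item = document.createElement("li");
      item.textContent = "Markets rally";
      document.getElementById("stream").appendChild(item);
    </script>
  </body>
</html>
"""


class ListItemCounter(HTMLParser):
    """Counts the <li> start tags present in the static markup."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.count += 1


parser = ListItemCounter()
parser.feed(RAW_HTML)
static_item_count = parser.count

# The parser never runs the script, so it finds no list items.
print("list items visible without JavaScript:", static_item_count)
```

A real browser driven by Selenium would execute the script and expose the list item in the DOM, which is why the rest of this lesson uses a web driver.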
To begin, we'll extract the news items that are rendered first on the stock market news page:
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.get("https://finance.yahoo.com/topic/stock-market-news/")

try:
    news = WebDriverWait(driver, 10).until(
        EC.presence_of_all_elements_located(
            (By.CSS_SELECTOR, 'li[class="stream-item story-item yf-1usaaz9"]')))
except TimeoutException:
    raise TimeoutException("Elements are not loaded")

print("len of news: ", len(news))
data = []
for n in news:
    title = n.find_element(By.CSS_SELECTOR, "section div h3").text
    link = n.find_element(By.CSS_SELECTOR, "section div a").get_attribute("href")
    d = {'title': title, 'link': link}
    data.append(d)

print("len of scraped data: ", len(data))
print("sample: ", data[0])

# We are using this only for demonstration purpose.
time.sleep(2)
driver.close()
Note: In the provided code, there is a hidden section that handles the imports and options for the driver, as covered earlier. For the purpose of this lesson, we will solely focus on the scraping part.
Lines 1–2: We initialize the web driver and make a GET request to the Yahoo Finance stock market news URL.
Lines 5–9: Using the CSS path selector
li[class="stream-item story-item yf-1usaaz9"]
...