Web scraping with Puppeteer and Chrome Headless

Emad Ehsan put together an article about how to get started with web scraping in Chrome Headless. Chrome Headless is positioned to become a leading tool for automated testing of web applications. Puppeteer is the official Node.js library for driving it, maintained by the Google Chrome team.

In this guide, the author teaches you how to:

  • scrape GitHub,
  • log in to it,
  • extract and save public emails of users,
  • all while using Chrome Headless, Puppeteer, Node.js and MongoDB.
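
To give a feel for what these steps involve, here is a minimal sketch in Node.js. It is not the author's code: the function names, the login-form selectors, and the regex-based email extraction are all assumptions made for illustration, and GitHub's markup may differ from what is shown. Saving to MongoDB is left out to keep the sketch short.

```javascript
// Sketch only: selectors, URLs, and helper names below are assumptions,
// not taken from the original article.

// Pure helper: pull unique email-like strings out of a blob of text or HTML.
function extractEmails(text) {
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  const matches = text.match(pattern) || [];
  // Lowercase and de-duplicate while preserving first-seen order.
  return [...new Set(matches.map((m) => m.toLowerCase()))];
}

// Log in to GitHub, open one user's profile, and return any emails found there.
async function scrapeProfileEmails(username, credentials) {
  // Loaded lazily so the pure helper above works without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();

    // Assumed selectors for GitHub's login form; verify before relying on them.
    await page.goto('https://github.com/login', { waitUntil: 'networkidle2' });
    await page.type('#login_field', credentials.username);
    await page.type('#password', credentials.password);
    await Promise.all([
      page.waitForNavigation({ waitUntil: 'networkidle2' }),
      page.click('input[name="commit"]'),
    ]);

    // Visit the profile page and scan its HTML for email addresses.
    await page.goto(`https://github.com/${encodeURIComponent(username)}`, {
      waitUntil: 'networkidle2',
    });
    return extractEmails(await page.content());
  } finally {
    // Always release the browser, even if a step above throws.
    await browser.close();
  }
}
```

Calling `scrapeProfileEmails('some-user', { username, password })` from an async context would return an array of addresses found on that profile page. Scanning the page HTML with a regular expression is a deliberately simple stand-in; targeting a specific element with a selector is usually more precise.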

The article gives you a solid grounding in scraping with Chrome Headless and Node.js. One interesting detail: Chrome Headless also supports WebGL. The walkthrough is detailed, with many screenshots documenting each step, and the code is available in a GitHub repository.

Tags web-development big-data machine-learning