Web scraping with Golang

Nano Dano wrote this lengthy article about web scraping with Go. Scraping can be useful in a variety of situations, such as when a website does not provide an API or when you need to parse and extract web content programmatically. The tutorial walks through using the standard library to perform a variety of tasks, such as making requests, changing headers, setting cookies, using regular expressions, and parsing URLs.

Before doing any web scraping, it is important to understand what you are doing technically.

You will learn how to use:

  • Go – the Go programming language (tested with 1.6)
  • goquery (for some examples) – Go version of jQuery for DOM parsing

The author then goes over, in great detail and with code examples, the topics needed for successful scraping, including:

  • Make an HTTP GET request
  • Make an HTTP GET request with a timeout
  • Set HTTP headers (change the user agent)
  • Download a URL
  • Use substring matching to find page title …

And much more. The article also contains useful links to help you with Go installation and more. Excellent!

Tags programming golang web-development