Architecture of a large-scale web search engine, circa 2019

This is a long overview of a bunch of search engines and their architecture. By Tech & Cliqz on

It is important to understand that a web-scale search engine is highly complex. It is a distributed system with strong constraints on performance and latency. On top of that, it can easily become extremely costly to operate, both in human resources and, of course, in money.

This article explores the technology stack we employ today and some of the choices and decisions we have made and iterated upon over the years to cater to both external and internal users.

The article contains multiple sections:

  • Our search experience—dropdown & SERP
  • Fully automated and near real-time search
  • Deployments—a historical context
  • Intricacies of a search system
  • Docker containers and container orchestration system
  • Local development with tilt—an end-to-end use case

… and much more. This really is an in-depth article, with plenty of references to the open source software used, tooling, and further reading. Really great!

Tags search software-architecture containers kubernetes