Why mutability is essential for real-time data analytics


Successful data-driven companies like Uber, Facebook and Amazon rely on real-time analytics. Personalizing customer experiences for e-commerce, managing fleets and supply chains, and automating internal operations all require instant insights on the freshest data. By Dhruba Borthakur.

One of the technologies I founded was the open source RocksDB, a high-performance key-value engine used by MySQL, Apache Kafka and CockroachDB. RocksDB’s data format is mutable, which means that you can update, overwrite or delete individual fields in a record. RocksDB is also the embedded storage engine at Rockset, a real-time analytics database with fully mutable indexes that I also founded.

The article contains good information on:

  • Differences between mutable and immutable data
  • The historic usefulness of immutability
  • The problems with immutable data
  • How mutability aids machine learning
  • How mutability enables real-time analytics

To sum up, mutability is key for today’s real-time analytics because event streams can be incomplete or out of order. When that happens, a database needs to correct erroneous data and backfill missing data. To ensure high performance, low cost, error-free queries and developer efficiency, your database must support mutability.
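
To make the idea of correcting and backfilling concrete, here is a minimal sketch in Python. It is not RocksDB's or Rockset's API; the names MutableStore, Record, apply and DELETE are hypothetical and exist only for illustration. It assumes each event carries an event time, and it resolves conflicts per field with a simple last-writer-wins rule, which is one possible policy among several.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical sentinel: passing it as a value deletes that field.
DELETE = object()


@dataclass
class Record:
    # Current field values of the record.
    fields: Dict[str, Any] = field(default_factory=dict)
    # Event time of the last write applied to each field (kept for deletes too).
    field_times: Dict[str, float] = field(default_factory=dict)


class MutableStore:
    """Toy in-memory store with field-level updates (illustrative only)."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def apply(self, key: str, event_time: float, updates: Dict[str, Any]) -> None:
        """Apply an event in place, field by field.

        A field is written only if the event is at least as new as the last
        write to that field, so a late event cannot clobber newer data but
        can still backfill fields that were never set.
        """
        record = self._records.setdefault(key, Record())
        for name, value in updates.items():
            last = record.field_times.get(name)
            if last is not None and event_time < last:
                continue  # stale write for this field, skip it
            record.field_times[name] = event_time
            if value is DELETE:
                record.fields.pop(name, None)
            else:
                record.fields[name] = value

    def get(self, key: str) -> Dict[str, Any]:
        """Return a copy of the record's current fields (empty if unknown)."""
        record = self._records.get(key)
        return dict(record.fields) if record else {}


store = MutableStore()
store.apply("trip-1", 10.0, {"status": "started", "fare": 0.0})
store.apply("trip-1", 30.0, {"status": "completed", "fare": 18.5})
# Out-of-order event: its status is stale, but it backfills the missing driver.
store.apply("trip-1", 20.0, {"status": "in_progress", "driver": "d-42"})
# Later correction of an erroneous fare, applied in place.
store.apply("trip-1", 40.0, {"fare": 17.0})
print(store.get("trip-1"))
```

In an immutable store, the late event and the fare correction would each have to be appended as new versions and reconciled at query time or by a rewrite job; here they simply change the affected fields of the existing record.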


Tags analytics database big-data cio data-science search