How Uber migrated financial data from DynamoDB to Docstore


Each day, Uber moves millions of people around the world and delivers tens of millions of food and grocery orders. This generates a large number of financial transactions that need to be stored with provable completeness, consistency, and compliance. By Piyush Patel, Jaydeepkumar Chovatia, and Kaushik Devarajaiah.

LedgerStore is an immutable, ledger-style database that stores business transactions. It provides signing and sealing of data to guarantee completeness and correctness, strongly consistent indexes, and automatic data tiering. LedgerStore originally used DynamoDB as its storage backend. After running LedgerStore in production for almost 2 years at Uber scale, the team had amassed a large amount of data as trip and order volume grew. Over this period they realized that operating LedgerStore with DynamoDB as a backend was becoming expensive. Also, having different databases in the portfolio creates fragmentation and makes operations more difficult.
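To make the idea of sealing a little more concrete, here is a minimal sketch of how a closed batch of records could be signed so that later tampering is detectable. This is only an illustration: the summary above does not describe LedgerStore's actual sealing scheme, and the function names, the JSON encoding, and the use of HMAC-SHA256 are all assumptions made for the example.

```python
import hashlib
import hmac
import json


def seal_batch(records, key):
    """Return a hex HMAC-SHA256 seal over a canonical encoding of the batch.

    Hypothetical helper: the canonical JSON encoding is an assumption,
    not LedgerStore's real format.
    """
    # Sort keys and strip whitespace so the same records always encode identically.
    payload = json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_batch(records, key, seal):
    """Recompute the seal and compare it in constant time."""
    return hmac.compare_digest(seal_batch(records, key), seal)


if __name__ == "__main__":
    key = b"demo-key"  # placeholder key for the example only
    batch = [{"id": 1, "amount": 10}, {"id": 2, "amount": 25}]
    seal = seal_batch(batch, key)
    print(verify_batch(batch, key, seal))  # unchanged batch verifies

    tampered = [{"id": 1, "amount": 10}, {"id": 2, "amount": 99}]
    print(verify_batch(tampered, key, seal))  # modified batch does not
```

Because the records are immutable once written, a seal computed when a batch is closed can be rechecked at any later point, for example after a backfill into a new storage backend.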

The article describes how the team rearchitected some of the core components of LedgerStore on top of Docstore, Uber’s general-purpose multi-model database. It covers:

  • What is LedgerStore?
  • Data model
  • Data integrity
  • LedgerStore 2.0 design considerations
  • Architecture
    • Docstore table design
    • Data sealing
    • Data backfill (historical data from DynamoDB: more than 250 billion unique records, ~300 TB of data)
  • DynamoDB to Docstore migration

… and more. The authors also take a deep dive into the architecture and explain how the entire migration was designed and executed without impacting stringent SLAs or the online flow. We liked this one: "We backfilled 250 billion unique records and not a single data inconsistency has been detected so far, with the new architecture in production for over 6 months." A super interesting read!

[Read More]

Tags: database, cloud, software-architecture, distributed