Tag: Streaming
-
Modern Distributed Data Architecture with Event Streams, Stream Processing and Derived Data
Posted on November 12, 2020, Level beginner Resource Length medium
Some of the most interesting projects I worked on at LinkedIn involved building large-scale real-time pricing and machine learning products. They required crafting fault-tolerant distributed data architectures to support model training, forecasting and dynamic control systems. By Luthfur Chowdhury.
Tags cloud streaming software-architecture big-data cio data-science
-
Symfony messenger with SQS and SNS aws-services
Posted on August 13, 2020, Level intermediate Resource Length long
Let's check out how to connect Symfony with the Amazon SQS and SNS services by using a Symfony component. By Stefan Pöltl.
Tags queues php distributed miscellaneous performance streaming
-
Change data capture with Debezium: A simple how-to
Posted on May 19, 2020, Level intermediate Resource Length long
Eric Deandrea wrote this piece about one question that always comes up as organizations move towards being cloud-native, twelve-factor, and stateless: How do you get an organization's data to these new applications?
Tags software-architecture streaming apache data-science queues
-
Building an adaptive, multi-tenant stream bus with Kafka and Golang
Posted on February 20, 2020, Level intermediate Resource Length medium
Back in the 2000s, SOAP/WSDL with ESB (Enterprise Service Bus) was the dominant server-side architecture for many companies. Since the 2010s, microservices and service mesh technologies have grown rapidly and become the de facto industry standards. By Xinyu Liu.
Tags software-architecture apache streaming apis devops web-development programming
-
Exploring an Apache Kafka to Pub/Sub migration: Major considerations
Posted on January 28, 2020, Level intermediate Resource Length medium
In many cases, Google's Pub/Sub messaging and event distribution service can successfully replace Apache Kafka, with lower maintenance and operational costs, and better integration with other Google Cloud services. By Leonid Yankulin.
Tags software-architecture apache streaming big-data machine-learning google
-
Why I recommend my clients NOT use KSQL and Kafka Streams
Posted on October 24, 2019, Level beginner Resource Length medium
An article by Jesse Anderson. He recommends his clients not use Kafka Streams because it lacks checkpointing. Kafka Streams also lacks a true shuffle sort and only approximates one. KSQL sits on top of Kafka Streams, so it inherits all of these problems and adds some of its own.
Tags streaming software-architecture apache distributed
-
Using graph processing for Kafka Stream visualizations
Posted on September 9, 2019, Level intermediate Resource Length long
An article by David Allen focused on graph processing for Kafka Stream visualizations. Apache Kafka® is great when you need to deal with streams, allowing you to conveniently look at streams as tables. Stream processing engines like KSQL furthermore give you the ability to manipulate all of this fluently.
Tags analytics apache streaming queues
-
Create your first AWS Lambda using Rust
Posted on December 6, 2018, Level intermediate Resource Length short
A blog post by Konstantin Kostov about how he created a serverless function in the Rust programming language and deployed it to AWS. It is an example AWS Lambda function that checks whether a provided serial number is correct and unique (not already part of an existing dataset).
Tags programming functional-programming software serverless streaming
-
Parsing logs 230x faster with Rust
Posted on November 10, 2018, Level intermediate Resource Length medium
A blog post by Andre Arko about dealing with logs for the very busy web application behind RubyGems.org. A single day of request logs was usually around 500 gigabytes on disk. They tried a few hosted logging products, but at that volume such products can typically offer retention measured only in hours. The only thing they could think of to do with the full log firehose was to run it through gzip -9 and then drop it in AWS S3.
Tags json software programming serverless streaming
-
JVM Profiler: open source tool for tracing distributed JVM applications at scale
Posted on October 14, 2018, Level advanced Resource Length long
Bo Yang, Nan Zhu, Felix Cheung, and Xu Ning from the Uber Engineering team published a blog post about JVM Profiler. Data is at the heart of the strategic decision-making process at Uber. Right-sizing the resources allocated to Spark applications and optimizing the operational efficiency of Uber's data infrastructure requires fine-grained insights about these systems, namely their resource usage patterns.
Tags programming java distributed miscellaneous monitoring queues performance streaming
-
Apache Kafka is not for event sourcing
Posted on February 1, 2018, Level beginner Resource Length medium
An article by Jesper Hammarbäck in which he argues that Kafka is not the best tool for event sourcing. Kafka is a great tool for delivering messages between producers and consumers, and the optional topic durability allows you to store your messages permanently. Forever if you'd like.
Tags software-architecture apache streaming big-data machine-learning
-
Sherlock: Near real time search indexing for commerce site
Posted on December 30, 2017, Level beginner Resource Length long
Prasanna Ranganathan from Flipkart published an article about building a world-class e-commerce discovery experience through search. The dynamic nature of e-commerce poses unique challenges: stock units, availability, pricing, and catalog data can all change at a very high rate, and the system needs to keep up with the latest data lest the customer be disappointed.
Tags nosql software-architecture apache streaming