Latest writings

You can subscribe to my blog via RSS. Older articles are available in the blog archive.

  • Of Streams and Tables in Kafka and Stream Processing, Part 1

    In this article, perhaps the first in a mini-series, I want to explain the concepts of streams and tables in stream processing and, specifically, in Apache Kafka. Hopefully, you will walk away with both a better theoretical understanding and more tangible insights and ideas that will help you solve your current or next practical use case better, faster, or both. Continue reading »

  • Integrating Kafka and Spark Streaming: Code Examples and State of the Game

    Spark Streaming has been getting some attention lately as a real-time data processing tool, often mentioned alongside Apache Storm. If you ask me, no real-time data processing tool is complete without Kafka integration (smile), hence I added an example Spark Streaming application to kafka-storm-starter that demonstrates how to read from Kafka and write to Kafka, using Avro as the data format and Twitter Bijection for handling the data serialization. In this post I will explain this Spark Streaming example in further detail and also shed some light on the current state of Kafka integration in Spark Streaming. All this with the disclaimer that this happens to be my first experiment with Spark Streaming. Continue reading »
