Last week in Stream Processing & Analytics 5/24/2016

This is the 15th installment of my blog series around Stream Processing and Analytics.

I’m once more a day late with this article. If work weren’t the reason, I could argue that I was waiting for the next version of Kafka to be released 🙂 That happened last night, with Kafka 0.10 and the Confluent Platform 3.0. Here are two of the many tweets announcing the release:

Announcing the Kafka 0.10 release, includes Kafka Streams, event time, rack awareness, & 400 other fixes & features https://t.co/MLCyXx87Ze

— Apache Kafka (@apachekafka) May 24, 2016

So excited to announce the release of @apachekafka 0.10 & Confluent Platform 3.0. Kafka Streams is here! https://t.co/yjoRXXPETE

— Neha Narkhede (@nehanarkhede) May 24, 2016

Here are my 3 favorite features in this new release:

Kafka Streams: a library that turns Apache Kafka into a full-featured, modern stream processing system. It includes a high-level language for describing common stream operations (such as joining, filtering, and aggregating records), allowing developers to quickly develop powerful streaming applications. Kafka Streams offers a true event-at-a-time processing model, handles out-of-order data, allows stateful and stateless processing and can easily be deployed on many different systems.
Timestamps in Messages: every message in Kafka now has a timestamp field that indicates the time at which the message was produced. This enables stream processing by event-time in Kafka Streams.
Rack Awareness: isolates replicas so they are guaranteed to span multiple racks or availability zones. It allows all of Kafka’s durability guarantees to be applied to these larger architectural units, significantly increasing resilience and availability.
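
To make the link between message timestamps and event-time processing a bit more concrete, here is a minimal sketch in plain Python. It is not the Kafka Streams API; the function name `count_by_event_time`, the record layout and the sample data are all hypothetical. It only shows the idea that a record is assigned to a window by the timestamp it carries, so a record that arrives late still lands in the right window.

```python
from collections import defaultdict


def count_by_event_time(records, window_ms):
    """Count records per key in tumbling windows chosen by event time.

    Each record is a (key, timestamp_ms) pair. The window is derived from
    the embedded timestamp alone, so arrival order does not matter.
    """
    counts = defaultdict(int)
    for key, timestamp_ms in records:
        # Round the event timestamp down to the start of its window.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(key, window_start)] += 1
    return dict(counts)


# Hypothetical click events; the third record arrives out of order.
records = [
    ("page-a", 1_000),
    ("page-a", 61_000),
    ("page-a", 2_000),
    ("page-b", 59_999),
]
result = count_by_event_time(records, window_ms=60_000)
print(result)
```

A real stream processor also has to decide how long to keep a window open for late records, which this sketch ignores because it sees the whole input at once.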

On a side note: Camus, a tool in the Kafka ecosystem used for ingesting Kafka messages into HDFS, is now deprecated. To export data from Kafka to HDFS and Hive, Kafka Connect is the new recommended way.

As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

Exactly Once Stream Processing Semantics? Not Exactly by Pranab Gosh
Moving Streaming Analytics Out of the Data Center by Mark Lochbihler
Adopt complex event processing architecture across hybrid clouds by George Lawton
Streaming data analytics: A better way to fight fraud and cybercrime by Todd Wright
Cybersecurity Architecture: All about sensors by Michael Schiebel
Applying the Kappa architecture in the telco industry by Nicolas Seyvet & Ignacio Mulas Viela

Apache Beam / Google Dataflow

No shard left behind: dynamic work rebalancing in Google Cloud Dataflow by Eugene Kirpichov

Apache Storm

A Brief History of Apache Storm by Taylor Goetz

Apache Apex

Apache Apex on MapR Converged Platform by Karen Whipple

Apache Flink

JumpStarting Stream Processing with Kafka and Flink by Tarun Sapra
Stream Processing for Everyone with SQL and Apache Flink by Fabian Hueske
Apache Flink GA — Planning for the Future by Jim Scott

Apache Kafka

Apache Kafka and MapR Streams: Terms, Techniques and New Designs by Ellen Friedman
The Beginner’s Guide to Apache Kafka by Christy Wilson
Getting started with Kafka using Scala Kafka Client and Akka by David Piggot
Announcing Apache Kafka 0.10 and Confluent Platform 3.0 by Neha Narkhede
Kafkaesque Days at LinkedIn – Part 1 by Joel Koshy

Apache Quarks

Apache Quarks on Raspberry Pi with Streaming Analytics by Alex Cook

Apache NiFi / Hortonworks HDF

Making it not rain with Apache NiFi by Jeremy Dyer

Microsoft Azure Stream Analytics

Azure Stream Analytics updates certifications and compliance by Ryan CrawCour

New Presentations

Real-Time Analytics Visualized w/ Kafka + Streamliner + MemSQL + ZoomData by Anton Gorshow
Near real time streaming with apache samza – Antispam use case by Michael Sklyar
Lambda Architecture with Apache Spark by Taras Matyashovsky
Spark Streaming Tips for Devs & Ops by Fede Fernandez & Fran Pérez
Robust Stream Processing with Apache Flink by Jamie Grier
Staying in Sync: From Transactions to Streams by Martin Kleppmann
Streaming SQL w/ Apache Calcite by Julian Hyde
Stream Processing with Kafka in Uber by Danny Yuan
Kafka Connect by Datio
Using the SDACK Architecture to Build a Big Data Product by Evans Ye

New Videos

Apache Quarks on Raspberry Pi with Streaming Analytics by Alex Cook
Staying in Sync: From Transactions to Streams by Martin Kleppmann
Securing Apache Kafka by Jun Rao
Apache Spark 2.0: Introduction to Structured Streaming by Michael Armbrust & Tathagata Das
Introduction to Apache Apex and writing a big data streaming application by Pramod Immaneni

New Releases

Upcoming Events

5/24/2016 (online) – Apache Kafka and Real Time Stream Processing (Gluent Webinar)
5/24/2016 (London, UK) – Dive into Spark Streaming (Skills Matter)
5/24/2016 (Berlin, DE) – Apache Flink Meetup Berlin (Meetup)
5/24/2016 (Washington, US) – Spark Streaming and the Internet of Things (Meetup)
5/24/2016 (online) – How to Build Continuous Ingestion for the Internet of Things (Cloudera Webinar)
5/25/2016 (San Francisco, US) – Storm/Kafka Meetup: “Securing Kafka Clusters” (Meetup)
5/26/2016 (online) – Productionizing your Streaming Jobs (Databricks Webinar)
6/1-3/2016 (London, UK) – Real-time conference sessions (Strata + Hadoop World)
6/1/2016 (London, UK) – Apache Spark Users In London (Meetup)
6/2/2016 (Columbia, US) – Apache NiFi Hackathon and Office Hours (Meetup)
6/5-7/2016 (Berlin, DE) – Berlin Buzzwords
6/6/2016 (San Francisco, US) – Spark Meetup At Spark Summit (Meetup)
6/9/2016 (Paris, FR) – Building a Real-time Streaming Platform Using Kafka Streams and Kafka Connect (Meetup)
6/13/2016 (Amsterdam, NL) – GOTO Night: Stream Processing with Apache Flink and Mining Github (Meetup)
6/13/2016 (New York, US) – Apache Beam (Stream Processing @ Scale Track at QCon New York)
6/14/2016 (Lausanne, CH) – Apache Kafka A high-throughput distributed messaging system (Meetup)
6/15/2016 (London, UK) – Hortonworks Dataflow (HDF) Meetup London (Meetup)
7/5/2016 (San Francisco, US) – Building (and running) Netflix’s Data Pipeline using Apache Kafka (Meetup)
8/18/2016 (New York, US) – Apache NiFi – MiNiFi: Taking Dataflow Management to the Edge (Meetup)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!

Enjoy IT – SOA, Java, Event-Driven Computing and Integration

Sharing my thoughts and experiences ….
