This is the 15th installment of my blog series around Stream Processing and Analytics.
I’m once again a day late with this article. If work weren’t the reason, I could simply argue that I was waiting for the next version of Kafka to be released 🙂 That happened last night, with Kafka 0.10 and the Confluent Platform 3.0.
Here are my three favorite features in this new release:
- Kafka Streams
- Timestamps in Messages
- Rack Awareness
On a side note: Camus, a tool in the Kafka ecosystem used for ingesting Kafka messages into HDFS, is now deprecated. Kafka Connect is the new recommended way to export data from Kafka to HDFS and Hive.
As usual, find below the new blog articles, presentations, videos and software releases from last week:
News and Blog Posts
General
- Exactly Once Stream Processing Semantics ? Not Exactly by Pranab Ghosh
- Moving Streaming Analytics Out of the Data Center by Mark Lochbihler
- Adopt complex event processing architecture across hybrid clouds by George Lawton
- Streaming data analytics: A better way to fight fraud and cybercrime by Todd Wright
- Cybersecurity Architecture: All about sensors by Michael Schiebel
- Applying the Kappa architecture in the telco industry by Nicolas Seyvet & Ignacio Mulas Viela
Apache Beam / Google Dataflow
- No shard left behind: dynamic work rebalancing in Google Cloud Dataflow by Eugene Kirpichov
Apache Storm
- A Brief History of Apache Storm by Taylor Goetz
Apache Apex
- Apache Apex on MapR Converged Platform by Karen Whipple
Apache Flink
- JumpStarting Stream Processing with Kafka and Flink by Tarun Sapra
- Stream Processing for Everyone with SQL and Apache Flink by Fabian Hueske
- Apache Flink GA — Planning for the Future by Jim Scott
Apache Kafka
- Apache Kafka and MapR Streams: Terms, Techniques and New Designs by Ellen Friedman
- The Beginner’s Guide to Apache Kafka by Christy Wilson
- Getting started with Kafka using Scala Kafka Client and Akka by
- Announcing Apache Kafka 0.10 and Confluent Platform 3.0 by Neha Narkhede
- Kafkaesque Days at LinkedIn – Part 1 by Joel Koshy
Apache Quarks
Apache NiFi / Hortonworks HDF
- Making it not rain with Apache NiFi by Jeremy Dyer
Microsoft Azure Stream Analytics
- Azure Stream Analytics updates certifications and compliance by Ryan CrawCour
New Presentations
- Real-Time Analytics Visualized w/ Kafka + Streamliner + MemSQL + ZoomData by Anton Gorshow
- Near real time streaming with apache samza – Antispam use case by Michael Sklyar
- Lambda Architecture with Apache Spark by Taras Matyashovsky
- Spark Streaming Tips for Devs & Ops by Fede Fernandez & Fran Pérez
- Robust Stream Processing with Apache Flink by Jamie Grier
- Staying in Sync: From Transactions to Streams by Martin Kleppmann
- Streaming SQL w/ Apache Calcite by Julian Hyde
- Stream Processing with Kafka in Uber by Danny Yuan
- Kafka Connect by Datio
- Using the SDACK Architecture to Build a Big Data Product by Evans Ye
New Videos
- Apache Quarks on Raspberry Pi with Streaming Analytics by Alex Cook
- Staying in Sync: From Transactions to Streams by Martin Kleppmann
- Securing Apache Kafka by Jun Rao
- Apache Spark 2.0: Introduction to Structured Streaming by Michael Armbrust & Tathagata Das
- Introduction to Apache Apex and writing a big data streaming application by Pramod Immaneni
New Releases
Upcoming Events
- 5/24/2016 (online) – Apache Kafka and Real Time Stream Processing (Gluent Webinar)
- 5/24/2016 (London, UK) – Dive into Spark Streaming (Skills Matter)
- 5/24/2016 (Berlin, DE) – Apache Flink Meetup Berlin (Meetup)
- 5/24/2016 (Washington, US) – Spark Streaming and the Internet of Things (Meetup)
- 5/24/2016 (online) – How to Build Continuous Ingestion for the Internet of Things (Cloudera Webinar)
- 5/25/2016 (San Francisco, US) – Storm/Kafka Meetup: “Securing Kafka Clusters” (Meetup)
- 5/26/2016 (online) – Productionizing your Streaming Jobs (Databricks Webinar)
- 6/1-3/2016 (London, UK) – Real-time conference sessions (Strata + Hadoop World)
- 6/1/2016 (London, UK) – Apache Spark Users In London (Meetup)
- 6/2/2016 (Columbia, US) – Apache NiFi Hackathon and Office Hours (Meetup)
- 6/5-7/2016 (Berlin, DE) – Berlin Buzzwords
- 6/6/2016 (San Francisco, US) – Spark Meetup At Spark Summit (Meetup)
- 6/9/2016 (Paris, FR) – Building a Real-time Streaming Platform Using Kafka Streams and Kafka Connect (Meetup)
- 6/13/2016 (Amsterdam, NL) – GOTO Night: Stream Processing with Apache Flink and Mining Github (Meetup)
- 6/13/2016 (New York, US) – Apache Beam (Stream Processing @ Scale Track at QCon New York)
- 6/14/2016 (Lausanne, CH) – Apache Kafka A high-throughput distributed messaging system (Meetup)
- 6/15/2016 (London, UK) – Hortonworks Dataflow (HDF) Meetup London (Meetup)
- 7/5/2016 (San Francisco, US) – Building (and running) Netflix’s Data Pipeline using Apache Kafka (Meetup)
- 8/18/2016 (New York, US) – Apache NiFi – MiNiFi: Taking Dataflow Management to the Edge (Meetup)
Please let me know if this is of interest. Tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!