gschmutz 21:42 on March 14, 2017
Tags: beam ( 25 ), flink ( 92 ), kafka ( 248 ), oracle stream analytics, spark-streaming ( 219 ), Stream Processing ( 241 ), streaming-analytics ( 84 )

Last week in Stream Processing & Analytics – 13.03.2017

This is the 57th edition of my blog series around Stream Processing and Analytics!

Every week I’m also updating the following two lists with the presentations/videos of the current week:

As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

Handling the Extremes: Scaling and Streaming in Finance by Jim Scott
Top IoT Trends for 2017: Edge Analytics, LPWA, Security by Sue Walsh
Moving Analytics to Real-Time by Mike Boyarsky

Apache Kafka / Kafka Streams / Confluent Platform

Highlights in the Apache Kafka and Stream Processing Community – March 2017 by Gwen Shapira
How Apache Kafka promises to be your enterprise’s central nervous system for data by Matt Asay
Interactive Queries in Apache Kafka Streams by Florian Trossbach
There’s so much more to Apache Kafka than just Hadoop by Clarke Patterson
Kafka Basics, Producer, Consumer, Partitions, Topic, Offset, Messages by Serkan Sunel
Kafka and the Oracle database; making the connection – Part 3 by Mike Donovan

Apache Flink

Using Control Streams to Manage Apache Flink Applications by Scott Kidder
Apache Flink Cluster Setup on CentOS by Data Flair
Drivetribe’s Modern Take On CQRS With Apache Flink® by Aris Koliopoulos

Spark Streaming

Apache Spark Streaming – A Comprehensive Guide by Data Flair

Apache NiFi / Hortonworks HDF

Leveraging Apache NiFi for Agile Dataflows by Dan Young

New Presentations

The Power of the Log by Ben Stopford
Stream all the Things by Dean Wampler
Real-Time Processing Using AWS Lambda by Sidhartha Chauhan
Boost your IoT solutions to next level with azure stream analytics by Kaspar Lavik
Apache Kudu Webinar Series – Understanding and Unlocking the Value of Real-Time Data by Ryan Lippert & Michele Goetz
Apache Beam by Adil Oulghard
Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassandra And Kafka by Sean Glover
Kafka & Couchbase Integration Patterns by Manuel Hurtado
Dive into Spark Streaming by Gerard Maas
Apache Spark and Oracle Stream Analytics by Prabhu Thukkaram
What’s new in Confluent 3.2 and Apache Kafka 0.10.2 by Clarke Patterson
Apache Flink’s Table & SQL API – unified APIs for batch and stream processing by Timo Walther
Enabling Rapid Business Insight into Data with Stream Analytics and GoldenGate by Robin Moffatt

New Videos

Building Streaming And Fast Data Applications With Spark, Mesos, Akka, Cassandra And Kafka by Sean Glover
Apache Beam: Portable and Parallel Data Processing (Google Cloud Next ’17) by Frances Perry
HDInsight: Streaming Petabytes of IoT Data in Real-Time by Robert Mohan

Upcoming Events

14.3.2017 (Palo Alto, US) – Apache Kafka Meetup at Strata San Jose (Meetup)
15.3.2017 (Paris, FR) – Apache Flink: Use Cases (Meetup)
21.3.2017 (online) – Getting Started with Streaming Data and Stream Processing with Apache Kafka (KPI Webinar)
23.3.2017 (San Francisco, US) – Bay Area Apache Spark Meetup @ Intel in Santa Clara (Meetup)
28.3.2017 (Princeton, US) – Apache NiFi: Ingesting Enterprise Data @ Scale (Meetup)
29.3.2017 (online) – Part 2: Building Event-Driven Services with Apache Kafka (Confluent Webinar)
5.4.2017 (London, UK) – Realtime stream processing with coroutines (Meetup)
18.4.2017 (Zurich, CH) – Apache Flink Zurich Kickoff (Meetup)
25.4.2017 (online) – Part 3: Putting the Micro into Microservices with Stateful Stream Processing (Confluent Webinar)
27.4.2017 (Munich, DE) – 3rd Flink Meetup Munich: Flink 1.3, Queryable State & Spark Streaming (Meetup)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!

gschmutz 16:18 on October 3, 2016
Tags: flink ( 92 ), hdf ( 20 ), IoT ( 10 ), kafka ( 248 ), nifi ( 238 ), oracle stream analytics, osa ( 3 ), spark-streaming ( 219 ), Stream Processing ( 241 ), streaming-analytics ( 84 ), streamsets ( 74 )

Last week in Stream Processing & Analytics 10/3/2016

This is the 34th installment of my blog series around Stream Processing and Analytics.

As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

A Smart Manufacturing IT Architecture Needs Data Distribution, Consolidation and Stream Processing by Conrad Leiva
MQTT in the Enterprise by Matt Pavlovich
Event-Driven Microservices on the MapR Converged Data Platform by Rachel Silver
MemSQL Delivers an ‘Exactly Once’ Real-Time Pipeline by Alex Woodie
Streaming Analytics: Emerging from its Niche, Thanks to IoT by Philip Carnelley
Internet Of Things By The Numbers: Results from New Surveys by Gil Press
At Strata Hadoop, a Focus on Fast Data, IoT by Sue Walsh

Comparison

Apache Flink vs. Spark: Which one is best for streaming? by Donna-Margret Fernandez

Spark Streaming

Apache Spark Earns Datanami Awards for Machine Learning, Real-time Analytics, and More by Jules Damji
Long-running Spark Streaming Jobs on YARN Cluster by Marcin Kuthan
Apache Spark and IBM Streams working together in streaming analytics by Jacques Roy

Apache Kafka

Introducing Apache Kafka™ for the Enterprise by Neha Narkhede
Oracle Public Cloud and Kafka – Events powering the cloud – The Oracle PaaS Cloud Event BusHub by Lucas Jellema
Confluent Enterprise Reference Architecture by Confluent
Testing Topologies in Kafka Streams by Jendrik Poloczek

Apache Flink

A Beginner’s Guide to Apache Flink – 12 Key Terms, Explained by Andrés Vivanco Villamar
Flink How To: A Demo of Apache Flink with Docker by Gezim Sejdiu

Concord

Akamai picks up stream processing startup Concord by Maria Deutscher

Oracle Stream Analytics

Oracle Streams Analytics platform – Is it time to reverse traditional analytics? by Thanos Terentes Printzios

StreamSets

Using Salesforce JDBC Driver and MongoDB with StreamSets by Saikrishna Teja Bobba

Apache NiFi / Hortonworks Data Flow (HDF)

Creating a Process Group for Twitter Data in NiFi by Michael Young

New Presentations

New stream processing platform with apache flink by Hironori Ogibayashi
Oracle Stream Analytics – Simplifying Stream Processing by Guido Schmutz
Introduction To Streaming Data and Stream Processing with Apache Kafka by Jay Kreps
Highly Available Spark Stream Processing with Confluent Platform and DataStax Enterprise by Ryan Svihla & Wei Deng
Big Data Solution Architectures by Guido Schmutz
Oracle Streams Analytics platform – Is it time to reverse traditional analytics? by Milomir Vojvodic & Alessandro Cagnetti
Reactive Integrations with Akka Streams by Johan Andrén & Konrad Malawski
Real-Time Streaming Data on AWS by Roy Ben-Alta & Carlos Vinicius
Apache Kafka by Otávio Carvalho
Hadoop application architectures – using Customer 360 as an example by Mark Grover & Ted Malaska & Jonathan Seidman
Fast Big Data Ingest into SAP HANA by Denis King
Building a Spark Streaming App with DSE File System by Rocco Varela
Adaptive Data Cleansing with StreamSets and Cassandra by Pat Patterson
Apache Spark Structured Streaming for Machine Learning by Holden Karau
Building Real-Time Data Analytics Applications on AWS by Keith Steward
Oracle Data Integration by Jeff Pollock
Streaming ML with Flink by Marton Balassi
Getting It Right Exactly Once: Principles for Streaming Architectures by Darryl Smith
Internet of Things (IoT) and Big Data by Guido Schmutz

New Videos

Hands-On with Azure Stream Analytics by Jeff Prosise
Athena – Stream Processing Platform @ Uber by Chinmay Soman
Adaptive Data Cleansing with StreamSets and Cassandra by Pat Patterson
Highly Available Spark Stream Processing with Confluent Platform and DataStax Enterprise by Ryan Svihla & Wei Deng
Building a Spark Streaming App with DSE File System by Rocco Varela
Introduction To Streaming Data and Stream Processing with Apache Kafka by Jay Kreps

Upcoming Events

10/4/2016 (Austin, US) – Leveraging Kafka, Kafka Streams and Kafka Connect for Fast Data Streams, Analytics and Machine Learning using Scala (Reactive Summit 2016)
10/5/2016 (online) – Harness Your Dataflow Operations with StreamSets Dataflow Performance Manager (StreamSets Webinar)
10/6/2016 (online) – Deep Dive into Kafka (Confluent Online Talk series)
10/6/2016 (Milan, IT) – Streaming analytics for massive data sets (Meetup)
10/6/2016 (online) – Introduction to Hortonworks DataFlow (Hortonworks Webinar Data-in-Motion Series)
10/12/2016 (online) – Make Your Big Data Ecosystem Work Better for You (Hortonworks Webinar Data-in-Motion Series)
10/13/2016 (online) – Getting to the Goal of Big Data Faster: Enterprise Readiness for Data in Motion (Hortonworks Webinar Data-in-Motion Series)
10/13/2016 (London, UK) – Distributed stream processing with Apache Kafka (Meetup)
10/18/2016 (online) – A Paradigm Shift to Business as Usual: Real-Time DataFlows in Record Time (Hortonworks Webinar Data-in-Motion Series)
10/26/2016 (online) – How-To Guide for New Features of Hortonworks DataFlow 2.0 (Hortonworks Webinar Data-in-Motion Series)
10/27/2016 (online) – Data Integration with Apache Kafka (Confluent Online Talk series)
11/3/2016 (online) – Edge intelligence for IoT with Apache MiNiFi (Hortonworks Webinar Data-in-Motion Series)
11/8/2016 (online) – Apache Kafka and Apache NiFi: Better Together (Hortonworks Webinar Data-in-Motion Series)
11/17/2016 (online) – Demystifying Stream Processing with Apache Kafka (Confluent Online Talk series)
12/1/2016 (online) – A Practical Guide to Selecting a Stream Processing Technology (Confluent Online Talk series)
12/15/2016 (online) – Streaming in Practice: Putting Apache Kafka in Production (Confluent Online Talk series)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!

gschmutz 18:15 on September 27, 2016
Tags: beam ( 25 ), flink ( 92 ), heron ( 23 ), kafka ( 248 ), kafka-streams ( 176 ), nifi ( 238 ), oracle stream analytics, spark-streaming ( 219 ), storm ( 46 ), Stream Processing ( 241 ), streaming-analytics ( 84 ), streamsets ( 74 )

Last week in Stream Processing & Analytics 9/26/2016

This is the 33rd installment of my blog series around Stream Processing and Analytics.

As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

What Happens in 60 Seconds in the Internet Economy by Gil Press
Streaming analytics in a digitally industrialized world by Tugdual Grall
Streaming Analytics is Here to Stay by Giles Nelson

Spark Streaming

Using Spark Streaming, Apache Kafka, and Object Storage for Stream Processing on Bluemix by Ilya Drabenia

Apache Storm

SKA Data Pipelines by Benjamin D. Parrish

Apache Kafka

Getting started with Apache Kafka by Jan van Zoggel
Spring Kafka Tutorial – Getting Started with Spring for Apache Kafka by howtoprogram
Spring Kafka – Multi-threaded Message Consumption by howtoprogram
Want to migrate to AWS Cloud? Use Apache Kafka by Gwen Shapira
Kafka Streams – A First Impression by Dave Bechberger
Securing an Apache Kafka broker – part III by Colm O hEigeartaigh
A Technical Review of Kafka and DistributedLog by Sijie Guo

Apache Flink

Event Time in Apache Flink Stream Processing | Whiteboard Walkthrough by Ellen Friedman
Announcing the dA Platform, our distribution of Apache® Flink® by Kostas Tzoumas
Flink Distro Now Available from data Artisans by Alex Woodie

StreamSets

Creating a Post-Lambda World with Apache Kudu by Arvind Prabhakar
MySQL Database Change Capture with MapR Streams, Apache Drill, and StreamSets by Raphaël Velfre
Data Performance Management: Unlocking the Enterprise’s Greatest Asset by Peter Sonsini and Karthik Kumar
New StreamSets Dataflow Performance Manager™ Unifies Operational Control Over Data in Motion by insideBIGDATA
Announcing StreamSets Data Collector version 2.0 by Kirit Basu

Apache NiFi / Hortonworks Data Flow (HDF)

HDF Speed Test Contest by Anna Yong
Learning the Ropes of Apache NiFi by Hortonworks
Apache Nifi Overview and Comparison Study by The Data Team
Analyzing images in HDF 2.0 using TensorFlow by Timothy Spann
Using the TLS Toolkit to simplify security by Sebastian Caroll
HDF 2.0: Apache Nifi integration with Apache Ambari/Ranger by Ali Bajwa
Hortonworks DataFlow 2.0 Gets a Fresh Face by Mark Herring
Using NiFi GetTwitter, UpdateAttributes and ReplaceText processors to modify Twitter JSON data by Michael Young
Google Vision & Apache NiFi – Making Advanced Computer Vision Feasible by Jeremy Dyer

New Presentations

Real-Time Big Data Streaming Analytics – An Introduction by StreamAnalytix
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming by Michael Rainey
Real-Time Data Processing Using AWS Lambda by Cecilia Deng
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing Systems by Pooyan Jamshidi
Twitter #RealTime by Karthik Ramasamy
End to End Akka Streams / Reactive Streams – from Business to Socket by Konrad Malawski
Apache Beam – A Unified Model for Batch and Streaming Data Processing by Kenneth Knowles
Real time ETL processing using Spark streaming by Veeramani Moorthy
Designing & Optimizing micro batching systems with Cassandra, Spark and Kafka by Anath Ram & Murali Kannan
No shard left behind: Dynamic work rebalancing in Apache Beam by Malo Denielou
Streaming ML with Flink by Marton Balassi
Rate limiters in big data systems by Sandeep Joshi
How Can We Take Flink Forward? by Ted Dunning
Fast In-Memory SQL on C* with Ignite by Rachel Pedreschi & Igor Rudyak
Lambda Architecture with Cassandra by Vaibhav Puranik
Real-time Data Integration with Kafka & Cassandra by Ewen Cheslack-Postava
CassieQ: The Distributed Message Queue Built on Cassandra by Anton Kropp

New Videos

Fast In-Memory SQL on C* with Ignite by Rachel Pedreschi & Igor Rudyak
Lambda Architecture with Cassandra by Vaibhav Puranik
Real-time Data Integration with Kafka & Cassandra by Ewen Cheslack-Postava
CassieQ: The Distributed Message Queue Built on Cassandra by Anton Kropp
Apache Kafka Quotas by Aditya Auradkar
Apache Kafka Security by Harsha Chintalapani & Parth Brahmbhatt
Tuning Kafka for low latency guaranteed messaging by Jiangjie (Becket) Qin
The Stream Processor as a Database by Jamie Grier
Joining Infinity: Windowless Stream Processing with Flink by Sanjar Akhmedov
Robust Stream Processing with Apache Flink by Jamie Grier
Scalable Complex Event Processing on Samza @Uber by Shuyi Chen
Real-Time Data Processing Using AWS Lambda by Cecilia Deng
Realtime ETL Processing using Spark streaming by Veeramani Moorthy
Declarative stream processing with StreamSQL and CEP by Fabian Hueske & Till Rohrmann
How to Apply Machine Learning (R, Apache Spark, H2O.ai) To Real Time Streaming Analytics by Kai Wähner

New Releases

Upcoming Events

9/27/2016 (online) – Introduction to Streaming Data and Processing with Apache Kafka (Confluent Online Talk series)
9/29/2016 (Atlanta, US) – Diving Into Big Data Technologies: Hadoop, Hive and Apache NiFi (Meetup)
10/4/2016 (Austin, US) – Leveraging Kafka, Kafka Streams and Kafka Connect for Fast Data Streams, Analytics and Machine Learning using Scala (Reactive Summit 2016)
10/5/2016 (online) – Harness Your Dataflow Operations with StreamSets Dataflow Performance Manager (StreamSets Webinar)
10/6/2016 (online) – Deep Dive into Kafka (Confluent Online Talk series)
10/6/2016 (online) – Introduction to Hortonworks DataFlow (Hortonworks Webinar Data-in-Motion Series)
10/12/2016 (online) – Make Your Big Data Ecosystem Work Better for You (Hortonworks Webinar Data-in-Motion Series)
10/13/2016 (online) – Getting to the Goal of Big Data Faster: Enterprise Readiness for Data in Motion (Hortonworks Webinar Data-in-Motion Series)
10/13/2016 (London, UK) – Distributed stream processing with Apache Kafka (Meetup)
10/18/2016 (online) – A Paradigm Shift to Business as Usual: Real-Time DataFlows in Record Time (Hortonworks Webinar Data-in-Motion Series)
10/26/2016 (online) – How-To Guide for New Features of Hortonworks DataFlow 2.0 (Hortonworks Webinar Data-in-Motion Series)
10/27/2016 (online) – Data Integration with Apache Kafka (Confluent Online Talk series)
11/3/2016 (online) – Edge intelligence for IoT with Apache MiNiFi (Hortonworks Webinar Data-in-Motion Series)
11/8/2016 (online) – Apache Kafka and Apache NiFi: Better Together (Hortonworks Webinar Data-in-Motion Series)
11/17/2016 (online) – Demystifying Stream Processing with Apache Kafka (Confluent Online Talk series)
12/1/2016 (online) – A Practical Guide to Selecting a Stream Processing Technology (Confluent Online Talk series)
12/15/2016 (online) – Streaming in Practice: Putting Apache Kafka in Production (Confluent Online Talk series)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!

gschmutz 18:04 on August 2, 2016
Tags: azure-stream-analytics ( 7 ), kafka ( 248 ), kafka-streams ( 176 ), nifi ( 238 ), Oracle ( 9 ), oracle stream analytics, spark-streaming ( 219 ), Stream Processing ( 241 ), streaming-analytics ( 84 ), streamsets ( 74 )

Last week in Stream Processing & Analytics 8/2/2016

This is the 25th installment of my blog series around Stream Processing and Analytics.

As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

Comparison

Big data brawlers: 4 challengers to Spark by Serdar Yegulalp
Apache Kafka and Google Cloud Pub/Sub by Jesse Anderson

Apache Beam

Apache Beam: The Case for Unifying Streaming APIs by Andrew Psaltis

Apache Kafka / Kafka Streams

Design and Deployment Considerations for Deploying on AWS by Alex Loddengaard
Kafka Connect JDBC – Oracle – Number of groups must be positive by Robin Moffatt
Streaming MySQL tables in real-time to Kafka by Prem Santosh & Udaya Shankar
Confluent and Mesosphere Drive the “Container 2.0” Era With Enterprise-Grade Kafka on DC/OS by Market Wired
Log Compaction | Highlights in the Apache Kafka and Stream Processing Community | August 2016 by Gwen Shapira
How Apache Kafka takes streaming data mainstream by Matt Asay
The dog ate my schema… or what is your excuse not to use the Schema Registry with Apache Kafka? by Jaroslaw Kijanowski
Apache Kafka Connect MQTT Source Tutorial by howtoprogram

Spark Streaming

Structured Streaming In Apache Spark by Matei Zaharia, Tathagata Das, Michael Armbrust and Reynold Xin
Continuous Applications: Evolving Streaming in Apache Spark 2.0 by Matei Zaharia

StreamSets

Standard Deviations on Cassandra – Rolling Your Own Aggregate Function by Pat Patterson

Apache NiFi / Hortonworks Data Flow (HDF)

Using Apache Flume Sources and Sinks with Apache NiFi 0.7.0 by Timothy Spann
Crash Courses on Spark, IoT, Hadoop, NiFi, DataScience, and More! by Timothy Spann
Accessing Facebook Page Data from Apache NiFi by Timothy Spann
Use NiFi to Lessen the Friction of Moving Data by Hays Hutton

Oracle Stream Analytics

Stream Analytics and Processing with Kafka and Oracle Stream Analytics by Robin Moffatt

New Presentations

Apache Beam: The Case for Unifying Streaming APIs by Andrew Psaltis
Handson with Twitter Heron by Karthik Ramasamy
Apache Apex: Stream Processing Architecture and Applications by Thomas Weise
Getting started with azure event hubs and stream analytics services by Vladimir Bychkov
Apache Beam and Google Cloud Dataflow by Szabolcs Feczak
Introduction to Kafka connect by Himani Arora
Building a fully-automated Fast Data Platform by Bernd Zuther

New Videos

Apache Beam: The Case for Unifying Streaming APIs by Andrew Psaltis

New Release

Upcoming Events

8/2/2016 (online) – Real-Time Data Streaming from Oracle to Apache Kafka (Confluent Webinar)
8/2/2016 (online) – Streaming Analytics for Real-Time Action – Best Practices for Getting Started (TDWI Webinar)
8/3/2016 (Mountain View, US) – Building a Real-time Streaming Platform Using Kafka Streams and Kafka Connect (Meetup)
8/9/2016 (online) – Evolving Big Data ETL in the Face of Data Drift (StreamSets Webinar)
8/17/2016 (Sunnyvale, US) – 53rd Bay Area Hadoop User Group (HUG) Meetup (Meetup)
8/18/2016 (New York, US) – Apache NiFi – MiNiFi: Taking Dataflow Management to the Edge (Meetup)
8/23/2016 (Mountain View, US) – Stream Processing Meetup @ LinkedIn (Meetup)
8/25/2016 (San Francisco, US) – Focusing on Ingest into Hive and Streaming with Kafka (Meetup)
8/31/2016 (Palo Alto, US) – Heron@Twitter (Meetup)
9/8/2016 (San Francisco, US) – Adaptive Data Cleansing with StreamSets and Cassandra (Cassandra Summit)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!

gschmutz 19:36 on July 31, 2016
Tags: flink ( 92 ), kafka ( 248 ), kafka-streams ( 176 ), nifi ( 238 ), oracle stream analytics, spark-streaming ( 219 ), Stream Processing ( 241 ), streaming-analytics ( 84 ), streamsets ( 74 )

Last week in Stream Processing & Analytics 7/25/2016

This is the 24th installment of my blog series around Stream Processing and Analytics.

A few days later than usual: no, I wasn’t busy hunting Pokémon 😉 I was teaching courses on Hadoop, NoSQL and Stream Processing, both internally to train our own employees and for customers, with a total of 16 course days in July! Even though it’s almost one week late, I decided to still publish this post in order to keep the weekly rhythm.

As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

Architecture: Real-Time Stream Processing for IoT by Google Cloud Platform
Real-time Message-driven Service Oriented Architecture: Bringing the Boom! by Jim Scott
Real-Time Collection, Enrichment and Analysis of Set-Top Box Data by Katherine Rincon

Comparison

Apache Storm vs Twitter Heron: Twitter’s motivation behind building another real-time stream processing engine by Big Data Analytics Guide

Apache Flink

Savepoints in Apache Flink Stream Processing – Whiteboard Walkthrough by Stephan Ewen
The Importance of Apache Flink in Processing Streaming Data by Pal Kaushik

Apache Kafka / Kafka Streams

Notes from the First Kafka Summit by Avi Flax
Oracle GoldenGate Adapter for Confluent Kafka Connect by Thomas Vengal
Secure Stream Processing with Kafka Streams by Michael Noll
Understanding, Operating, and Monitoring Apache Kafka by David Brinegar
Apache Kafka: The Next Generation of Data Management by Jordan Martz

Spark Streaming

Apache Spark 2.0: Speed, Structured Streaming, and Simplification by Jules Damji
Real Time Analytics with Spark Streaming by Gavin Lewis

Concord

Apache NiFi / Hortonworks Data Flow (HDF)

IoT Example in Apache NiFi: Consuming and Producing MQTT Messages by Timothy Spann
Using the New HiveQL Processors in Apache NiFi 0.7.0 by Timothy Spann

StreamSets

Announcing Data Collector ver 1.5.1.0 by Kirit Basu

Oracle Stream Analytics

An Introduction to Oracle Stream Analytics by Robin Moffatt

New Presentations

Apache Apex: Stream Processing Architecture and Applications by Thomas Weise
Dataflow with Apache NiFi – Crash Course by Aldrin Piri
SensorBee: Stream Processing Engine in IoT by Daisuke Tanaka

New Videos

Twitter’s real-time data stack by Karthik Ramasamy and Sijie Guo

New Releases

StreamSets 1.5.1.0

Upcoming Events

7/28/2016 (Toronto, CA) – Apache NiFi presentation by Joe Witt (Meetup)
7/30/2016 (San Francisco, US) – Hands-on with Twitter Heron (Meetup)
8/2/2016 (online) – Real-Time Data Streaming from Oracle to Apache Kafka (Confluent Webinar)
8/2/2016 (online) – Streaming Analytics for Real-Time Action – Best Practices for Getting Started (TDWI Webinar)
8/3/2016 (Mountain View, US) – Building a Real-time Streaming Platform Using Kafka Streams and Kafka Connect (Meetup)
8/9/2016 (online) – Evolving Big Data ETL in the Face of Data Drift (StreamSets Webinar)
8/18/2016 (New York, US) – Apache NiFi – MiNiFi: Taking Dataflow Management to the Edge (Meetup)
8/31/2016 (Palo Alto, US) – Heron@Twitter (Meetup)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!

gschmutz 21:52 on July 19, 2016
Tags: flink ( 92 ), kafka ( 248 ), kafka-streams ( 176 ), nifi ( 238 ), oracle stream analytics, spark-streaming ( 219 ), storm ( 46 ), Stream Processing ( 241 ), streaming-analytics ( 84 ), streamsets ( 74 )

Last week in Stream Processing & Analytics 7/19/2016

This is the 23rd installment of my blog series around Stream Processing and Analytics.

As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

Billions of Messages a Day – Yelp’s Real-time Data Pipeline by Justin C
Next Generation Data Architecture by Kevin Schmidt
Batch vs. Stream Processing by Claudio Luis
Real-World Examples of Real-Time Log File Monitoring by Katherine Rincon
Survey finds surge in real-time and streaming analytics adoption by Paul Gillin
Big data vs. the Internet of Things: how the projects differ by Ben Rossi
Expert Interview: Part 2 — Dr. Ellen Friedman Discusses Streaming Data Architectures and What to Look for in a Good Streaming Message Layer by Christy Wilson
Mid-year Updates for Big Data Trends: Apache Kafka, Spark, Flink, Drill and More by Ellen Friedman

Comparison

Apache Flink vs Spark – Will one overtake the other? by DeZyre

Apache Storm

Instrumenting user-defined metric to analysis your topology by Jungtaek Lim
Microbenchmarking Apache Storm 1.0 Performance by Roshan Naik & Sapin Amin

Apache Kafka / Kafka Streams

Simple Spatio-Temporal Windowing With Kafka Streams by Tom Faulhaber
Brain Monitoring with Kafka, OpenTSDB, and Grafana by Jeff Lam
Kafka Streams by beyond the lines
Powering the Heroku Platform API: A Distributed Systems Approach Using Streams and Apache Kafka by Scott Persinger
Data Streaming with Apache Kafka & MongoDB by MongoDB
Kafka Connect – HDFS with Hive Integration – SchemaProjectorException – Schema version required by Robin Moffatt

Concord

Concord Claims 10x Performance Edge on Spark Streaming by Alex Woodie
Distributed Runtimes: at-least-once is here by Alexander Gallego

Apache NiFi / Hortonworks Data Flow (HDF)

Using Apache NiFi 0.7.0’s New PutSlack Processor by Timothy Spann
Learning the Ropes of Apache NiFi by Hortonworks

Oracle Stream Analytics

Accessing and Analyzing Twitter Feeds with Oracle Stream Analytics (Part 2) by John Featherly

New Presentations

Resource Aware Scheduling in Storm by Boyang Jerry Peng
Don’t Cross The Streams – Data Streaming And Apache Flink by John Gorman
Big Data Berlin v8.0 Stream Processing with Apache Apex by Thomas Weise
Apache NiFi Meetup by Timothy Spann
From Device to Data Center to Insights by Taylor Goetz
Deep Dive and Best Practices for Real Time Streaming Applications by Roy Ben-Alta
Understand Storm in pictures by zqhxuyuan
Building a real-time streaming platform using Kafka Connect + Kafka Streams by Jeremy Custenborder

New Videos

Jay Kreps, Building a Real-time Streaming Platform Using Kafka Streams… by Jay Kreps
Deep Dive and Best Practices for Real Time Streaming Applications by Roy Ben-Alta
Concord: Simple & Flexible Stream Processing on Apache Mesos by Shinji Kim
Time series analytics for Big Data and IoT with Kx by Fintan Quill
Apache NiFi Crash Course by Hortonworks

New Releases

Apache NiFi 0.7.0

Upcoming Events

7/20/2016 (online) – Streaming data ingest and processing with Kafka (Confluent Webinar)
7/21/2016 (San Francisco, US) – Apache Spark Streaming with Apache NiFi and Apache Kafka (Meetup)
7/28/2016 (Toronto, CA) – Apache NiFi presentation by Joe Witt (Meetup)
7/30/2016 (San Francisco, US) – Hands-on with Twitter Heron (Meetup)
8/2/2016 (online) – Real-Time Data Streaming from Oracle to Apache Kafka (Confluent Webinar)
8/2/2016 (online) – Streaming Analytics for Real-Time Action – Best Practices for Getting Started (TDWI Webinar)
8/18/2016 (New York, US) – Apache NiFi – MiNiFi: Taking Dataflow Management to the Edge (Meetup)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!

gschmutz 08:34 on June 29, 2016
Tags: flink ( 92 ), kafka ( 248 ), nifi ( 238 ), oracle stream analytics, spark-streaming ( 219 ), Stream Processing ( 241 ), streaming-analytics ( 84 ), streamsets ( 74 )

Last week in Stream Processing & Analytics 6/28/2016

This is the 20th installment of my blog series around Stream Processing and Analytics.

As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

Big Data or Bad Data? Survey Shows Enterprises Struggle to Manage Big Data Flows by Brittney Danon
Avoid These Five Big Data Governance Mistakes by Alex Woodie
RESTful, Real-Time, Streaming Analytics and IoT by Jordan Thomas-Green
Making Sense of Stream Processing by Martin Kleppmann
Survey Shows Enterprises Struggling with Bad Data by Rick Bilodeau
4 Predictions For How Data Transforms Everything by Scott Gnau
OpsClarity survey reveals 92% of companies are increasing investment in real-time analysis of human and machine-generated data by SD Times

Apache Flink

Tuning Tips for MapR Streams with Apache Flink – Whiteboard Walkthrough by Terry He

Apache Spark Streaming

Analyze Realtime Data from Amazon Kinesis Streams Using Zeppelin and Spark Streaming by Manjeet Chayel
How-to: Detect and Report Web-Traffic Anomalies in Near Real-Time by Jan Kunigk
Monitoring Apache Spark Streaming: Understanding Key Metrics by Swaroop Ramachandra

Apache Kafka

Distributed, Real-time Joins and Aggregations on User Activity Events using Kafka Streams by Michael Noll
Data Ingestion with Apache Flume sending events to Apache Kafka by Rafael Salerno
Oracle GoldenGate Big Data Adapter: Apache Kafka Producer by Loren Penton
Just Enough Kafka For The Elastic Stack, Part 2 by Suyog Rao
Why Streaming? with Apache Kafka by Lewis Gavin
Stream Processing Hard Problems – Part 1: Killing Lambda by Kartik Paramasivam
Build and monitor Kafka pipelines with Confluent Control Center by Joseph Adler
How to read to the end of a Kafka topic by James Cheng

Apache Beam / Google Dataflow

A Quick Demo of Apache Beam with Docker by Emanuele Cesena

Apache NiFi

Prescient Transforms 48,000+ Data Sources in Real Time with Apache NiFi by Anna Yong
Using Solr’s Extracting Request Handler with Apache NiFi by Bryan Bende

StreamSets

Ingesting Sensor Data on the Raspberry Pi with StreamSets Data Collector by Pat Patterson
How To Install StreamSets On The MapR Sandbox by C. Warman

New Presentations

Fast Data Overview by Chuck Scyphers
Getting Started with Amazon Kinesis by Amazon
Dataflow with Apache NiFi by Aldrin Piri

New Videos

Building a Real-time Streaming Platform by Neha Narkhede
Streaming ETL for All by Joey Echeverria
Monitoring Big Data Streaming Applications & Apache Apex (Hadoop) operations by David Yan

New Releases

Oracle Stream Analytics 12.2.1.1.0

Upcoming Events

6/27/2016 (San Jose, US) – Building Big Data applications with Apache Beam and Apache Apex (Meetup)
6/27/2016 (San Jose, US) – NiFi Meetup at Hadoop Summit (Meetup)
6/28/2016 (London, UK) – Crowdmix – An Event Based Social Music Platform & Kafka 0.10 New Features (Meetup)
7/5/2016 (San Francisco, US) – Building (and running) Netflix’s Data Pipeline using Apache Kafka (Meetup)
7/6/2016 (Atlanta, US) – StreamSets, For The Coding Minimalist In All of Us (Meetup)
7/14/2016 (Princeton, US) – Apache NiFi (Meetup)
7/14/2016 (Austin, US) – Kafka-Streams talk (Meetup)
7/14/2016 (San Francisco, US) – Expert Panel on Streaming Analytics Technologies (Meetup)
7/19/2016 (Munich, GE) – Apache Apex: Stream Processing Architecture and Applications (Meetup)
7/19/2016 (Taipei, TW) – Stream Processing with Apache Flink (Meetup)
7/20/2016 (online) – Streaming data ingest and processing with Kafka (Confluent Webinar)
7/21/2016 (San Francisco, US) – Apache Spark Streaming with Apache NiFi and Apache Kafka (Meetup)
7/28/2016 (Toronto, CA) – Apache NiFi presentation by Joe Witt (Meetup)
8/2/2016 (online) – Streaming Analytics for Real-Time Action – Best Practices for Getting Started (TDWI Webinar)
8/18/2016 (New York, US) – Apache NiFi – MiNiFi: Taking Dataflow Management to the Edge (Meetup)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!

gschmutz 20:29 on June 13, 2016
Tags: azure-stream-analytics ( 7 ), beam ( 25 ), flink ( 92 ), kafka ( 248 ), oracle stream analytics, spark-streaming ( 219 ), storm ( 46 ), Stream Processing ( 241 ), streaming-analytics ( 84 ), streamsets ( 74 )

Last week in Stream Processing & Analytics 6/13/2016

This is the 18th installment of my blog series around Stream Processing and Analytics.

There were two conferences last week with quite a lot of talks around stream processing: the Spark Summit in San Francisco and the Berlin Buzzwords.
Berlin Buzzwords did a good job of recording the sessions: all of them are already available, and the ones about stream processing are listed below.

Last week I did some work on Oracle Stream Analytics and made Docker support for it available.

As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

Real-Time Analytics: Six Steps For Fast, Precise Decision-Making by Roy Schulte
Beyond Real-time Data Applications – Whiteboard Walkthrough by Ellen Friedman

Comparison

Top 18 Open Source and Commercial Stream Analytics Platforms by predictiveanalyticstoday

Apache Storm

Getting Started With Azure HDInsight: Create Apache Storm Cluster by Kuppurasu Nagaraj

Apache Flink

Data Streaming Architecture with Apache Flink by Srini Penchikala

Apache Spark Streaming

Spark Summit keynote explores structured streaming, innovation in deep learning by Marlene Den Bleyker
Spark Streaming Testing with Scala Example by Todd McGrath

Apache Kafka

Kafka Connect Sink for PostgreSQL from JustOne Database by Duncan Pauly
Interest grows in Apache Kafka, edge of network for IoT by Marlene Den Bleyker
‘Real-Time’ Online Machine Learning with Apache Kafka and Sklearn [plus other buzz words] by Bryan Travis Smith

Apache Beam / Google Dataflow

Understanding timing in Cloud Dataflow pipelines by Robert Burke

Apache NiFi / Hortonworks HDF

Testing ExecuteScript processor scripts by Matt Burgess
Data Injestion Apache Nifi for Data Collection by Don Jernigan

StreamSets

Ingesting MQTT Traffic into Riak TS via RabbitMQ and StreamSets by Pat Patterson

Concord

Concord Leverages Mesos for High Performance Stream Processing by Susan Hall

Oracle Stream Analytics

Providing Oracle Stream Analytics 12c environment using Docker by Guido Schmutz
Real-Time Data Streaming & Exploration with Oracle GoldenGate & Oracle Stream Analytics by Issam Hijazi
Accessing and Analyzing Twitter Feeds with Oracle Stream Analytics (Part 1) by John Featherly

Microsoft Stream Analytics

How To Build a Real-Time IoT Dashboard with Azure IoT Hub, Azure Stream Analytics and Power BI by John Gallant

New Presentations

A Data Streaming Architecture with Apache Flink by Robert Metzger
A Deep Dive into Structured Streaming by Tathagata Das
So you think you can Stream? by Parkash Chockalingam
Structuring Spark: Dataframes, Datasets And Streaming by Michael Armbrust
The Stream Processor as the Database by Stephan Ewen
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming by Guozhang Wang
Spark streaming , Spark SQL by Jerry Jung
Streaming Analytics & CEP – Two sides of the same coin? by Till Rohrmann
Continuous Processing with Apache Flink by Stephan Ewen
Graphs as Streams: Rethinking Graph Processing in the Streaming Era by Vasia Kalavri
Architectural Comparison of Apache Apex and Spark Streaming by Thomas Weise
Application development and data in the emerging world of stream processing by Neha Narkhede
Introducing Kafka Streams, the new stream processing library of Apache Kafka by Michael Noll

New Videos

Fundamentals of stream processing with Apache Beam by Tyler Akidau
What we’ve learned from developing a stream processing platform at scale by Amit Sela
Architectural Comparison of Apache Apex and Spark Streaming by Thomas Weise
An Introduction to Azure Stream Analytics by Ashish Bhatia
Data Streaming for Connected Devices with Azure Stream Analytics by Juan Manuel Servera
Twitter Heron at Scale by Hakka Labs
Streaming Analytics & CEP – Two sides of the same coin? by Till Rohrmann
Scaling Yelp’s Logging Pipeline with Apache Kafka by Enrico Canzonieri
Real Time Marketing with Kafka, Storm, Cassandra and a pinch of Spark by Volker Janz
Application development and data in the emerging world of stream processing by Neha Narkhede
Help I need a stream processor – learning how to choose between Spark, Flink, Samza, and Storm by Andrew Psaltis
Graphs as Streams: Rethinking Graph Processing in the Streaming Era by Vasia Kalavri
Google Dataflow: The new open model for batch and stream processing by Felipe Hoffa
SMACK Stack – Data done Right by Stefan Siprell
A Data Streaming Architecture with Apache Flink by Robert Metzger
Leveraging blockchain technologies for the internet of things by Sam Bessalah
Introducing Kafka Streams, the new stream processing library of Apache Kafka by Michael Noll
Fast Analytics on Fast Data by Todd Lipcon
Fast Cars, Big Data – How Streaming Can Help Formula 1 by Ted Dunning

Upcoming Events

6/13/2016 (Amsterdam, NL) – GOTO Night: Stream Processing with Apache Flink and Mining Github (Meetup)
6/13/2016 (New York, US) – Apache Beam (Stream Processing @ Scale Track at QCon New York)
6/13/2016 (Dublin, IR) – Production Quality Data Science. Building Rapid Ingestion Data Pipelines (Meetup)
6/14/2016 (Garden City, US) – Moving data gracefully with Apache NiFi
6/14/2016 (Lausanne, CH) – Apache Kafka A high-throughput distributed messaging system (Meetup)
6/15/2016 (Mountain View, US) – Stream Processing Meetup @ LinkedIn (Meetup)
6/15/2016 (London, UK) – Hortonworks Dataflow (HDF) Meetup London (Meetup)
6/16/2016 (Garden City, US) – Moving data gracefully with Apache NiFi (Meetup)
6/21/2016 (San Jose, US) – Apache Bigtop & Apache Apex (Meetup)
6/21/2016 (online) – A New Paradigm for Managing Data in Motion (Webinar)
6/22/2016 (online) – Comparing Open Source Big Data Ingest Options (Streamsets Webinar)
6/27/2016 (San Jose, US) – Building Big Data applications with Apache Beam and Apache Apex (Meetup)
7/5/2016 (San Francisco, US) – Building (and running) Netflix’s Data Pipeline using Apache Kafka (Meetup)
7/6/2016 (Atlanta, US) – StreamSets, For The Coding Minimalist In All of Us (Meetup)
7/14/2016 (Princeton, US) – Apache NiFi (Meetup)
7/19/2016 (Munich, GE) – Apache Apex: Stream Processing Architecture and Applications (Meetup)
7/19/2016 (Taipei, TW) – Stream Processing with Apache Flink (Meetup)
8/18/2016 (New York, US) – Apache NiFi – MiNiFi: Taking Dataflow Management to the Edge (Meetup)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!

gschmutz 15:42 on June 12, 2016
Tags: docker ( 3 ), oracle stream analytics, osa ( 3 ), streaming-analytics ( 84 )

Providing Oracle Stream Analytics 12c environment using Docker

I spent the past two days upgrading the Docker support I had created for Oracle Stream Explorer so that it works with Oracle Stream Analytics (the new name for Oracle Stream Explorer).

I guess I don’t have to introduce Docker anymore, it’s so common today!

Preparation

You can find the corresponding docker project on my GitHub: https://github.com/gschmutz/dockerfiles

Due to the Oracle licensing agreement, the Oracle software itself cannot be provided in the GitHub project. For the same reason, it is not possible to upload a built image to Docker Hub.

So you first have to download the Java 8 Server JRE as well as the Oracle Stream Analytics Runtime using your own OTN login. Download the following two artifacts into the oracle-stream-analytics/dockerfiles/12.2.1/downloads folder.

Java 8 Server: server-jre-8u91-linux-x64.tar.gz
Oracle Stream Analytics Runtime: ofm_integration_osa_12.2.1.0.0_disk1.zip
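
Because the build depends on both files being in place, a quick check before building can save a failed run. The following is only a minimal sketch: the check_downloads function name is made up for this example, and it simply looks for the two file names listed above in whatever folder you pass in.

```shell
#!/bin/sh
# Hypothetical helper: verify that both Oracle downloads are present
# in the given folder before starting the image build.
check_downloads() {
    dir="$1"
    missing=0
    # The two artifact names are the ones listed in the Preparation section.
    for f in server-jre-8u91-linux-x64.tar.gz ofm_integration_osa_12.2.1.0.0_disk1.zip; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    # 0 if both files were found, 1 otherwise
    return "$missing"
}

# Example call, using the downloads folder mentioned above:
#   check_downloads oracle-stream-analytics/dockerfiles/12.2.1/downloads
```

If the function returns a non-zero status, at least one of the downloads is missing and its path is printed to stderr.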

Building the Oracle Stream Analytics Docker Install image

Navigate to the dockerfiles folder and run the buildDockerImage.sh script as root

$ sh buildDockerImage.sh -v 12.2.1 -A

This will take a while if run for the first time, as it downloads the oracle-linux base image first. At the end you should see a message similar to the one below:

  WebLogic Docker Image for 'standalone' version 12.2.1 is ready to be extended: 
    
    --> gschmutz/oracle-osa:12.2.1-standalone

  Build completed in 171 seconds.

It indicates that the OSA base docker image has been built successfully.

Be aware: this image is not yet executable, it only contains the software without any domain.

Building an Oracle Stream Analytics Standalone domain

In order to use Oracle Stream Analytics, we have to build a domain. This can be done using Docker as well, extending the Oracle Stream Analytics image created above and creating an OSA domain. Currently there is one sample Dockerfile available in the samples folder which creates an Oracle Stream Analytics Standalone domain. In the future this will be enhanced with a domain connecting to Spark.

To build the 12.2.1 standalone domain, navigate to folder samples/1221-domain and run the following command (use the OSA_PASSWORD parameter to specify the OSA user password):

$ docker build -t 1221-domain --build-arg OSA_PASSWORD=<define> .

There are other build arguments you can use to overwrite the default values of the Oracle Stream Analytics Standalone domain. They are documented in the GitHub project here.

Verify you now have this image in place with:

$ docker images

Running Oracle Stream Analytics server

To start the Oracle Stream Analytics server, simply run the 1221-domain image with docker run -d. The sample Dockerfile defines startwlevs.sh as the default CMD.

$ docker run -d --name=osa -p 9002:9002 1221-domain

Check the log by entering

$ docker logs -f osa

After a couple of seconds, the OSA server should be up and running and you can access the Oracle Stream Analytics Web Console at http://localhost:9002/sx.

Connect with user osaadmin and the password you specified above.
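
Instead of watching the log and guessing when the server is ready, you could poll the console URL until it answers. This is a rough sketch, not part of the GitHub project: wait_until is a made-up helper that retries any command, and the example call assumes that curl is installed and that the container was started with the port mapping shown above.

```shell
#!/bin/sh
# Hypothetical helper: wait_until ATTEMPTS DELAY COMMAND...
# Runs COMMAND until it succeeds, sleeping DELAY seconds between tries.
# Returns 0 on the first success, 1 if all ATTEMPTS fail.
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Output of the probe command is discarded; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # Sleep only if another attempt follows.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (assumes curl is available): try for up to about one minute.
#   wait_until 30 2 curl -fsS http://localhost:9002/sx && echo "console is up"
```

The probe is passed in as a command so the same helper works with any other readiness check you prefer.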

gschmutz 21:57 on May 10, 2016
Tags: kafka ( 248 ), oracle stream analytics, oracle stream explorer ( 2 ), spark ( 153 ), storm ( 46 ), Stream Processing ( 241 )

Last week in Stream Processing & Analytics 5/9/2016

This is the 13th installment of my blog series around Stream Processing and Analytics.

Last week the new version of Oracle Stream Explorer was released under a new name: Oracle Stream Analytics. I have written my own blog article about it. This new version is an impressive release with over 15 major new features! It really deserves the name change. Oracle Stream Analytics simplifies stream processing and enables Self Service Streaming Analytics applications for business people. It is based on the idea of a “streaming Excel sheet”, allowing business analysts to work the way they are used to in Excel, except that the data is not static but changes constantly based on the incoming stream(s).

For those not able to attend the Hadoop Summit in Dublin last month (like myself), all the sessions and slides are now available online for free!

Apart from that, the week was a bit quieter than previous weeks. As usual, find below the new blog articles, presentations, videos and software releases from last week:

News and Blog Posts

General

6 Splunk alternatives for log analysis by Serdar Yegulalp
Which freaking Hadoop engine should I use? by Andrew C. Oliver
Data Warehouse Design: A Move to Real Time by Chris Raphael
Three reasons why real-time streaming analytics will replace batch-and one reason why it won’t by Mary Ann Richardson
Fault Tolerance for Stream Processing Engines by Muhammad Anis Uddin Nasir
What is In-Stream Processing? by Sergei Trueber, Anton Ovchinnikov, Victoria Livschitz
How In-Stream Processing Works by Sergei Trueber, Anton Ovchinnikov, Victoria Livschitz
Overview of In-Stream Processing Solutions On the Market by Sergei Trueber, Anton Ovchinnikov, Victoria Livschitz
Event Stream Data: What is it and Why Should I Care? by Phil Simon
Hadoop Summit Dublin – The Power of Community by Mark Herring

Apache Storm

Apache Storm: How to configure KafkaBolt with Flux by Adriano Dadis
Apache Storm gegen SAP Smart Data Streaming (in German) by Björn Böttcher
Streamlining Distributed Stream Processing with SuperChief by Ted Carstensen

Apache Spark Streaming

Real Time Credit Card Fraud Detection with Apache Spark and Event Streaming by Carol McDonald

Apache Flink

RBEA: Scalable Real-Time Analytics at King by Gyula Fora and Mattias Andersson
8 breaking changes in Apache Flink 1.0.0 by Kumar Chinnakali

Apache Beam

Why Apache Beam? A Google Perspective by Tyler Akidau
Why Apache Beam? a data Artisans perspective by Kostas Tzoumas
Introduction to Apache Beam by Jean-Baptiste Onofre

Apache Apex

Apache Gets Yet Another Stream Processing Engine with Apex by Susan Hall

Apache Kafka

Log Compaction | Kafka Summit Edition | May 2016 by Gwen Shapira

StreamSets

May the 4th Be With You – Analyzing Star Wars Twitter Mentions in Minecraft by Pat Patterson

Microsoft Azure Stream Analytics

Azure Stream Analytics certified for ISO 27001, ISO 27018, HIPAA, and the EU Model Clauses by Ryan CrawCour

Oracle Stream Analytics

Oracle Stream Analytics (OSA): the new Oracle Stream Explorer by Guido Schmutz
Tapping into life – An Introduction to Stream Analytics by Jose Rodrigues

New Presentations

Slides from Hadoop Summit 2016 in Dublin
Getting basics right for your IoT business by Kay Lerch
Developing Connected Applications with AWS IoT by Adam Larter
Stream Computing (The Engineer’s Perspective) by Ilya Ganelin
Building a Real Time Dashboard with Amazon Kinesis, Amazon Lambda and Amazon Elasticsearch by Ganesh Raja
Microservices for a Streaming World by Ben Stopford
Extending The Yahoo Streaming Benchmark to Apache Apex by Sandesh Hedge
Introduction to Flink Streaming by Madhukara Phatak
Hybrid solution analysis of streaming sensor data with Spark Streaming & Kafka by Virender Thakur
Real Time Analytics: Algorithms and Systems by Arun Kejariwal
Big data cybersecurity by George Vetticaden & James Sirota

New Videos

Sessions from Hadoop Summit 2016 in Dublin
Webinar How to Achieve High Throughput for Real Time Applications with SMACK by Ryan Knight
Spark Streaming: Dealing with State by Francois Garillot
Streamlining Distributed Stream Processing with SuperChief by Ted Carstensen
Microservices for a Streaming World by Ben Stopford

New Releases

Upcoming Events

5/9/2016 (Santa Clara, US) – Make Streaming IoT Analytics Work For You: The Devil is in the Details (Meetup)
5/9/2016 (Vancouver, CA) – Streams Track (Big data 2016 conference)
5/10/2016 (New York, US) – The Big Heist: Attempting to steal distributed systems from complexity and chaos (Scala Days)
5/10/2016 (Vancouver, CA) – Using Kafka and Kudu for Fast, Low-Latency SQL Analytics on Streaming Data (Big data 2016 conference)
5/10/2016 (Flemington, US) – Apache NiFi – Deep Dive (Meetup)
5/10/2016 (online) – Reactive Stream Processing with Akka (TechGig Webinar)
5/11/2016 (New York, US) – Apache Storm 1.0 with Taylor Goetz (Meetup)
5/16-20/2016 (San Francisco, US) – Pipelines By the Bay (Pipelines By the Bay Conference)
5/16/2016 (online) – Productionizing your Streaming Jobs (Databricks Webinar)
5/20/2016 (Madrid, SP) – Workshop Apache Flink (Meetup)
6/1-3/2016 (London, UK) – Real-time conference sessions (Strata + Hadoop World)
6/5-7/2016 (Berlin, GE) – Berlin Buzzwords
6/13/2016 (Amsterdam, NL) – GOTO Night: Stream Processing with Apache Flink and Mining Github (Meetup)
6/13/2016 (New York, US) – Apache Beam (Stream Processing @ Scale Track at QCon New York)

Please let me know if that is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!