This is the 6th installment of my blog series around Stream Processing and Analytics.
This week I would like to quote two articles, one from Forbes Tech and another from Information Week.
In the 17 Predictions about the future of Big Data … article, Forbes compiles 17 predictions from well-known analysts, mentioning fast data as one of the next trends:
- “real-time streaming insights into data will be the hallmarks of data winners going forward”
- “autonomous agents and things will continue to be a huge trend”
- “Fast Data and actionable data will replace big data”
According to this article, Forrester defines streaming analytics software as technology that “can filter, aggregate, enrich, and analyze a high throughput of data from multiple disparate live data sources and in any data format to identify simple and complex patterns to visualize business in real-time, detect urgent situations, and automate immediate actions.” Forrester also identified the key vendors and technologies for streaming analytics: Apache Spark Streaming, Apache Storm, DataTorrent, IBM, Informatica, SAP, Software AG, SQLstream, Striim (WebAction), TIBCO, and Vitria.
I’m wondering why Oracle’s Stream Explorer product is not on Forrester’s radar, especially because it can be used for both of Big Data’s two priorities: Streaming Analytics and Self-Service.
As usual, here is what I noticed last week:
News and Blog Posts
General
- 7 conditions for stream processing by Dana Sandu
- Six Best Practices for Real-Time Analytics by Christy Pettey (Gartner)
- Making IoT Payoff With Big Data Streaming Analytics by John Haddad
- The IoT is a natural ecosystem for streaming analytics by Timothy McGovern
- Batch vs. Stream Processing by Luis Claudio Rodrigues da Silveira
- Big Data’s Priorities: Streaming Analytics, Self-Service by Jessica Davis
- Closed Loop: Machine Learning and Real Time Stream Processing by Kai Wähner
- Business Goes for Streaming Analytics by the Zorba Research Blog
- 17 Predictions About The Future Of Big Data Everyone Should Read by Bernard Marr
Comparison
Apache Spark Streaming
- The Art of Aggregation With Kafka and Spark: Part 1 by Patrick Shields
- The Art of Aggregation With Kafka and Spark: Part 2 by Patrick Shields
Apache NiFi / Hortonworks DataFlow
- ExecuteScript – JSON-to-JSON Revisited (with Jython) by Matt Burgess
- Twitter Tweets – Apache Nifi combined with MongoDb, Groovy, Velocity and Highcharts by Uwe Geercken
Apache Kafka
- Message Hub: Apache Kafka as a Service by Rajini
Google Cloud Dataflow / Apache Beam
- Clarifying & Formalizing Runner Capabilities by Frances Perry & Tyler Akidau
- Apache Beam Capability Matrix by Apache Beam documentation
StreamSets
- Visualizing Apache Log Data… with Minecraft! by Pat Patterson
- What’s the Biggest Lot in the City of San Francisco? by Pat Patterson
- New Release of StreamSets Data Collector Takes on Data Drift by Datafloq
- StreamSets Achieves Certification With MapR Converged Data Platform With Extended Support for MapR Streams by StreamSets blog
- How Trend Micro Uses StreamSets – An Interview with the Threat Research Team by Kirit Basu
IBM Quarks
- Data Streaming at the Edge: IBM and Apache Quarks by Justin Grammens
Microsoft Azure Stream Analytics
- Innovative Azure Stream Analytics customer use cases by Siddhika Nevrekar
- Stream Analytics Data Lake Store output by Jeff Stokes
- SQL Data Warehouse as output of Azure Stream Analytics by Kati Iceva
SAS Event Stream Processing
- Hortonworks HDP and SAS Event Stream Processing together, using YARN by Mark Lochbihler
New Presentations
- Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spark Streaming by Evan Chan and Helena Edelson
- Introduction to Apache Beam by Jean-Baptiste Onofré
- Real time data viz with Spark Streaming, Kafka and D3.js by Ben Laird
New Videos
- Building Real-Time Data Pipelines with Spark Streaming, Kafka, and Cassandra by Nik Rouda and Nanda Vijaydev
New Releases / Components
none
Upcoming Events
- 3/23/2016 (Chicago, US) – Real-Time Analytics on Customer Financial Activities With Apache Flink (Flink Meetup)
- 3/23/2016 (Online) – How to Effectively Manage IoT DataFlows with HDF (Hortonworks Webinar)
- 3/24/2016 (Hanover, US) – March NiFi Meetup (Meetup)
- 3/29/2016 (Chicago, US) – Battle Royale: Hadoop vs. Spark vs. Google Dataflow (ITA Tech Talk)
- 4/6/2016 (Berlin, GE) – Big Data & Real Time Analytics at Idealo.de! (Meetup)
- 4/13/2016 (London, UK) – Real time search and insights with Apache Kafka (Meetup)
- 4/14/2016 (Chicago, US) – Jay Kreps: Apache Kafka and the Confluent Platform – Overview and Roadmap (Meetup)
- 4/18/2016 (Munich, GE) – Batch- & Stream-Processing mit Google Dataflow (Smart Data Developer Conference)
- 4/20/2016 (London, UK) – Kafka, Samza and HBase at the Home Office / Kafka as a Service on IBM’s Bluemix (Unified Log London Meetup)
- 4/21/2016 (Leidschendam, NL) – Apache Kafka: A guide to integrate and build data processing pipelines (Meetup)
- 4/25/2016 (San Francisco, US) – Kafka Summit
- 5/3/2016 (Flemington, US) – Apache NiFi – Deep Dive (Meetup)
- 5/10/2016 (New York, US) – The Big Heist: Attempting to steal distributed systems from complexity and chaos (Scala Days)
- 6/1-3/2016 (London, UK) – Real-time conference sessions (Strata + Hadoop World)
- 6/5-7/2016 (Berlin, GE) – Berlin Buzzwords
- 6/13/2016 (Amsterdam, NL) – GOTO Night: Stream Processing with Apache Flink and Mining Github (Meetup)
- 6/13/2016 (New York, US) – Apache Beam (Stream Processing @ Scale Track at QCon New York)
Please let me know if this roundup is of interest. Please tweet your projects, blog posts, and meetups to @gschmutz to get them listed in next week’s edition!