Streamer: a software suite for data stream analysis

Streamer is a complete software suite that enables data scientists to test ML algorithms on data streams. Streamer is open-source and multi-OS. It automates the entire data stream processing cycle and gives data scientists plenty of room to configure data streams and experiment in the most realistic conditions possible, all while minimizing setup time.

Use

Testing algorithms for data stream processing

Streamer is an automatic data stream processing suite that allows data scientists to test continuous ML algorithms in realistic conditions. It handles every step of the processing cycle, from data collection to visualization of results.

Streamer also spares data scientists the complex and time-consuming task of hand-developing the processing cycle, so that they can concentrate on perfecting their algorithms. The suite offers preparation and post-processing tools, advanced ML algorithms for streams, and evaluation tools. Its APIs enable integration of third-party tools and algorithms written in a variety of programming languages, such as Python, R, and Java, and its GUI simplifies the monitoring of ML and data stream analysis processes.
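To make the processing cycle more concrete, here is a minimal, library-free Python sketch of the stages named above: collection, preparation, learning, and evaluation. It does not use Streamer's API. Every name in it (`collect`, `prepare`, `OnlineLinearModel`, `run_cycle`) and every parameter value is a hypothetical choice made for illustration only.

```python
import random
from typing import Iterator, List, Tuple


def collect(n: int, seed: int = 0) -> Iterator[Tuple[float, float]]:
    """Simulated data collection: yields (x, y) pairs one at a time."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        yield x, 2.0 * x + rng.gauss(0.0, 0.1)


def prepare(stream: Iterator[Tuple[float, float]]) -> Iterator[Tuple[float, float]]:
    """Preparation step: rescale x from [0, 10] to [0, 1]."""
    for x, y in stream:
        yield x / 10.0, y


class OnlineLinearModel:
    """One-feature linear model updated by stochastic gradient descent."""

    def __init__(self, lr: float = 0.1) -> None:
        self.w = 0.0
        self.b = 0.0
        self.lr = lr

    def predict(self, x: float) -> float:
        return self.w * x + self.b

    def learn(self, x: float, y: float) -> None:
        err = self.predict(x) - y
        self.w -= self.lr * err * x
        self.b -= self.lr * err


def run_cycle(n: int = 5000) -> List[float]:
    """Run the whole cycle and return the absolute error of each prediction."""
    model = OnlineLinearModel()
    abs_errors = []
    for x, y in prepare(collect(n)):
        abs_errors.append(abs(model.predict(x) - y))  # evaluate before learning
        model.learn(x, y)                             # then update the model
    return abs_errors


if __name__ == "__main__":
    errors = run_cycle()
    print("mean error, first 500 samples:", sum(errors[:500]) / 500)
    print("mean error, last 500 samples: ", sum(errors[-500:]) / 500)
```

Each stage is a separate function or class, so any one of them can be swapped out without touching the others. This is the kind of modularity a stream-processing suite aims to provide out of the box.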

Streamer’s code is open source (GNU GPL 3 license), allowing data scientists to modify it and add any desired features. However, the suite is also ready to use as-is in operational contexts.

Streamer was developed through collaboration between CEA-List and DAVID (a lab at Paris-Saclay University focused on sustainable, data-driven smart city projects) as part of the StreamOps project, financed by DATAIA.

A fully-customizable, open-source suite

Main advantages:

  • Multi-OS support: Linux, Windows, and macOS.
  • Modularity, giving users the freedom to configure data streams and define the contexts that their algorithms will work in.
  • Easy setup and use.
  • Tools for visual monitoring of learning processes and incoming data stream analysis.

We use Streamer, which we co-developed with CEA-List, in our ANR project Polluscope. The project aims to develop algorithms that can describe individuals’ air pollution exposure by leveraging a stream of measurement data collected by microsensors.

Karine Zeitouni

Professor at UVSQ and head of the ADAM group at the DAVID laboratory, Université Paris-Saclay

Continuous learning

Streamer facilitates the development of AI solutions implementing continuous learning.

  • Industry: rapidly detecting malfunctions and process drift.
  • Cybersecurity: identifying suspicious requests and quickly neutralizing them.
  • Health: monitoring patients and detecting risks.

Use cases

Streamlining AI solution development for cybersecurity

Cybersecurity experts increasingly count on ML algorithms for the evaluation and contextualization of threats and alerts. As part of an in-house research program, CEA-List demonstrated the advantages that Streamer brings to the creation and use of these algorithms:

  • During the learning phase, it offers a realistic environment for continuous learning based on data streams. It enables precise evaluation of a model’s expected performance by closely reproducing the unpredictable variations in incoming data behavior that often arise during real operation.
  • During deployment, it supports the move to production with unit tests that check that the algorithm is properly integrated and that real-world performance matches what was simulated.
  • During operation, it provides methodologies and algorithms for monitoring algorithm behavior and identifying possible issues for correction.
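
The three phases above share one underlying idea: evaluating a model on a stream whose behavior changes over time. As a hedged illustration (not Streamer's implementation or API), the sketch below injects an abrupt concept drift into a synthetic stream and tracks test-then-train accuracy over a sliding window. The names `drifting_stream`, `WindowMajorityLearner`, and `prequential_accuracy`, the deliberately simple learner, and all parameter values are assumptions made for this example.

```python
import random
from collections import deque
from typing import Iterator, List, Tuple


def drifting_stream(n: int, drift_at: int, seed: int = 0) -> Iterator[Tuple[float, int]]:
    """Yield (x, label) pairs; the labeling rule flips at index drift_at."""
    rng = random.Random(seed)
    for i in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if i >= drift_at:
            label = 1 - label  # abrupt concept drift
        yield x, label


class WindowMajorityLearner:
    """Predicts the majority label recently seen on each side of x = 0.5."""

    def __init__(self, memory: int = 50) -> None:
        # One bounded memory per side, so old labels are eventually forgotten.
        self.recent = {0: deque(maxlen=memory), 1: deque(maxlen=memory)}

    def predict(self, x: float) -> int:
        seen = self.recent[int(x > 0.5)]
        return int(2 * sum(seen) > len(seen))  # defaults to 0 when empty or tied

    def learn(self, x: float, label: int) -> None:
        self.recent[int(x > 0.5)].append(label)


def prequential_accuracy(stream, learner, window: int = 200) -> List[float]:
    """Test-then-train evaluation: accuracy over the last `window` samples."""
    hits = deque(maxlen=window)
    curve = []
    for x, label in stream:
        hits.append(int(learner.predict(x) == label))  # test first
        learner.learn(x, label)                        # then train
        curve.append(sum(hits) / len(hits))
    return curve


if __name__ == "__main__":
    curve = prequential_accuracy(drifting_stream(3000, 1500), WindowMajorityLearner())
    for i in (1499, 1600, 2999):
        print(f"sample {i}: windowed accuracy = {curve[i]:.2f}")
```

In this sketch, windowed accuracy dips shortly after the drift point and recovers once the learner's bounded memory fills with post-drift labels. Watching such a curve is one simple way to monitor algorithm behavior during operation.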

Publications

STREAMER: a Powerful and Open-Source Framework for Continuous Learning in Data Streams, Sandra Garcia-Rodriguez, Mohammad Alshaer, and Cedric Gouy-Pailler. Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 2020.

Detecting Anomalies from Streaming Time Series Using Matrix Profile and Shapelets Learning, Mohammad Alshaer, Sandra Garcia-Rodriguez, and Cedric Gouy-Pailler. Proceedings of the 32nd IEEE International Conference on Tools with Artificial Intelligence. 2020.

Learn more: Streamer
