IEEE Metro Area Smart Tech Workshop on Distributed Data Streaming, Dec 5, 2014, Paris

Topic: Professionally oriented workshop on Distributed Data Streaming

This one-day intensive discovery workshop on Distributed Data Streaming will be held on Dec 5, 2014 at ISEP in conference room L012 (ground floor), 10 rue de Vanves, 92130 Issy Les Moulineaux.

This workshop offers an introduction to distributed data streams, followed by two use cases and two parallel lab sessions based on the ESPER and Storm tools. Attendees will learn about specific techniques and technologies for processing data streams, including in-memory computing and data stream processors.

Introduction to Distributed Data Streams

Speaker
Raja Chiky

Abstract
A growing number of applications in areas such as utilities, the retail industry and sensor networks deal with a challenging type of data: data elements that arrive in multiple, continuous, rapid and time-varying data streams. A key requirement of such applications is to continuously monitor and react to interesting phenomena occurring in the input streams. Processing these data streams requires specific techniques and technologies, which we will introduce in this presentation.

Biography
Raja Chiky is currently Associate Professor at ISEP, where she is head of the RDI team (Research and Development in Information Technology) and responsible for the database and data mining courses. She holds a Ph.D. in Computer Science from Telecom ParisTech, obtained after a Master's degree in data mining and an engineering degree in computer science. Before joining ISEP, she taught statistics, databases and programming languages at the University of Paris Dauphine, University Paris 12 and Telecom ParisTech. She worked closely with EDF R&D on research projects related to data stream mining. Her research interests include statistics, data mining, data stream management, recommender systems, and cloud computing.

Download
Slides

Supervised classification on data stream

Speaker
Vincent Lemaire

Abstract
Data stream processing is a recent research domain that is complementary to Big Data. Its algorithms analyze data on the fly and can be described as designed to handle "Fast Data". This talk provides an overview of data stream processing approaches for supervised classification problems. The outline will be: classification using Big Data versus classification on stream mining; different forms of learning; streams: what changes?; requirements for a good algorithm; taxonomy of classifiers for data streams; leading classifiers; concept drift; evaluation; the two streams? a labeled stream?
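As an illustration of what analyzing data "on the fly" can mean for supervised classification, here is a minimal sketch of test-then-train (prequential) evaluation, a common way to score a classifier on a stream. It is not taken from the talk: the toy classifier, the names RunningMeanClassifier and prequential_accuracy, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class RunningMeanClassifier:
    """Toy incremental classifier (hypothetical): predicts the class
    whose running mean is closest to the incoming value."""

    def __init__(self):
        self.count = defaultdict(int)
        self.mean = defaultdict(float)

    def predict(self, x):
        if not self.count:
            return None  # nothing learned yet
        return min(self.count, key=lambda label: abs(x - self.mean[label]))

    def learn(self, x, label):
        self.count[label] += 1
        # Incremental mean update: no past example is stored.
        self.mean[label] += (x - self.mean[label]) / self.count[label]


def prequential_accuracy(stream, model):
    """Test-then-train: predict each example first, then learn from its label."""
    correct = 0
    seen = 0
    for x, label in stream:
        if model.predict(x) == label:
            correct += 1
        model.learn(x, label)
        seen += 1
    return correct / seen if seen else 0.0


# Made-up labeled stream of (feature, label) pairs.
labeled_stream = [(1.0, "low"), (9.0, "high"), (1.5, "low"), (8.0, "high"), (2.0, "low")]
accuracy = prequential_accuracy(labeled_stream, RunningMeanClassifier())
print(f"prequential accuracy: {accuracy:.2f}")
```

Each example is used once, first for testing and then for training, so the model never needs the full data set in memory.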

Biography
Vincent Lemaire obtained his undergraduate degree in signal processing from the University of Paris 12 and, during the same period, taught electronics. He obtained a PhD in Computer Science from the University of Paris 6 in 1999. He thereafter joined the R&D Division of France Télécom, where he became a senior expert in data mining. His research interests are the application of machine learning in various areas for telecommunication companies, currently with a main application in data mining for business intelligence. He developed exploratory data analysis and classification interpretation tools. Incremental learning and clustering are now his main research interests. He obtained his Research Accreditation (HDR) in Computer Science from the University of Paris-Sud 11 (Orsay) in 2008.

Download
Slides

Use case 1: In-Memory Analytics is the financial industry's secret weapon to turn data streams into actions

Speaker
Antoine Chambille

Abstract
Finance is an industry that changes at a fast pace. It is one of the industries most dependent on information systems: traders buy, sell and hold risky financial products whose value depends on ever-changing interest rates, FX rates, stock prices... In short, they make operational decisions based on live data. They need a technology that runs large and complex queries like a data warehouse but also operates fast and incrementally like a data stream management system.
In this presentation, we will examine the "hot" use cases in Capital Markets and disclose their "secret sauce": a new take on In-Memory Computing that combines online analysis with event processing.

Biography
Antoine Chambille is Head of Research and Development at Quartet FS. He joined Quartet FS soon after its creation back in 2005 and has been leading the team in charge of designing, developing and supporting Quartet FS’s in-memory analytics solutions. As one of the first employees, Antoine was heavily involved in the design of ActivePivot Server, Quartet FS’ in-memory OLAP engine.

Before joining Quartet FS, Antoine worked for several years at a consulting firm specializing in the financial sector. From his years in consulting, he developed a strong customer orientation, and he keeps a close eye on customers’ use cases. Antoine graduated from Ecole Polytechnique and Telecom Paris.

Download
Doc
Slides

Use case 2: Streaming energy data analytics

Speaker
Marie-Luce Picard

Abstract
For a utility such as EDF, real-time data analytics can address smart-grid issues (demand response, dynamic pricing, real-time demand and production forecasting). It can also support customer relationship management by analyzing tweets for sentiment. CEP (Complex Event Processing) tools emerged about 15 years ago; they make it possible to query streaming data, join data in motion with data at rest, or run online machine-learning algorithms. The talk will present different experiments and POCs run at EDF R&D that show the added value of using CEP tools to analyze streaming data.

Biography
Marie-Luce Picard is a project manager and BI expert at EDF R&D. She has managed different R&D projects dealing with business intelligence and information systems (advanced documentation systems, data mining for customer insight teams, etc.). She has also managed the EDF R&D team working on BI and data analytics. She is currently in charge of the EDF R&D Big Data project, which prepares EDF's information systems for the data deluge that is expected within a few years and will affect all of the company's businesses.

Parallel labs sessions based on ESPER and Storm tools

Title
Real-Time Data processing with open source DSMS

Speakers
Raja Chiky and Sylvain Lefebvre

Abstract
Many solutions have been developed recently to process data streams that arrive from one or more locations. The concept of a Data Stream Management System (DSMS) has been introduced, and several systems (including commercial ones) have been developed. These systems process data streams on the fly and issue alarms or compute aggregates in real time without storing all data on disk. The goal of the labs is to let participants practice the concepts learned in the first half of the day through hands-on work with open source DSMSs such as ESPER and Storm.
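To make the idea of computing aggregates and issuing alarms without storing the whole stream more concrete, here is a minimal sketch in plain Python. It does not use ESPER or Storm and is not the lab material; the function name sliding_mean_alarms, the window size, the threshold and the readings are hypothetical values chosen for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def sliding_mean_alarms(
    stream: Iterable[float], window_size: int, threshold: float
) -> Iterator[Tuple[int, float, bool]]:
    """Yield (index, window mean, alarm flag) for each incoming reading.

    Only the last `window_size` readings are kept in memory.
    """
    window = deque()
    total = 0.0
    for index, value in enumerate(stream):
        window.append(value)
        total += value
        if len(window) > window_size:
            total -= window.popleft()  # evict the oldest reading
        mean = total / len(window)
        yield index, mean, mean > threshold


# Made-up sensor readings; the alarm fires when the mean of the
# last three readings exceeds 20.
readings = [10.0, 12.0, 11.0, 30.0, 35.0, 9.0]
results = list(sliding_mean_alarms(readings, window_size=3, threshold=20.0))
for index, mean, alarm in results:
    print(index, round(mean, 2), "ALARM" if alarm else "ok")
```

A DSMS expresses the same kind of windowed query declaratively or as a processing topology; the sketch only shows the underlying principle of bounded memory and per-event output.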

Biography
Sylvain Lefebvre has been an associate professor in the ISEP RDI team since January 2014. He obtained his PhD from CNAM, Paris, in December 2013 after three years working at ISEP. He also holds a master's degree in distributed systems engineering from University Paris 12 (UPEC). He currently teaches web and Java programming and has created a big data class for ISEP students. His research topics are mainly scalability and locality-aware load balancing in the cloud, in particular for big data storage and processing systems.
