Project "Data Streaming"



  • Christian Sohler
  • Christiane Lammersen
  • Morteza Monemizadeh
  • André Gronemeier


Research supported by Deutsche Forschungsgemeinschaft, grant So 514/1-2.


A data stream consists of a long sequence of data items whose sheer length restricts both the amount of resources available to process the data and the kind of access to the data. In general, the amount of data is too large to store in main memory; often it even exceeds the capacity of modern hard disks. As a result, the data has to be processed on the fly, and the only possible access to it is sequential reading. Typical examples of data streams are network traffic data, measurements from sensor networks, and web crawls. The task of a streaming algorithm is to read the data in one or a few passes and to maintain, at any point in time, a sketch of the data seen so far. Usually, a streaming algorithm may use only local memory that is polylogarithmic in the size of the input stream, and it may perform only a single pass or a polylogarithmic number of passes.
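As a minimal illustration of this one-pass, small-memory regime (a textbook sketch, not a result of this project), reservoir sampling (Vitter's Algorithm R) maintains a uniform random sample of k items from a stream of unknown length using only O(k) memory:

```python
import random

def reservoir_sample(stream, k):
    """One pass over the stream; keeps a uniform sample of k items in O(k) memory."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a reservoir slot with probability k / (i + 1).
            j = random.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir
```

For example, `reservoir_sample(range(10**6), 10)` returns 10 items chosen uniformly at random from the million-item stream, even though the algorithm never stores more than 10 items at once.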

The aim of the project is to develop algorithms in geometric streaming models and to analyze the relations between geometric streaming models. We subdivide our project into the following areas:
  • clustering
  • metric embeddings
  • analysis of huge graphs
  • decision tree learning
  • relations between geometric streaming models
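For the clustering area, a standard one-pass approach (shown here as a generic sketch of the doubling technique for k-center, not this project's specific algorithm) keeps at most k centers and raises a lower bound on the optimal radius whenever too many centers accumulate, merging centers that have become redundant:

```python
import math

def _merge(points, r):
    """Greedily keep only points that are pairwise more than 2r apart."""
    kept = []
    for p in points:
        if all(math.dist(p, c) > 2 * r for c in kept):
            kept.append(p)
    return kept

def streaming_k_center(stream, k):
    """One-pass k-center sketch via the doubling technique.

    Maintains at most k centers and a radius lower bound r; every point
    seen so far lies within a constant factor of r of some center.
    """
    centers, r = [], 0.0
    for p in stream:
        if any(math.dist(p, c) <= 2 * r for c in centers):
            continue  # p is already covered by an existing center
        centers.append(p)
        while len(centers) > k:
            # The closest pair of centers certifies a lower bound on the
            # optimal radius; raise r and merge centers within 2r.
            closest = min(math.dist(a, b)
                          for i, a in enumerate(centers)
                          for b in centers[i + 1:])
            r = max(2 * r, closest)
            centers = _merge(centers, r)
    return centers, r
```

The memory usage is O(k) points, independent of the stream length, which matches the small-space requirement described above.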


Morteza Monemizadeh, David P. Woodruff: 1-Pass Relative-Error Lp-Sampling with Applications. In Proceedings of SODA 2010, pp. 1143-1160, 2010.

Last modified on 28.10.2009 by A. Gronemeier