|Location:||MSRI: Simons Auditorium|
In the 'frequent items' problem one sees a sequence of items in a stream (e.g. a stream of words coming into a search query engine like Google) and wants to report a small list of items containing all frequent items. In the 'change detection' problem one sees two streams, say one from yesterday and one from today, and wants to report a small list of items containing all those whose frequencies changed significantly. For both of these problems, we would like algorithms that use memory substantially sublinear in the length of the stream.
We describe new state-of-the-art solutions to both problems. For the former, we make use of ideas familiar to this community such as chaining methods to control the suprema of Rademacher processes to develop an algorithm BPTree with provably near-optimal memory consumption. For the latter, ideas familiar to this community also pop up (e.g. isoperimetry and Cheeger's inequality) in the course of developing the first algorithm to simultaneously achieve asymptotically optimal space, fast query and update time, and high success probability (over the random coins flipped by the algorithm).
Based on joint works with Vladimir Braverman, Stephen Chestnut, Nikita Ivkin, Kasper Green Larsen, Huy Le Nguyen, Mikkel Thorup, Zhengyu Wang, and David P. WoodruffNo Notes/Supplements Uploaded No Video Files Uploaded