PUBLICATIONS
Fast Identification of Heavy Hitters by Cached and Packed Group Testing
ABSTRACT
The 𝜖 -approximate 𝜙 -heavy hitters problem is, for any element from some universe 𝕌=[0..𝑛) , to maintain its frequency under an arbitrary data stream of form (𝑥𝑖,𝛥𝑖)∈𝕌×ℤ that changes the frequency of 𝑥𝑖 by 𝛥𝑖 , such that one can output every element with frequency more than 𝜙𝑁 and no element with frequency no more than (𝜙−𝜖)𝑁 for 𝑁=∑𝑖𝛥𝑖 and prespecified parameters 𝜖,𝜙∈ℝ . To solve this problem in small space, Cormode and Muthukrishnan (ACM TODS, 2005) have proposed an 𝑂(𝜌𝜖−1lg𝑛) -space probabilistic data structure with good practical performance, where 𝜌=lg(1/(𝛿𝜙)) for any failure probability 𝛿∈ℝ . In this paper, we improve its output time from 𝑂(𝜌𝜖−1(lg𝑛+𝜌)) to 𝑂(𝜌2𝜖−1) for arbitrary updates ( 𝛥𝑖∈ℤ ) and its update time from 𝑂(𝜌lg𝑛) to amortized 𝑂(𝜌) for constant updates ( 𝛥𝑖∈𝑂(1) ) with the same space and output guarantee by removing application-specific lg𝑛 terms that are not tunable, unlike other parameters 𝛿 , 𝜖 , and 𝜙 .