ﻻ يوجد ملخص باللغة العربية
Given a stream $S = (s_1, s_2, ..., s_N)$, a $phi$-heavy hitter is an item $s_i$ that occurs at least $phi N$ times in $S$. The problem of finding heavy-hitters has been extensively studied in the database literature. In this paper, we study a related problem. We say that there is a $phi$-event at time $t$ if $s_t$ occurs exactly $phi N$ times in $(s_1, s_2, ..., s_t)$. Thus, for each $phi$-heavy hitter there is a single $phi$-event which occurs when its count reaches the reporting threshold $phi N$. We define the online event-detection problem (OEDP) as: given $phi$ and a stream $S$, report all $phi$-events as soon as they occur. Many real-world monitoring systems demand event detection where all events must be reported (no false negatives), in a timely manner, with no non-events reported (no false positives), and a low reporting threshold. As a result, the OEDP requires a large amount of space (Omega(N) words) and is not solvable in the streaming model or via standard sampling-based approaches. Since OEDP requires large space, we focus on cache-efficient algorithms in the external-memory model. We provide algorithms for the OEDP that are within a log factor of optimal. Our algorithms are tunable: its parameters can be set to allow for a bounded false-positives and a bounded delay in reporting. None of our relaxations allow false negatives since reporting all events is a strict requirement of our applications. Finally, we show improved results when the count of items in the input stream follows a power-law distribution.
We consider the online Min-Sum Set Cover (MSSC), a natural and intriguing generalization of the classical list update problem. In Online MSSC, the algorithm maintains a permutation on $n$ elements based on subsets $S_1, S_2, ldots$ arriving online. T
We study emph{parallel} online algorithms: For some fixed integer $k$, a collective of $k$ parallel processes that perform online decisions on the same sequence of events forms a $k$-emph{copy algorithm}. For any given time and input sequence, th
We study the online maximum coverage problem on a line, in which, given an online sequence of sub-intervals (which may intersect among each other) of a target large interval and an integer $k$, we aim to select at most $k$ of the sub-intervals such t
We consider the file maintenance problem (also called the online labeling problem) in which n integer items from the set {1,...,r} are to be stored in an array of size m >= n. The items are presented sequentially in an arbitrary order, and must be st
In this paper, we strengthen the competitive analysis results obtained for a fundamental online streaming problem, the Frequent Items Problem. Additionally, we contribute with a more detailed analysis of this problem, using alternative performance me