Sliding window order statistics in sublinear space

309 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Dhruv Rohatgi

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Dhruv Rohatgi

بنى وهياكل البيانات والخوارزميات

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We extend the multi-pass streaming model to sliding window problems, and address the problem of computing order statistics on fixed-size sliding windows, in the multi-pass streaming model as well as the closely related communication complexity model. In the $2$-pass streaming model, we show that on input of length $N$ with values in range $[0,R]$ and a window of length $K$, sliding window minimums can be computed in $widetilde{O}(sqrt{N})$. We show that this is nearly optimal (for any constant number of passes) when $R geq K$, but can be improved when $R = o(K)$ to $widetilde{O}(sqrt{NR/K})$. Furthermore, we show that there is an $(l+1)$-pass streaming algorithm which computes $l^text{th}$-smallest elements in $widetilde{O}(l^{3/2} sqrt{N})$ space. In the communication complexity model, we describe a simple $widetilde{O}(pN^{1/p})$ algorithm to compute minimums in $p$ rounds of communication for odd $p$, and a more involved algorithm which computes the $l^text{th}$-smallest elements in $widetilde{O}(pl^2 N^{1/(p-2l-1)})$ space. Finally, we prove that the majority statistic on boolean streams cannot be computed in sublinear space, implying that $l^text{th}$-smallest elements cannot be computed in space both sublinear in $N$ and independent of $l$.

قيم البحث

76 - Zhengyuan Liu , Nancy F. Chen 2021

Recently abstractive spoken language summarization raises emerging research interest, and neural sequence-to-sequence approaches have brought significant performance improvement. However, summarizing long meeting transcripts remains challenging. Due to the large length of source contents and targeted summaries, neural models are prone to be distracted on the context, and produce summaries with degraded quality. Moreover, pre-trained language models with input length limitations cannot be readily applied to long sequences. In this work, we first analyze the linguistic characteristics of meeting transcripts on a representative corpus, and find that the sentences comprising the summary correlate with the meeting agenda. Based on this observation, we propose a dynamic sliding window strategy for meeting summarization. Experimental results show that performance benefit from the proposed method, and outputs obtain higher factual consistency than the base model.

الحساب واللغة

Sublinear Estimation of Weighted Matchings in Dynamic Data Streams

353 - Marc Bury , Chris Schwiegelshohn 2015

This paper presents an algorithm for estimating the weight of a maximum weighted matching by augmenting any estimation routine for the size of an unweighted matching. The algorithm is implementable in any streaming model including dynamic graph strea ms. We also give the first constant estimation for the maximum matching size in a dynamic graph stream for planar graphs (or any graph with bounded arboricity) using $tilde{O}(n^{4/5})$ space which also extends to weighted matching. Using previous results by Kapralov, Khanna, and Sudan (2014) we obtain a $mathrm{polylog}(n)$ approximation for general graphs using $mathrm{polylog}(n)$ space in random order streams, respectively. In addition, we give a space lower bound of $Omega(n^{1-varepsilon})$ for any randomized algorithm estimating the size of a maximum matching up to a $1+O(varepsilon)$ factor for adversarial streams.

بنى وهياكل البيانات والخوارزميات

Anomaly Detection in Road Networks Using Sliding-Window Tensor Factorization

89 - Ming Xu , Jianping Wu , Haohan Wang 2018

Anomaly detection in road networks is vital for traffic management and emergency response. However, existing approaches do not directly address multiple anomaly types. We propose a tensor-based spatio-temporal model for detecting multiple types of an omalies in road networks. First, we represent network traffic data as a 3rd-order tensor. Next, we acquire spatial and multi-scale temporal patterns of traffic variations via a novel, computationally efficient tensor factorization algorithm: sliding window tensor factorization. Then, from the factorization results, we can identify different anomaly types by measuring deviations from different spatial and temporal patterns. Finally, we discover path-level anomalies by formulating anomalous path inference as a linear program that solves for the best matched paths of anomalous links. We evaluate the proposed methods via both synthetic experiments and case studies based on a real-world vehicle trajectory dataset, demonstrating advantages of our approach over baselines.

الفيزياء والمجتمع الشبكات الاجتماعية والمعلومات

Token Sliding on Split Graphs

167 - Remy Belmonte , Eun Jung Kim , Michael Lampis 2018

We consider the complexity of the Independent Set Reconfiguration problem under the Token Sliding rule. In this problem we are given two independent sets of a graph and are asked if we can transform one to the other by repeatedly exchanging a vertex that is currently in the set with one of its neighbors, while maintaining the set independent. Our main result is to show that this problem is PSPACE-complete on split graphs (and hence also on chordal graphs), thus resolving an open problem in this area. We then go on to consider the $c$-Colorable Reconfiguration problem under the same rule, where the constraint is now to maintain the set $c$-colorable at all times. As one may expect, a simple modification of our reduction shows that this more general problem is PSPACE-complete for all fixed $cge 1$ on chordal graphs. Somewhat surprisingly, we show that the same cannot be said for split graphs: we give a polynomial time ($n^{O(c)}$) algorithm for all fixed values of $c$, except $c=1$, for which the problem is PSPACE-complete. We complement our algorithm with a lower bound showing that $c$-Colorable Reconfiguration is W[2]-hard on split graphs parameterized by $c$ and the length of the solution, as well as a tight ETH-based lower bound for both parameters.

بنى وهياكل البيانات والخوارزميات

Sublinear-Time Algorithms for Compressive Phase Retrieval

180 - Yi Li , Vasileios Nakos 2017

In the compressive phase retrieval problem, or phaseless compressed sensing, or compressed sensing from intensity only measurements, the goal is to reconstruct a sparse or approximately $k$-sparse vector $x in mathbb{R}^n$ given access to $y= |Phi x| $, where $|v|$ denotes the vector obtained from taking the absolute value of $vinmathbb{R}^n$ coordinate-wise. In this paper we present sublinear-time algorithms for different variants of the compressive phase retrieval problem which are akin to the variants considered for the classical compressive sensing problem in theoretical computer science. Our algorithms use pure combinatorial techniques and near-optimal number of measurements.

بنى وهياكل البيانات والخوارزميات نظرية المعلومات نظرية المعلومات

سجل دخول لتتمكن من نشر تعليقات