No Arabic abstract
Many context-sensitive data flow analyses can be formulated as a variant of the all-pairs Dyck-CFL reachability problem, which, in general, is of sub-cubic time complexity and quadratic space complexity. Such high complexity significantly limits the scalability of context-sensitive data flow analysis and is not affordable for analyzing large-scale software. This paper presents textsc{Flare}, a reduction from the CFL reachability problem to the conventional graph reachability problem for context-sensitive data flow analysis. This reduction allows us to benefit from recent advances in reachability indexing schemes, which often consume almost linear space for answering reachability queries in almost constant time. We have applied our reduction to a context-sensitive alias analysis and a context-sensitive information-flow analysis for C/C++ programs. Experimental results on standard benchmarks and open-source software demonstrate that we can achieve orders of magnitude speedup at the cost of only moderate space to store the indexes. The implementation of our approach is publicly available.
Due to time constraints, course instructors often need to selectively participate in student discussion threads, due to their limited bandwidth and lopsided student--instructor ratio on online forums. We propose the first deep learning models for this binary prediction problem. We propose novel attention based models to infer the amount of latent context necessary to predict instructor intervention. Such models also allow themselves to be tuned to instructors preference to intervene early or late. Our three proposed attentive model variants to infer the latent context improve over the state-of-the-art by a significant, large margin of 11% in F1 and 10% in recall, on average. Further, introspection of attention help us better understand what aspects of a discussion post propagate through the discussion thread that prompts instructor intervention.
As a key component in a dialogue system, dialogue state tracking plays an important role. It is very important for dialogue state tracking to deal with the problem of unknown slot values. As far as we known, almost all existing approaches depend on pointer network to solve the unknown slot value problem. These pointer network-based methods usually have a hidden assumption that there is at most one out-of-vocabulary word in an unknown slot value because of the character of a pointer network. However, often, there are multiple out-of-vocabulary words in an unknown slot value, and it makes the existing methods perform bad. To tackle the problem, in this paper, we propose a novel Context-Sensitive Generation network (CSG) which can facilitate the representation of out-of-vocabulary words when generating the unknown slot value. Extensive experiments show that our proposed method performs better than the state-of-the-art baselines.
In the sensitive distance oracle problem, there are three phases. We first preprocess a given directed graph $G$ with $n$ nodes and integer weights from $[-W,W]$. Second, given a single batch of $f$ edge insertions and deletions, we update the data structure. Third, given a query pair of nodes $(u,v)$, return the distance from $u$ to $v$. In the easier problem called sensitive reachability oracle problem, we only ask if there exists a directed path from $u$ to $v$. Our first result is a sensitive distance oracle with $tilde{O}(Wn^{omega+(3-omega)mu})$ preprocessing time, $tilde{O}(Wn^{2-mu}f^{2}+Wnf^{omega})$ update time, and $tilde{O}(Wn^{2-mu}f+Wnf^{2})$ query time where the parameter $muin[0,1]$ can be chosen. The data-structure requires $O(Wn^{2+mu} log n)$ bits of memory. This is the first algorithm that can handle $fgelog n$ updates. Previous results (e.g. [Demetrescu et al. SICOMP08; Bernstein and Karger SODA08 and FOCS09; Duan and Pettie SODA09; Grandoni and Williams FOCS12]) can handle at most 2 updates. When $3le flelog n$, the only non-trivial algorithm was by [Weimann and Yuster FOCS10]. When $W=tilde{O}(1)$, our algorithm simultaneously improves their preprocessing time, update time, and query time. In particular, when $f=omega(1)$, their update and query time is $Omega(n^{2-o(1)})$, while our update and query time are truly subquadratic in $n$, i.e., ours is faster by a polynomial factor of $n$. To highlight the technique, ours is the first graph algorithm that exploits the kernel basis decomposition of polynomial matrices by [Jeannerod and Villard J.Comp05; Zhou, Labahn and Storjohann J.Comp15] developed in the symbolic computation community. [...]
Context-sensitive global analysis of large code bases can be expensive, which can make its use impractical during software development. However, there are many situations in which modifications are small and isolated within a few components, and it is desirable to reuse as much as possible previous analysis results. This has been achieved to date through incremental global analysis fixpoint algorithms that achieve cost reductions at fine levels of granularity, such as changes in program lines. However, these fine-grained techniques are not directly applicable to modular programs, nor are they designed to take advantage of modular structures. This paper describes, implements, and evaluates an algorithm that performs efficient context-sensitive analysis incrementally on modular partitions of programs. The experimental results show that the proposed modular algorithm shows significant improvements, in both time and memory consumption, when compared to existing non-modular, fine-grain incremental analysis techniques. Furthermore, thanks to the proposed inter-modular propagation of analysis information, our algorithm also outperforms traditional modular analysis even when analyzing from scratch.
Context: Concurrent objects with asynchronous messaging are an increasingly popular way to structure highly available, high performance, large-scale software systems. To ensure data-consistency and support synchronization between objects such systems often use distributed transactions with Two-Phase Locking (2PL) for concurrency control and Two-Phase commit (2PC) as atomic commitment protocol. Inquiry In highly available, high-throughput systems, such as large banking infrastructure, however, 2PL becomes a bottleneck when objects are highly contended, when an object is queuing a lot of messages because of locking. Approach: In this paper we introduce Path-Sensitive Atomic Commit (PSAC) to address this situation. We start from message handlers (or methods), which are decorated with pre- and post-conditions, describing their guards and effect. Knowledge: This allows the PSAC lock mechanism to check whether the effect of two incoming messages at the same time are independent, and to avoid locking if this is the case. As a result, more messages are directly accepted or rejected, and higher overall throughput is obtained. Grounding: We have implemented PSAC for a state machine-based DSL called Rebel, on top of a runtime based on the Akka actor framework. Our performance evaluation shows that PSAC exhibits the same scalability and latency characteristics as standard 2PL/2PC, and obtains up to 1.8 times median higher throughput in congested scenarios. Importance: We believe PSAC is a step towards enabling organizations to build scalable distributed applications, even if their consistency requirements are not embarrassingly parallel.