Accelerated Data-Flow Analysis

479 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Jerome Leroux

تاريخ النشر 2008

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Jer^ome Leroux

بنى وهياكل البيانات والخوارزميات

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Acceleration in symbolic verification consists in computing the exact effect of some control-flow loops in order to speed up the iterative fix-point computation of reachable states. Even if no termination guarantee is provided in theory, successful results were obtained in practice by different tools implementing this framework. In this paper, the acceleration framework is extended to data-flow analysis. Compared to a classical widening/narrowing-based abstract interpretation, the loss of precision is controlled here by the choice of the abstract domain and does not depend on the way the abstract value is computed. Our approach is geared towards precision, but we dont loose efficiency on the way. Indeed, we provide a cubic-time acceleration-based algorithm for solving interval constraints with full multiplication.

قيم البحث

95 - Justin Y. Chen , Gregory Valiant , Paul Valiant 2019

We introduce a framework for statistical estimation that leverages knowledge of how samples are collected but makes no distributional assumptions on the data values. Specifically, we consider a population of elements $[n]={1,ldots,n}$ with correspond ing data values $x_1,ldots,x_n$. We observe the values for a sample set $A subset [n]$ and wish to estimate some statistic of the values for a target set $B subset [n]$ where $B$ could be the entire set. Crucially, we assume that the sets $A$ and $B$ are drawn according to some known distribution $P$ over pairs of subsets of $[n]$. A given estimation algorithm is evaluated based on its worst-case, expected error where the expectation is with respect to the distribution $P$ from which the sample $A$ and target sets $B$ are drawn, and the worst-case is with respect to the data values $x_1,ldots,x_n$. Within this framework, we give an efficient algorithm for estimating the target mean that returns a weighted combination of the sample values--where the weights are functions of the distribution $P$ and the sample and target sets $A$, $B$--and show that the worst-case expected error achieved by this algorithm is at most a multiplicative $pi/2$ factor worse than the optimal of such algorithms. The algorithm and proof leverage a surprising connection to the Grothendieck problem. This framework, which makes no distributional assumptions on the data values but rather relies on knowledge of the data collection process, is a significant departure from typical estimation and introduces a uniform algorithmic analysis for the many natural settings where membership in a sample may be correlated with data values, such as when sampling probabilities vary as in importance sampling, when individuals are recruited into a sample via a social network as in snowball sampling, or when samples have chronological structure as in selective prediction.

بنى وهياكل البيانات والخوارزميات التعلم الآلي التعلم الالي

Semi-parametric Models for Accelerated Destructive Degradation Test Data Analysis

302 - Yimeng Xie , Caleb B. King , Yili Hong 2015

Accelerated destructive degradation tests (ADDT) are widely used in industry to evaluate materials long term properties. Even though there has been tremendous statistical research in nonparametric methods, the current industrial practice is still to use application-specific parametric models to describe ADDT data. The challenge of using a nonparametric approach comes from the need to retain the physical meaning of degradation mechanisms and also perform extrapolation for predictions at the use condition. Motivated by this challenge, we propose a semi-parametric model to describe ADDT data. We use monotonic B-splines to model the degradation path, which not only provides flexible models with few assumptions, but also retains the physical meaning of degradation mechanisms (e.g., the degradation path is monotonically decreasing). Parametric models, such as the Arrhenius model, are used for modeling the relationship between the degradation and accelerating variable, allowing for extrapolation to the use conditions. We develop an efficient procedure to estimate model parameters. We also use simulation to validate the developed procedures and demonstrate the robustness of the semi-parametric model under model misspecification. Finally, the proposed method is illustrated by multiple industrial applications.

المنهجية

Information Preserving Component Analysis: Data Projections for Flow Cytometry Analysis

146 - Kevin M. Carter , Raviv Raich , William G. Finn 2008

Flow cytometry is often used to characterize the malignant cells in leukemia and lymphoma patients, traced to the level of the individual cell. Typically, flow cytometric data analysis is performed through a series of 2-dimensional projections onto t he axes of the data set. Through the years, clinicians have determined combinations of different fluorescent markers which generate relatively known expression patterns for specific subtypes of leukemia and lymphoma -- cancers of the hematopoietic system. By only viewing a series of 2-dimensional projections, the high-dimensional nature of the data is rarely exploited. In this paper we present a means of determining a low-dimensional projection which maintains the high-dimensional relationships (i.e. information) between differing oncological data sets. By using machine learning techniques, we allow clinicians to visualize data in a low dimension defined by a linear combination of all of the available markers, rather than just 2 at a time. This provides an aid in diagnosing similar forms of cancer, as well as a means for variable selection in exploratory flow cytometric research. We refer to our method as Information Preserving Component Analysis (IPCA).

التعلم الالي

Improved Balanced Flow Computation Using Parametric Flow

104 - Omar Darwish , Kurt Mehlhorn 2015

We present a new algorithm for computing balanced flows in equality networks arising in market equilibrium computations. The current best time bound for computing balanced flows in such networks requires $O(n)$ maxflow computations, where $n$ is the number of nodes in the network [Devanur et al. 2008]. Our algorithm requires only a single parametric flow computation. The best algorithm for computing parametric flows [Gallo et al. 1989] is only by a logarithmic factor slower than the best algorithms for computing maxflows. Hence, the running time of the algorithms in [Devanur et al. 2008] and [Duan and Mehlhorn 2015] for computing market equilibria in linear Fisher and Arrow-Debreu markets improve by almost a factor of $n$.

بنى وهياكل البيانات والخوارزميات علوم الكمبيوتر ونظرية الألعاب

Data-flow Analysis of Programs with Associative Arrays

173 - David Hauzar (Department of Distributed , Dependable Systems , n Faculty of Mathematics 2014

Dynamic programming languages, such as PHP, JavaScript, and Python, provide built-in data structures including associative arrays and objects with similar semantics-object properties can be created at run-time and accessed via arbitrary expressions. While a high level of security and safety of applications written in these languages can be of a particular importance (consider a web application storing sensitive data and providing its functionality worldwide), dynamic data structures pose significant challenges for data-flow analysis making traditional static verification methods both unsound and imprecise. In this paper, we propose a sound and precise approach for value and points-to analysis of programs with associative arrays-like data structures, upon which data-flow analyses can be built. We implemented our approach in a web-application domain-in an analyzer of PHP code.

هندسة البرمجيات لغات البرمجة

سجل دخول لتتمكن من نشر تعليقات