ترغب بنشر مسار تعليمي؟ اضغط هنا

A Core Calculus for Provenance

118   0   0.0 ( 0 )
 نشر من قبل James Cheney
 تاريخ النشر 2013
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Provenance is an increasing concern due to the ongoing revolution in sharing and processing scientific data on the Web and in other computer systems. It is proposed that many computer systems will need to become provenance-aware in order to provide satisfactory accountability, reproducibility, and trust for scientific or other high-value data. To date, there is not a consensus concerning appropriate formal models or security properties for provenance. In previous work, we introduced a formal framework for provenance security and proposed formal definitions of properties called disclosure and obfuscation. In this article, we study refined notions of positive and negative disclosure and obfuscation in a concrete setting, that of a general-purpose programing language. Previous models of provenance have focused on special-purpose languages such as workflows and database queries. We consider a higher-order, functional language with sums, products, and recursive types and functions, and equip it with a tracing semantics in which traces themselves can be replayed as computations. We present an annotation-propagation framework that supports many provenance views over traces, including standard forms of provenance studied previously. We investigate some relationships among provenance views and develop some partial solutions to the disclosure and obfuscation problems, including correct algorithms for disclosure and positive obfuscation based on trace slicing.



قيم البحث

اقرأ أيضاً

Provenance is information about the origin, derivation, ownership, or history of an object. It has recently been studied extensively in scientific databases and other settings due to its importance in helping scientists judge data validity, quality a nd integrity. However, most models of provenance have been stated as ad hoc definitions motivated by informal concepts such as comes from, influences, produces, or depends on. These models lack clear formalizations describing in what sense the definitions capture these intuitive concepts. This makes it difficult to compare approaches, evaluate their effectiveness, or argue about their validity. We introduce provenance traces, a general form of provenance for the nested relational calculus (NRC), a core database query language. Provenance traces can be thought of as concrete data structures representing the operational semantics derivation of a computation; they are related to the traces that have been used in self-adjusting computation, but differ in important respects. We define a tracing operational semantics for NRC queries that produces both an ordinary result and a trace of the execution. We show that three pre-existing forms of provenance for the NRC can be extracted from provenance traces. Moreover, traces satisfy two semantic guarantees: consistency, meaning that the traces describe what actually happened during execution, and fidelity, meaning that the traces explain how the expression would behave if the input were changed. These guarantees are much stronger than those contemplated for previous approaches to provenance; thus, provenance traces provide a general semantic foundation for comparing and unifying models of provenance in databases.
We introduce a new diagrammatic notation for representing the result of (algebraic) effectful computations. Our notation explicitly separates the effects produced during a computation from the possible values returned, this way simplifying the extens ion of definitions and results on pure computations to an effectful setting. Additionally, we show a number of algebraic and order-theoretic laws on diagrams, this way laying the foundations for a diagrammatic calculus of algebraic effects. We give a formal foundation for such a calculus in terms of Lawvere theories and generic effects.
We define BioScapeL, a stochastic pi-calculus in 3D-space. A novel aspect of BioScapeL is that entities have programmable locations. The programmer can specify a particular location where to place an entity, or a location relative to the current loca tion of the entity. The motivation for the extension comes from the need to describe the evolution of populations of biochemical species in space, while keeping a sufficiently high level description, so that phenomena like diffusion, collision, and confinement can remain part of the semantics of the calculus. Combined with the random diffusion movement inherited from BioScape, programmable locations allow us to capture the assemblies of configurations of polymers, oligomers, and complexes such as microtubules or actin filaments. Further new aspects of BioScapeL include random translation and scaling. Random translation is instrumental in describing the location of new entities relative to the old ones. For example, when a cell secretes a hydronium ion, the ion should be placed at a given distance from the originating cell, but in a random direction. Additionally, scaling allows us to capture at a high level events such as division and growth; for example, daughter cells after mitosis have half the size of the mother cell.
We study a dependently typed extension of a multi-stage programming language `a la MetaOCaml, which supports quasi-quotation and cross-stage persistence for manipulation of code fragments as first-class values and an evaluation construct for executio n of programs dynamically generated by this code manipulation. Dependent types are expected to bring to multi-stage programming enforcement of strong invariant -- beyond simple type safety -- on the behavior of dynamically generated code. An extension is, however, not trivial because such a type system would have to take stages of types -- roughly speaking, the number of surrounding quotations -- into account. To rigorously study properties of such an extension, we develop $lambda^{MD}$, which is an extension of Hanada and Igarashis typed calculus $lambda^{triangleright%} $ with dependent types, and prove its properties including preservation, confluence, strong normalization for full reduction, and progress for staged reduction. Motivated by code generators that generate code whose type depends on a value from outside of the quotations, we argue the significance of cross-stage persistence in dependently typed multi-stage programming and certain type equivalences that are not directly derived from reduction rules.
We extend the simply-typed guarded $lambda$-calculus with discrete probabilities and endow it with a program logic for reasoning about relational properties of guarded probabilistic computations. This provides a framework for programming and reasonin g about infinite stochastic processes like Markov chains. We demonstrate the logic sound by interpreting its judgements in the topos of trees and by using probabilistic couplings for the semantics of relational assertions over distributions on discrete types. The program logic is designed to support syntax-directed proofs in the style of relational refinement types, but retains the expressiveness of higher-order logic extended with discrete distributions, and the ability to reason relationally about expressions that have different types or syntactic structure. In addition, our proof system leverages a well-known theorem from the coupling literature to justify better proof rules for relational reasoning about probabilistic expressions. We illustrate these benefits with a broad range of examples that were beyond the scope of previous systems, including shift couplings and lump couplings between random walks.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا