ﻻ يوجد ملخص باللغة العربية
We study the problem of concealing functionality of a proprietary or private module when provenance information is shown over repeated executions of a workflow which contains both `public and `private modules. Our approach is to use `provenance views to hide carefully chosen subsets of data over all executions of the workflow to ensure G-privacy: for each private module and each input x, the modules output f(x) is indistinguishable from G -1 other possible values given the visible data in the workflow executions. We show that G-privacy cannot be achieved simply by combining solutions for individual private modules; data hiding must also be `propagated through public modules. We then examine how much additional data must be hidden and when it is safe to stop propagating data hiding. The answer depends strongly on the workflow topology as well as the behavior of public modules on the visible data. In particular, for a class of workflows (which include the common tree and chain workflows), taking private solutions for each private module, augmented with a `public closure that is `upstream-downstream safe, ensures G-privacy. We define these notions formally and show that the restrictions are necessary. We also study the related optimization problems of minimizing the amount of hidden data.
Scientific workflow systems increasingly store provenance information about the module executions used to produce a data item, as well as the parameter settings and intermediate data items passed between module executions. However, authors/owners of
As users become confronted with a deluge of provenance data, dedicated techniques are required to make sense of this kind of information. We present Aggregation by Provenance Types, a provenance graph analysis that is capable of generating provenance
The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer amount of calculations
Provenance is information recording the source, derivation, or history of some information. Provenance tracking has been studied in a variety of settings; however, although many design points have been explored, the mathematical or semantic foundatio
Security is likely becoming a critical factor in the future adoption of provenance technology, because of the risk of inadvertent disclosure of sensitive information. In this survey paper we review the state of the art in secure provenance, consideri