No Arabic abstract
Sharing provenance across workflow management systems automatically is not currently possible, but the value of such a capability is high since it could greatly reduce the amount of duplicated workflows, accelerate the discovery of new knowledge, and verify the integrity of past and present analyses. Although numerous technological challenges exist to efficiently share provenance information across workflow management systems, permissioned distributed ledgers could surmount many of them. The primary benefit of permissioned distributed ledgers over other technologies is that their distribution is over a peer-to-peer network that encodes transactions across the network into an immutable hash list and achieves consensus on the validity of the new data through a common consensus mechanism. This work discusses provenance and distributed ledgers on their own and then presents an argument that distributed ledgers naturally satisfy many of the requirements of workflow provenance, that provenance information can exist in the ledger in multiple ways, and that a number of novel research areas exist based on this strategy.
Advances in mobile computing have paved the way for new types of distributed applications that can be executed solely by mobile devices on device-to-device (D2D) ecosystems (e.g., crowdsensing). Sophisticated applications, like cryptocurrencies, need distributed ledgers to function. Distributed ledgers, such as blockchains and directed acyclic graphs (DAGs), employ consensus protocols to add data in the form of blocks. However, such protocols are designed for resourceful devices that are interconnected via the Internet. Moreover, existing distributed ledgers are not deployable to D2D ecosystems since their storage needs are continuously increasing. In this work, we introduce and analyse Mneme, a DAG-based distributed ledger that can be maintained solely by mobile devices. Mneme utilizes two novel consensus protocols: Proof-of-Context (PoC) and Proof-of-Equivalence (PoE). PoC employs users context to add data on Mneme. PoE is executed periodically to summarize data and produce equivalent blocks that require less storage. We analyze Mnemes security and justify the ability of PoC and PoE to guarantee the characteristics of distributed ledgers: persistence and liveness. Furthermore, we analyze potential attacks from malicious users and prove that the probability of a successful attack is inversely proportional to the square of the number of mobile users who maintain Mneme.
We review probabilistic models known as majority dynamics (also known as threshold Voter Models) and discuss their possible applications for achieving consensus in cryptocurrency systems. In particular, we show that using this approach straightforwardly for practical consensus in Byzantine setting can be problematic and requires extensive further research. We then discuss the FPC consensus protocol which circumvents the problems mentioned above by using external randomness.
In public distributed ledger technologies (DLTs), such as Blockchains, nodes can join and leave the network at any time. A major challenge occurs when a new node joining the network wants to retrieve the current state of the ledger. Indeed, that node may receive conflicting information from honest and Byzantine nodes, making it difficult to identify the current state. In this paper, we are interested in protocols that are stateless, i.e., a new joining node should be able to retrieve the current state of the ledger just using a fixed amount of data that characterizes the ledger (such as the genesis block in Bitcoin). We define three variants of stateless DLTs: weak, strong, and probabilistic. Then, we analyze this property for DLTs using different types of consensus.
Distributed Ledger Technologies provide a mechanism to achieve ordering among transactions that are scattered on multiple participants with no prerequisite trust relations. This mechanism is essentially based on the idea of new transactions referencing older ones in a chain structure. Recently, DAG-type Distributed Ledgers that are based on directed acyclic graphs (DAGs) were proposed to increase the system scalability through sacrificing the total order of transactions. In this paper, we develop a mathematical model to study the process that governs the addition of new transactions to the DAG-type Distributed Ledger. We propose a simple model for DAG-type Distributed Ledgers that are obtained from a recursive Young-age Preferential Attachment scheme, i.e. new connections are made preferably to transactions that have not been in the system for very long. We determine the asymptotic degree structure of the resulting graph and show that a forward component of linear size arises if the edge density is chosen sufficiently large in relation to the `young-age preference that tunes how quickly old transactions become unattractive.
For data-centric systems, provenance tracking is particularly important when the system is open and decentralised, such as the Web of Linked Data. In this paper, a concise but expressive calculus which models data updates is presented. The calculus is used to provide an operational semantics for a system where data and updates interact concurrently. The operational semantics of the calculus also tracks the provenance of data with respect to updates. This provides a new formal semantics extending provenance diagrams which takes into account the execution of processes in a concurrent setting. Moreover, a sound and complete model for the calculus based on ideals of series-parallel DAGs is provided. The notion of provenance introduced can be used as a subjective indicator of the quality of data in concurrent interacting systems.