ﻻ يوجد ملخص باللغة العربية
Data collaboration activities typically require systematic or protocol-based coordination to be scalable. Git, an effective enabler for collaborative coding, has been attested for its success in countless projects around the world. Hence, applying the Git philosophy to general data collaboration beyond coding is motivating. We call it Git for data. However, the original Git design handles data at the file granule, which is considered too coarse-grained for many database applications. We argue that Git for data should be co-designed with database systems. To this end, we developed ForkBase to make Git for data practical. ForkBase is a distributed, immutable storage system designed for data version management and data collaborative operation. In this demonstration, we show how ForkBase can greatly facilitate collaborative data management and how its novel data deduplication technique can improve storage efficiency for archiving massive da
Existing data storage systems offer a wide range of functionalities to accommodate an equally diverse range of applications. However, new classes of applications have emerged, e.g., blockchain and collaborative analytics, featuring data versioning, f
In emerging applications such as blockchains and collaborative data analytics, there are strong demands for data immutability, multi-version accesses, and tamper-evident controls. This leads to three new index structures for immutable data, namely Me
Modern video data management systems store videos as a single encoded file, which significantly limits possible storage level optimizations. We design, implement, and evaluate TASM, a new tile-based storage manager for video data. TASM uses a feature
We present a new video storage system (VSS) designed to decouple high-level video operations from the low-level details required to store and efficiently retrieve video data. VSS is designed to be the storage subsystem of a video data management syst
Two-phase commit (2PC) has been widely used in distributed databases to ensure atomicity for distributed transactions. However, 2PC suffers from two limitations. First, 2PC incurs long latency as it requires two logging operations on the critical pat