No Arabic abstract
A major algorithmic challenge in designing applications intended for secure remote execution is ensuring that they are oblivious to their inputs, in the sense that their memory access patterns do not leak sensitive information to the server. This problem is particularly relevant to cloud databases that wish to allow queries over the clients encrypted data. One of the major obstacles to such a goal is the join operator, which is non-trivial to implement obliviously without resorting to generic but inefficient solutions like Oblivious RAM (ORAM). We present an oblivious algorithm for equi-joins which (up to a logarithmic factor) matches the optimal $O(nlog n)$ complexity of the standard non-secure sort-merge join (on inputs producing $O(n)$ outputs). We do not use use expensive primitives like ORAM or rely on unrealistic hardware or security assumptions. Our approach, which is based on sorting networks and novel provably-oblivious constructions, is conceptually simple, easily verifiable, and very efficient in practice. Its data-independent algorithmic structure makes it secure in various different settings for remote computation, even in those that are known to be vulnerable to certain side-channel attacks (such as Intel SGX) or with strict requirements for low circuit complexity (like secure multiparty computation). We confirm that our approach is easily realizable through a compact implementation which matches our expectations for performance and is shown, both formally and empirically, to possess the desired security characteristics.
Local sensitivity of a query Q given a database instance D, i.e. how much the output Q(D) changes when a tuple is added to D or deleted from D, has many applications including query analysis, outlier detection, and in differential privacy. However, it is NP-hard to find local sensitivity of a conjunctive query in terms of the size of the query, even for the class of acyclic queries. Although the complexity is polynomial when the query size is fixed, the naive algorithms are not efficient for large databases and queries involving multiple joins. In this paper, we present a novel approach to compute local sensitivity of counting queries involving join operations by tracking and summarizing tuple sensitivities -- the maximum change a tuple can cause in the query result when it is added or removed. We give algorithms for the sensitivity problem for full acyclic join queries using join trees, that run in polynomial time in both the size of the database and query for an interesting sub-class of queries, which we call doubly acyclic queries that include path queries, and in polynomial time in combined complexity when the maximum degree in the join tree is bounded. Our algorithms can be extended to certain non-acyclic queries using generalized hypertree decompositions. We evaluate our approach experimentally, and show applications of our algorithms to obtain better results for differential privacy by orders of magnitude.
We study the problem of computing similarity joins under edit distance on a set of strings. Edit similarity joins is a fundamental problem in databases, data mining and bioinformatics. It finds important applications in data cleaning and integration, collaborative filtering, genome sequence assembly, etc. This problem has attracted significant attention in the past two decades. However, all previous algorithms either cannot scale well to long strings and large similarity thresholds, or suffer from imperfect accuracy. In this paper we propose a new algorithm for edit similarity joins using a novel string partition based approach. We show mathematically that with high probability our algorithm achieves a perfect accuracy, and runs in linear time plus a data-dependent verification step. Experiments on real world datasets show that our algorithm significantly outperforms the state-of-the-art algorithms for edit similarity joins, and achieves perfect accuracy on all the datasets that we have tested.
Databases in the past have helped businesses maintain and extract insights from their data. Today, it is common for a business to involve multiple independent, distrustful parties. This trend towards decentralization introduces a new and important requirement to databases: the integrity of the data, the history, and the execution must be protected. In other words, there is a need for a new class of database systems whose integrity can be verified (or verifiable databases). In this paper, we identify the requirements and the design challenges of verifiable databases.We observe that the main challenges come from the need to balance data immutability, tamper evidence, and performance. We first consider approaches that extend existing OLTP and OLAP systems with support for verification. We next examine a clean-slate approach, by describing a new system, Spitz, specifically designed for efficiently supporting immutable and tamper-evident transaction management. We conduct a preliminary performance study of both approaches against a baseline system, and provide insights on their performance.
In the current world of economic crises, the cost control is one of the chief concerns for all types of industries, especially for the small venders. The small vendors are suppose to minimize their budget on Information Technology by reducing the initial investment in hardware and costly database servers like ORACLE, SQL Server, SYBASE, etc. for the purpose of data processing and storing. In other divisions, the electronic devices manufacturing companies want to increase the demand and reduce the manufacturing cost by introducing the low cost technologies. The new small devices like ipods, iphones, palm top etc. are now-a-days used as data computation and storing tools. For both the cases mentioned above, instead of going for the costly database servers which additionally requires extra hardware as well as the extra expenses in training and handling, the flat file may be considered as a candidate due to its easy handling nature, fast accessing, and of course free of cost. But the main hurdle is the security aspects which are not up to the optimum level. In this paper, we propose a methodology that combines all the merit of the flat file and with the help of a novel steganographic technique we can maintain the utmost security fence. The new proposed methodology will undoubtedly be highly beneficial for small vendors as well as for the above said electronic devices manufacturer
Digital multimedia watermarking technology was suggested in the last decade to embed copyright information in digital objects such images, audio and video. However, the increasing use of relational database systems in many real-life applications created an ever increasing need for watermarking database systems. As a result, watermarking relational database systems is now merging as a research area that deals with the legal issue of copyright protection of database systems. Approach: In this study, we proposed an efficient database watermarking algorithm based on inserting binary image watermarks in non-numeric mutli-word attributes of selected database tuples. Results: The algorithm is robust as it resists attempts to remove or degrade the embedded watermark and it is blind as it does not require the original database in order to extract the embedded watermark. Conclusion: Experimental results demonstrated blindness and the robustness of the algorithm against common database attacks.