أوراق بحثية, رسائل ماجستير ودكتوراه منشورة من قبل Hazel Webb

419 - Hazel Webb , Daniel Lemire , Owen Kaser 2010

In OLAP, analysts often select an interesting sample of the data. For example, an analyst might focus on products bringing revenues of at least 100 000 dollars, or on shops having sales greater than 400 000 dollars. However, current systems do not al low the application of both of these thresholds simultaneously, selecting products and shops satisfying both thresholds. For such purposes, we introduce the diamond cube operator, filling a gap among existing data warehouse operations. Because of the interaction between dimensions the computation of diamond cubes is challenging. We compare and test various algorithms on large data sets of more than 100 million facts. We find that while it is possible to implement diamonds in SQL, it is inefficient. Indeed, our custom implementation can be a hundred times faster than popular database engines (including a row-store and a column-store).

قواعد البيانات

Pruning Attribute Values From Data Cubes with Diamond Dicing

98 - Hazel Webb , Owen Kaser , Daniel Lemire 2008

Data stored in a data warehouse are inherently multidimensional, but most data-pruning techniques (such as iceberg and top-k queries) are unidimensional. However, analysts need to issue multidimensional queries. For example, an analyst may need to se lect not just the most profitable stores or--separately--the most profitable products, but simultaneous sets of stores and products fulfilling some profitability constraints. To fill this need, we propose a new operator, the diamond dice. Because of the interaction between dimensions, the computation of diamonds is challenging. We present the first diamond-dicing experiments on large data sets. Experiments show that we can compute diamond cubes over fact tables containing 100 million facts in less than 35 minutes using a standard PC.

قواعد البيانات بنى وهياكل البيانات والخوارزميات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد