ﻻ يوجد ملخص باللغة العربية
Emerging data analysis involves the ingestion and exploration of new data sets, application of complex functions, and frequent query revisions based on observing prior query answers. We call this new type of analysis evolutionary analytics and identify its properties. This type of analysis is not well represented by current benchmark workloads. In this paper, we present a workload and identify several metrics to test system support for evolutionary analytics. Along with our metrics, we present methodologies for running the workload that capture this analytical scenario.
Persistent partitioning is effective in avoiding expensive shuffling operations. However it remains a significant challenge to automate this process for Big Data analytics workloads that extensively use user defined functions (UDFs), where sub-comput
The need for modern data analytics to combine relational, procedural, and map-reduce-style functional processing is widely recognized. State-of-the-art systems like Spark have added SQL front-ends and relational query optimization, which promise an i
In recent years, the size of big linked data has grown rapidly and this number is still rising. Big linked data and knowledge bases come from different domains such as life sciences, publications, media, social web, and so on. However, with the rapid
The growing size of data center and HPC networks pose unprecedented requirements on the scalability of simulation infrastructure. The ability to simulate such large-scale interconnects on a simple PC would facilitate research efforts. Unfortunately,
We propose a new mechanism to accurately answer a user-provided set of linear counting queries under local differential privacy (LDP). Given a set of linear counting queries (the workload) our mechanism automatically adapts to provide accuracy on the