ﻻ يوجد ملخص باللغة العربية
Processing and analyzing time series data-sets have become a central issue in many domains requiring data management systems to support time series as a native data type. A crucial prerequisite of these systems is time series matching, which still is a challenging problem. A time series is a high-dimensional data type, its representation is storage-, and its comparison is time-consuming. Among the representation techniques that tackle these challenges, the symbolic aggregate approximation (SAX) is the current state of the art. This technique reduces a time series to a low-dimensional space by segmenting it and discretizing each segment into a small symbolic alphabet. However, SAX ignores the deterministic behavior of time series such as cyclical repeating patterns or trend component affecting all segments and leading to a distortion of the symbolic distribution. In this paper, we present a season- and a trend-aware symbolic approximation. We show that this improves the symbolic distribution and increase the representation accuracy without increasing its memory footprint. Most importantly, this enables a more efficient time series matching by providing a match up to three orders of magnitude faster than SAX.
Parallel aggregation is a ubiquitous operation in data analytics that is expressed as GROUP BY in SQL, reduce in Hadoop, or segment in TensorFlow. Parallel aggregation starts with an optional local pre-aggregation step and then repartitions the inter
Bitvector filtering is an important query processing technique that can significantly reduce the cost of execution, especially for complex decision support queries with multiple joins. Despite its wide application, however, its implication to query o
We study a conservative extension of classical propositional logic distinguishing between four modes of statement: a proposition may be affirmed or denied, and it may be strong or classical. Proofs of strong propositions must be constructive in some
Why and why-not provenance have been studied extensively in recent years. However, why-not provenance, and to a lesser degree why provenance, can be very large resulting in severe scalability and usability challenges. In this paper, we introduce a no
Using the growing volumes of vehicle trajectory data, it becomes increasingly possible to capture time-varying and uncertain travel costs in a road network, including travel time and fuel consumption. The current paradigm represents a road network as