ﻻ يوجد ملخص باللغة العربية
A sketch is a probabilistic data structure used to record frequencies of items in a multi-set. Sketches are widely used in various fields, especially those that involve processing and storing data streams. In streaming applications with high data rates, a sketch fills up very quickly. Thus, its contents are periodically transferred to the remote collector, which is responsible for answering queries. In this paper, we propose a new sketch, called Slim-Fat (SF) sketch, which has a significantly higher accuracy compared to prior art, a much smaller memory footprint, and at the same time achieves the same speed as the best prior sketch. The key idea behind our proposed SF-sketch is to maintain two separate sketches: a small sketch called Slim-subsketch and a large sketch called Fat-subsketch. The Slim-subsketch is periodically transferred to the remote collector for answering queries quickly and accurately. The Fat-subsketch, however, is not transferred to the remote collector because it is used only to assist the Slim-subsketch during the insertions and deletions and is not used to answer queries. We implemented and extensively evaluated SF-sketch along with several prior sketches and compared them side by side. Our experimental results show that SF-sketch outperforms the most widely used CM-sketch by up to 33.1 times in terms of accuracy. We have released the source codes of our proposed sketch as well as existing sketches at Github. The short version of this paper will appear in ICDE 2017.
Learning data storytelling involves a complex web of skills. Professional and academic educational offerings typically focus on the computational literacies required, but professionals in the field employ many non-technical methods; sketching by hand
Estimating set similarity and detecting highly similar sets are fundamental problems in areas such as databases, machine learning, and information retrieval. MinHash is a well-known technique for approximating Jaccard similarity of sets and has been
Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges and subgraphs in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? For example, in intrusion detection, exist
We propose an interactive GAN-based sketch-to-image translation method that helps novice users create images of simple objects. As the user starts to draw a sketch of a desired object type, the network interactively recommends plausible completions,
Sketching or doodling is a popular creative activity that people engage in. However, most existing work in automatic sketch understanding or generation has focused on sketches that are quite mundane. In this work, we introduce two datasets of creativ