ترغب بنشر مسار تعليمي؟ اضغط هنا

Top-k Spatial-keyword Publish/Subscribe Over Sliding Window

185   0   0.0 ( 0 )
 نشر من قبل Ying Zhang Dr.
 تاريخ النشر 2016
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

With the prevalence of social media and GPS-enabled devices, a massive amount of geo-textual data has been generated in a stream fashion, leading to a variety of applications such as location-based recommendation and information dissemination. In this paper, we investigate a novel real-time top-k monitoring problem over sliding window of streaming data; that is, we continuously maintain the top-k most relevant geo-textual messages (e.g., geo-tagged tweets) for a large number of spatial-keyword subscriptions (e.g., registered users interested in local events) simultaneously. To provide the most recent information under controllable memory cost, sliding window model is employed on the streaming geo-textual data. To the best of our knowledge, this is the first work to study top-k spatial-keyword publish/subscribe over sliding window. A novel centralized system, called Skype (Topk Spatial-keyword Publish/Subscribe), is proposed in this paper. In Skype, to continuously maintain top-k results for massive subscriptions, we devise a novel indexing structure upon subscriptions such that each incoming message can be immediately delivered on its arrival. To reduce the expensive top-k re-evaluation cost triggered by message expiration, we develop a novel cost-based k-skyband technique to reduce the number of re-evaluations in a cost-effective way. Extensive experiments verify the great efficiency and effectiveness of our proposed techniques. Furthermore, to support better scalability and higher throughput, we propose a distributed version of Skype, namely, DSkype, on top of Storm, which is a popular distributed stream processing system. With the help of fine-tuned subscription/message distribution mechanisms, DSkype can achieve orders of magnitude speed-up than its centralized version.



قيم البحث

اقرأ أيضاً

110 - Sayan Ranu , Ambuj K. Singh 2011
In this paper, we formulate a top-k query that compares objects in a database to a user-provided query object on a novel scoring function. The proposed scoring function combines the idea of attractive and repulsive dimensions into a general framework to overcome the weakness of traditional distance or similarity measures. We study the properties of the proposed class of scoring functions and develop efficient and scalable index structures that index the isolines of the function. We demonstrate various scenarios where the query finds application. Empirical evaluation demonstrates a performance gain of one to two orders of magnitude on querying time over existing state-of-the-art top-k techniques. Further, a qualitative analysis is performed on a real dataset to highlight the potential of the proposed query in discovering hidden data characteristics.
Top-k query processing finds a list of k results that have largest scores w.r.t the user given query, with the assumption that all the k results are independent to each other. In practice, some of the top-k results returned can be very similar to eac h other. As a result some of the top-k results returned are redundant. In the literature, diversified top-k search has been studied to return k results that take both score and diversity into consideration. Most existing solutions on diversified top-k search assume that scores of all the search results are given, and some works solve the diversity problem on a specific problem and can hardly be extended to general cases. In this paper, we study the diversified top-k search problem. We define a general diversified top-k search problem that only considers the similarity of the search results themselves. We propose a framework, such that most existing solutions for top-k query processing can be extended easily to handle diversified top-k search, by simply applying three new functions, a sufficient stop condition sufficient(), a necessary stop condition necessary(), and an algorithm for diversified top-k search on the current set of generated results, div-search-current(). We propose three new algorithms, namely, div-astar, div-dp, and div-cut to solve the div-search-current() problem. div-astar is an A* based algorithm, div-dp is an algorithm that decomposes the results into components which are searched using div-astar independently and combined using dynamic programming. div-cut further decomposes the current set of generated results using cut points and combines the results using sophisticated operations. We conducted extensive performance studies using two real datasets, enwiki and reuters. Our div-cut algorithm finds the optimal solution for diversified top-k search problem in seconds even for k as large as 2,000.
Contemporary high-performance service-oriented applications demand a performance efficient run-time monitoring. In this paper, we analyze a hierarchical publish-subscribe architecture for monitoring service-oriented applications. The analyzed archite cture is based on a tree topology and publish-subscribe communication model for aggregation of distributed monitoring data. In order to satisfy interoperability and platform independence of service-orientation, monitoring reports are represented as XML documents. Since XML formatting introduces a significant processing and network load, we analyze the performance of monitoring architecture with respect to the number of monitored nodes, the load of system machines, and the overall latency of the monitoring system.
This paper revisits NDN deployment in the IoT with a special focus on the interaction of sensors and actuators. Such scenarios require high responsiveness and limited control state at the constrained nodes. We argue that the NDN request-response patt ern which prevents data push is vital for IoT networks. We contribute HoP-and-Pull (HoPP), a robust publish-subscribe scheme for typical IoT scenarios that targets IoT networks consisting of hundreds of resource constrained devices at intermittent connectivity. Our approach limits the FIB tables to a minimum and naturally supports mobility, temporary network partitioning, data aggregation and near real-time reactivity. We experimentally evaluate the protocol in a real-world deployment using the IoT-Lab testbed with varying numbers of constrained devices, each wirelessly interconnected via IEEE 802.15.4 LowPANs. Implementations are built on CCN-lite with RIOT and support experiments using various single- and multi-hop scenarios.
Betweenness centrality, measured by the number of times a vertex occurs on all shortest paths of a graph, has been recognized as a key indicator for the importance of a vertex in the network. However, the betweenness of a vertex is often very hard to compute because it needs to explore all the shortest paths between the other vertices. Recently, a relaxed concept called ego-betweenness was introduced which focuses on computing the betweenness of a vertex in its ego network. In this work, we study a problem of finding the top-k vertices with the highest ego-betweennesses. We first develop two novel search algorithms equipped with a basic upper bound and a dynamic upper bound to efficiently solve this problem. Then, we propose local-update and lazy-update solutions to maintain the ego-betweennesses for all vertices and the top-k results when the graph is updated, respectively. In addition, we also present two efficient parallel algorithms to further improve the efficiency. The results of extensive experiments on five large real-life datasets demonstrate the efficiency, scalability, and effectiveness of our algorithms.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا