ترغب بنشر مسار تعليمي؟ اضغط هنا

Revealing Secrets in SPARQL Session Level

74   0   0.0 ( 0 )
 نشر من قبل Meng Wang
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Based on Semantic Web technologies, knowledge graphs help users to discover information of interest by using live SPARQL services. Answer-seekers often examine intermediate results iteratively and modify SPARQL queries repeatedly in a search session. In this context, understanding user behaviors is critical for effective intention prediction and query optimization. However, these behaviors have not yet been researched systematically at the SPARQL session level. This paper reveals secrets of session-level user search behaviors by conducting a comprehensive investigation over massive real-world SPARQL query logs. In particular, we thoroughly assess query changes made by users w.r.t. structural and data-driven features of SPARQL queries. To illustrate the potentiality of our findings, we employ an application example of how to use our findings, which might be valuable to devise efficient SPARQL caching, auto-completion, query suggestion, approximation, and relaxation techniques in the future.



قيم البحث

اقرأ أيضاً

In this paper, we present an embedding-based framework (TrQuery) for recommending solutions of a SPARQL query, including approximate solutions when exact querying solutions are not available due to incompleteness or inconsistencies of real-world RDF data. Within this framework, embedding is applied to score solutions together with edit distance so that we could obtain more fine-grained recommendations than those recommendations via edit distance. For instance, graphs of two querying solutions with a similar structure can be distinguished in our proposed framework while the edit distance depending on structural difference becomes unable. To this end, we propose a novel score model built on vector space generated in embedding system to compute the similarity between an approximate subgraph matching and a whole graph matching. Finally, we evaluate our approach on large RDF datasets DBpedia and YAGO, and experimental results show that TrQuery exhibits an excellent behavior in terms of both effectiveness and efficiency.
Query response time often influences user experience in the real world. However, it possibly takes more time to answer a query with its all exact solutions, especially when it contains the OPT operations since the OPT operation is the least conventio nal operator in SPARQL. So it becomes essential to make a trade-off between the query response time and the accuracy of their solutions. In this paper, based on the depth of the OPT operation occurring in a query, we propose an approach to obtain its all approximate queries with less depth of the OPT operation. This paper mainly discusses those queries with well-designed patterns since the OPT operation in a well-designed pattern is really optional. Firstly, we transform a well-designed pattern in OPT normal form into a well-designed tree, whose inner nodes are labeled by OPT operation and leaf nodes are labeled by patterns containing other operations such as the AND operation and the FILTER operation. Secondly, based on this well-designed tree, we remove optional well-designed subtrees with less depth of the OPT operation and then obtain approximate queries with different depths of the OPT operation. Finally, we evaluate the approximate query efficiency with the degree of approximation.
In the real world datasets (e.g.,DBpedia query log), queries built on well-designed patterns containing only AND and OPT operators (for short, WDAO-patterns) account for a large proportion among all SPARQL queries. In this paper, we present a plugin- based framework for all SELECT queries built on WDAO-patterns, named PIWD. The framework is based on a parse tree called emph{well-designed AND-OPT tree} (for short, WDAO-tree) whose leaves are basic graph patterns (BGP) and inner nodes are the OPT operators. We prove that for any WDAO-pattern, its parse tree can be equivalently transformed into a WDAO-tree. Based on the proposed framework, we can employ any query engine to evaluate BGP for evaluating queries built on WDAO-patterns in a convenient way. Theoretically, we can reduce the query evaluation of WDAO-patterns to subgraph homomorphism as well as BGP since the query evaluation of BGP is equivalent to subgraph homomorphism. Finally, our preliminary experiments on gStore and RDF-3X show that PIWD can answer all queries built on WDAO-patterns effectively and efficiently.
In this paper, we present a MapReduce-based framework for evaluating SPARQL queries on GPU (named MapSQ) to large-scale RDF datesets efficiently by applying both high performance. Firstly, we develop a MapReduce-based Join algorithm to handle SPARQL queries in a parallel way. Secondly, we present a coprocessing strategy to manage the process of evaluating queries where CPU is used to assigns subqueries and GPU is used to compute the join of subqueries. Finally, we implement our proposed framework and evaluate our proposal by comparing with two popular and latest SPARQL query engines gStore and gStoreD on the LUBM benchmark. The experiments demonstrate that our proposal MapSQ is highly efficient and effective (up to 50% speedup).
Resource Description Framework (RDF) has been widely used to represent information on the web, while SPARQL is a standard query language to manipulate RDF data. Given a SPARQL query, there often exist many joins which are the bottlenecks of efficienc y of query processing. Besides, the real RDF datasets often reveal strong data sparsity, which indicates that a resource often only relates to a few resources even the number of total resources is large. In this paper, we propose a sparse matrix-based (SM-based) SPARQL query processing approach over RDF datasets which con- siders both join optimization and data sparsity. Firstly, we present a SM-based storage for RDF datasets to lift the storage efficiency, where valid edges are stored only, and then introduce a predicate- based hash index on the storage. Secondly, we develop a scalable SM-based join algorithm for SPARQL query processing. Finally, we analyze the overall cost by accumulating all intermediate results and design a query plan generated algorithm. Besides, we extend our SM-based join algorithm on GPU for parallelizing SPARQL query processing. We have evaluated our approach compared with the state-of-the-art RDF engines over benchmark RDF datasets and the experimental results show that our proposal can significantly improve SPARQL query processing with high scalability.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا