بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Temporal Graph Functional Dependencies -- Technical Report

166 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Morteza Alipourlangouri

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Morteza Alipourlangouri - Adam Mansfield - Fei Chiang

قواعد البيانات

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We propose a class of functional dependencies for temporal graphs, called TGFDs. TGFDs capture both attribute-value dependencies and topological structures of entities over a valid period of time in a temporal graph. It subsumes graph functional dependencies (gfds) and conditional functional dependencies (CFDs) as a special case. We study the foundations of TGFDs including satisfiability, implication and validation. We show that the satisfiability and validation problems are coNP-complete and the implication problem is NP-complete. We also present an axiomatization of TGFDs and provide the proof of the soundness and completeness of the axiomatization.

قيم البحث

اقرأ أيضاً

GGDs: Graph Generating Dependencies

77 - Larissa C. Shimomura 2020

We propose Graph Generating Dependencies (GGDs), a new class of dependencies for property graphs. Extending the expressivity of state of the art constraint languages, GGDs can express both tuple- and equality-generating dependencies on property graph s, both of which find broad application in graph data management. We provide the formal definition of GGDs, analyze the validation problem for GGDs, and demonstrate the practical utility of GGDs.

قواعد البيانات

VSS: A Storage System for Video Analytics [Technical Report]

401 - Brandon Haynes , Maureen Daum , Dong He 2021

We present a new video storage system (VSS) designed to decouple high-level video operations from the low-level details required to store and efficiently retrieve video data. VSS is designed to be the storage subsystem of a video data management syst em (VDBMS) and is responsible for: (1) transparently and automatically arranging the data on disk in an efficient, granular format; (2) caching frequently-retrieved regions in the most useful formats; and (3) eliminating redundancies found in videos captured from multiple cameras with overlapping fields of view. Our results suggest that VSS can improve VDBMS read performance by up to 54%, reduce storage costs by up to 45%, and enable developers to focus on application logic rather than video storage and retrieval.

قواعد البيانات

Discovery and Contextual Data Cleaning with Ontology Functional Dependencies

336 - Zheng Zheng , Longtao Zheng , Morteza Alipour Langouri 2021

Functional Dependencies (FDs) define attribute relationships based on syntactic equality, and, when usedin data cleaning, they erroneously label syntactically different but semantically equivalent values as errors. We explore dependency-based data cl eaning with Ontology Functional Dependencies(OFDs), which express semantic attribute relationships such as synonyms and is-a hierarchies defined by an ontology. We study the theoretical foundations for OFDs, including sound and complete axioms and a linear-time inference procedure. We then propose an algorithm for discovering OFDs (exact ones and ones that hold with some exceptions) from data that uses the axioms to prune the search space. Towards enabling OFDs as data quality rules in practice, we study the problem of finding minimal repairs to a relation and ontology with respect to a set of OFDs. We demonstrate the effectiveness of our techniques on real datasets, and show that OFDs can significantly reduce the number of false positive errors in data cleaning techniques that rely on traditional FDs.

قواعد البيانات

Linked Open Data Validity -- A Technical Report from ISWS 2018

193 - Tayeb Abderrahmani Ghor , Esha Agrawal , Mehwish Alam 2019

Linked Open Data (LOD) is the publicly available RDF data in the Web. Each LOD entity is identfied by a URI and accessible via HTTP. LOD encodes globalscale knowledge potentially available to any human as well as artificial intelligence that may want to benefit from it as background knowledge for supporting their tasks. LOD has emerged as the backbone of applications in diverse fields such as Natural Language Processing, Information Retrieval, Computer Vision, Speech Recognition, and many more. Nevertheless, regardless of the specific tasks that LOD-based tools aim to address, the reuse of such knowledge may be challenging for diverse reasons, e.g. semantic heterogeneity, provenance, and data quality. As aptly stated by Heath et al. Linked Data might be outdated, imprecise, or simply wrong: there arouses a necessity to investigate the problem of linked data validity. This work reports a collaborative effort performed by nine teams of students, guided by an equal number of senior researchers, attending the International Semantic Web Research School (ISWS 2018) towards addressing such investigation from different perspectives coupled with different approaches to tackle the issue.

قواعد البيانات أجهزة الكمبيوتر والمجتمع

Technical Report: Optimizing Human Involvement for Entity Matching and Consolidation

194 - Ji Sun , Dong Deng , Ihab Ilyas 2019

An end-to-end data integration system requires human feedback in several phases, including collecting training data for entity matching, debugging the resulting clusters, confirming transformations applied on these clusters for data standardization, and finally, reducing each cluster to a single, canonical representation (or golden record). The traditional wisdom is to sequentially apply the human feedback, obtained by asking specific questions, within some budget in each phase. However, these questions are highly correlated; the answer to one can influence the outcome of any of the phases of the pipeline. Hence, interleaving them has the potential to offer significant benefits. In this paper, we propose a human-in-the-loop framework that interleaves different types of questions to optimize human involvement. We propose benefit models to measure the quality improvement from asking a question, and cost models to measure the human time it takes to answer a question. We develop a question scheduling framework that judiciously selects questions to maximize the accuracy of the final golden records. Experimental results on three real-world datasets show that our holistic method significantly improves the quality of golden records from 70% to 90%, compared with the state-of-the-art approaches.

قواعد البيانات

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة الإسلامية في لبنان

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Temporal Graph Functional Dependencies -- Technical Report

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً