Linked Open Data Validity -- A Technical Report from ISWS 2018

Published by: Valentina Presutti

Publication date: 2019

Research field: Informatics Engineering

Language: English

Authors: Tayeb Abderrahmani Ghor - Esha Agrawal - Mehwish Alam

Categories: Databases; Computers and Society

Linked Open Data (LOD) is the publicly available RDF data on the Web. Each LOD entity is identified by a URI and accessible via HTTP. LOD encodes global-scale knowledge potentially available to any human, as well as to any artificial intelligence that may want to use it as background knowledge to support its tasks. LOD has emerged as the backbone of applications in diverse fields such as Natural Language Processing, Information Retrieval, Computer Vision, Speech Recognition, and many more. Nevertheless, regardless of the specific tasks that LOD-based tools aim to address, reusing such knowledge may be challenging for diverse reasons, e.g. semantic heterogeneity, provenance, and data quality. As aptly stated by Heath et al., Linked Data might be outdated, imprecise, or simply wrong; hence the need to investigate the problem of linked data validity. This work reports a collaborative effort by nine teams of students, guided by an equal number of senior researchers, attending the International Semantic Web Research School (ISWS 2018), to address this investigation from different perspectives and with different approaches.

Nacira Abbas, Kholoud Alghamdi, Mortaza Alinam, 2020

One of the grand challenges discussed during the Dagstuhl Seminar "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web" and described in its report is that of a: Public FAIR Knowledge Graph of Everything: We increasingly see the creation of knowledge graphs that capture information about the entirety of a class of entities. [...] This grand challenge extends this further by asking if we can create a knowledge graph of everything ranging from common sense concepts to location based entities. This knowledge graph should be open to the public in a FAIR manner democratizing this mass amount of knowledge. Although linked open data (LOD) is one knowledge graph, it is the closest realisation (and probably the only one) to a public FAIR Knowledge Graph (KG) of everything. Surely, LOD provides a unique testbed for experimenting and evaluating research hypotheses on open and FAIR KGs. One of the most neglected FAIR issues about KGs is their ongoing evolution and long-term preservation. We want to investigate this problem, that is, to understand what preserving and supporting the evolution of KGs means and how these problems can be addressed. Clearly, the problem can be approached from different perspectives and may require the development of different approaches, including new theories, ontologies, metrics, strategies, procedures, etc. This document reports a collaborative effort performed by 9 teams of students, each guided by a senior researcher as their mentor, attending the International Semantic Web Research School (ISWS 2019). Each team provides a different perspective on the problem of knowledge graph evolution, substantiated by a set of research questions as the main subject of their investigation. In addition, they provide their working definition for KG preservation and evolution.

Category: Artificial Intelligence

Temporal Graph Functional Dependencies -- Technical Report

Morteza Alipourlangouri, Adam Mansfield, Fei Chiang, 2021

We propose a class of functional dependencies for temporal graphs, called TGFDs. TGFDs capture both attribute-value dependencies and topological structures of entities over a valid period of time in a temporal graph. They subsume graph functional dependencies (GFDs) and conditional functional dependencies (CFDs) as special cases. We study the foundations of TGFDs, including satisfiability, implication, and validation. We show that the satisfiability and validation problems are coNP-complete and the implication problem is NP-complete. We also present an axiomatization of TGFDs and prove its soundness and completeness.

Category: Databases

VSS: A Storage System for Video Analytics [Technical Report]

Brandon Haynes, Maureen Daum, Dong He, 2021

We present a new video storage system (VSS) designed to decouple high-level video operations from the low-level details required to store and efficiently retrieve video data. VSS is designed to be the storage subsystem of a video data management system (VDBMS) and is responsible for: (1) transparently and automatically arranging the data on disk in an efficient, granular format; (2) caching frequently-retrieved regions in the most useful formats; and (3) eliminating redundancies found in videos captured from multiple cameras with overlapping fields of view. Our results suggest that VSS can improve VDBMS read performance by up to 54%, reduce storage costs by up to 45%, and enable developers to focus on application logic rather than video storage and retrieval.

Category: Databases

The Workflow Trace Archive: Open-Access Data from Public and Private Computing Infrastructures -- Technical Report

Laurens Versluis, Roland Matha, Sacheendra Talluri, 2019

Realistic, relevant, and reproducible experiments often need input traces collected from real-world environments. In this work we focus on traces of workflows, which are common in datacenters, clouds, and HPC infrastructures. We show that the state of the art in using workflow traces raises important issues: (1) the use of realistic traces is infrequent, and (2) the use of realistic, open-access traces even more so. To alleviate these issues, we introduce the Workflow Trace Archive (WTA), an open-access archive of workflow traces from diverse computing infrastructures, together with tooling to parse, validate, and analyze traces. The WTA includes more than 48 million workflows captured from more than 10 computing infrastructures, representing a broad diversity of trace domains and characteristics. To emphasize the importance of trace diversity, we characterize the WTA contents and analyze in simulation the impact of trace diversity on experiment results. Our results indicate significant differences in characteristics, properties, and workflow structures between workload sources, domains, and fields.

Category: Distributed, Parallel, and Cluster Computing

CHEF: A Cheap and Fast Pipeline for Iteratively Cleaning Label Uncertainties (Technical Report)

Yinjun Wu, James Weimer, Susan B. Davidson, 2021

High-quality labels are expensive to obtain for many machine learning tasks, such as medical image classification. Therefore, probabilistic (weak) labels produced by weak supervision tools are used to seed a process in which influential samples with weak labels are identified and cleaned by several human annotators to improve model performance. To lower the overall cost and computational overhead of this process, we propose a solution called CHEF (CHEap and Fast label cleaning), which consists of the following three components. First, to reduce the cost of human annotators, we use Infl, which prioritizes the most influential training samples for cleaning and provides cleaned labels to save the cost of one human annotator. Second, to accelerate the sample selector phase and the model constructor phase, we use Increm-Infl to incrementally produce influential samples, and DeltaGrad-L to incrementally update the model. Third, we redesign the typical label cleaning pipeline so that human annotators iteratively clean smaller batches of samples rather than one big batch. This yields better overall model performance and enables possible early termination once the expected model performance has been achieved. Extensive experiments show that our approach gives good model prediction performance while achieving significant speed-ups.

Categories: Databases; Machine Learning
