A Simple Mechanism for Focused Web-harvesting

Published by: L.T. Handoko

Publication date: 2008

Research field: Informatics Engineering

Paper language: English

Authors: Z. Akbar, L.T. Handoko

Information Retrieval; Computers and Society

Abstract

Focused web-harvesting is deployed to build automated, comprehensive index databases as an alternative approach to virtual topical data integration. The web-harvesting has been implemented and extended by specifying not only the targeted URLs but also predefined, human-edited harvesting parameters that improve speed and accuracy. The harvesting parameter set comprises three main components. The first is the depth-scale: the number of levels, counted from the first page at the targeted URL, down to the final pages that contain the desired information. The second is the focus-point number, which determines the exact box containing the relevant information. The last is the combination of keywords used to recognize hyperlinks to relevant images or full texts embedded in those final pages. All parameters are accessible and fully customizable for each target by the administrators of participating institutions through an integrated web interface. A real implementation for the Indonesian Scientific Index, which covers all scientific information across Indonesia, is also briefly introduced.
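
The three-part parameter set described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the class `HarvestParams`, its field names, the helper `relevant_links`, the example URL, and the rule that a link is relevant when any keyword appears in its address or anchor text are all hypothetical, since the abstract does not specify how the keywords are combined.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class HarvestParams:
    """Hypothetical per-target parameter set (all field names are assumptions)."""
    start_url: str        # targeted URL where harvesting begins
    depth_scale: int      # levels from the first page down to the final pages
    focus_point: int      # number of the box that holds the relevant information
    link_keywords: tuple  # keywords marking links to images or full texts


class LinkCollector(HTMLParser):
    """Collects (href, anchor text) pairs from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def relevant_links(html, params):
    """Return hrefs whose address or anchor text contains any keyword.

    Assumption: "any keyword matches" stands in for the unspecified
    keyword combination mentioned in the abstract.
    """
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    wanted = [k.lower() for k in params.link_keywords]
    return [
        href
        for href, text in collector.links
        if any(k in href.lower() or k in text.lower() for k in wanted)
    ]


# Usage with made-up values for one target.
params = HarvestParams(
    start_url="http://example.org/journal",
    depth_scale=2,
    focus_point=3,
    link_keywords=("pdf", "full text"),
)
page = (
    '<a href="/a/1.pdf">Download</a> '
    '<a href="/about">About us</a> '
    '<a href="/a/2">Full text</a>'
)
print(relevant_links(page, params))  # ['/a/1.pdf', '/a/2']
```

Only the keyword component is exercised here; `depth_scale` and `focus_point` are carried along to show how the three parameters might be stored together for each target.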

Zhiyu Chen, Shuo Zhang, Brian D. Davison (2021)

We describe the development, characteristics and availability of a test collection for the task of Web table retrieval, which uses a large-scale Web Table Corpora extracted from the Common Crawl. Since a Web table usually has rich context information such as the page title and surrounding paragraphs, we not only provide relevance judgments of query-table pairs, but also the relevance judgments of query-table context pairs with respect to a query, which are ignored by previous test collections. To facilitate future research with this benchmark, we provide details about how the dataset is pre-processed and also baseline results from both traditional and recently proposed table retrieval methods. Our experimental results show that proper usage of context labels can benefit previous table retrieval methods.

Information Retrieval

A Study of CAPTCHAs for Securing Web Services

M. Tariq Banday, N. A. Shah (2011)

Automating various Web activities by replacing human-to-human interactions on the Internet has become indispensable due to the Internet's enormous growth. However, bots, also known as Web-bots, which have malicious intent and pretend to be humans, pose a severe threat to various services on the Internet that implicitly assume human interaction. Accordingly, before allowing access to such services, Web service providers use various Human Interaction Proofs (HIPs) to verify that the user is a human and not a bot. The Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is a class of HIP tests based on Artificial Intelligence. These tests are easy for humans to pass but tough for bots to simulate. Several Web services use CAPTCHAs as a defensive mechanism against automated Web-bots. In this paper, we review the existing CAPTCHA schemes that have been proposed or are being used to protect various Web services. We classify them into groups and compare them with each other in terms of security and usability. We present the general methods used to generate and break text-based and image-based CAPTCHAs. Further, we discuss various security and usability issues in CAPTCHA design and provide guidelines for improving their robustness and usability.

Cryptography and Security; Computers and Society

ReadNet: A Hierarchical Transformer Framework for Web Article Readability Analysis

Changping Meng, Muhao Chen, Jie Mao (2021)

Analyzing the readability of articles has been an important sociolinguistic task. Addressing this task is necessary for the automatic recommendation of appropriate articles to readers with different comprehension abilities, and it further benefits education systems, web information systems, and digital libraries. Current methods for assessing readability employ empirical measures or statistical learning techniques that are limited by their ability to characterize complex patterns such as article structures and semantic meanings of sentences. In this paper, we propose a new and comprehensive framework which uses a hierarchical self-attention model to analyze document readability. In this model, measurements of sentence-level difficulty are captured along with the semantic meanings of each sentence. Additionally, the sentence-level features are incorporated to characterize the overall readability of an article with consideration of article structures. We evaluate our proposed approach on three widely-used benchmark datasets against several strong baseline approaches. Experimental results show that our proposed method achieves state-of-the-art performance on estimating the readability of various web articles and literature.

Information Retrieval; Human-Computer Interaction

Context Models For Web Search Personalization

Maksims Volkovs (2015)

We present our solution to the Yandex Personalized Web Search Challenge. The aim of this challenge was to use historical search logs to personalize top-N document rankings for a set of test users. We used over 100 features extracted from user- and query-dependent contexts to train neural network and tree-based learning-to-rank and regression models. Our final submission, which was a blend of several different models, achieved an NDCG@10 of 0.80476 and placed 4th among the 194 teams, winning 3rd prize.

Information Retrieval

Staging Transformations for Multimodal Web Interaction Management

Michael Narayan, Chris Williams, Saverio Perugini (2003)

Multimodal interfaces are becoming increasingly ubiquitous with the advent of mobile devices, accessibility considerations, and novel software technologies that combine diverse interaction media. In addition to improving access and delivery capabilities, such interfaces enable flexible and personalized dialogs with websites, much like a conversation between humans. In this paper, we present a software framework for multimodal web interaction management that supports mixed-initiative dialogs between users and websites. A mixed-initiative dialog is one where the user and the website take turns changing the flow of interaction. The framework supports the functional specification and realization of such dialogs using staging transformations, a theory for representing and reasoning about dialogs based on partial input. It supports multiple interaction interfaces, and offers sessioning, caching, and coordination functions through the use of an interaction manager. Two case studies are presented to illustrate the promise of this approach.

Information Retrieval; Programming Languages
