FAIR: A Hadoop-based Hybrid Model for Faculty Information Retrieval System

Published by Harishchandra Dubey

Publication date: 2017

Research field: Informatics Engineering

Paper language: English

Authors: Noopur Gupta - Rakesh K. Lenka - Rabindra K. Barik

Distributed, Parallel, and Cluster Computing

Abstract

In an era of ever-expanding data and knowledge, we lack a centralized system that maps faculty members to their research works. This problem has not been addressed in the past, and it is challenging for students to connect with the right faculty in their domain. Because there are so many colleges and faculty members, this falls into the category of big data problems. In this paper, we present a model that works in a distributed computing environment to tackle big data. The proposed model uses Apache Spark as the execution engine and Apache Hive as the database. The results are visualized with Tableau, which is connected to Apache Hive to achieve distributed computing.

268 - Ruoyuan Gao, Yingqiang Ge, Chirag Shah 2021

With the emerging needs of creating fairness-aware solutions for search and recommendation systems, a daunting challenge exists of evaluating such solutions. While many of the traditional information retrieval (IR) metrics can capture the relevance, diversity and novelty for the utility with respect to users, they are not suitable for inferring whether the presented results are fair from the perspective of responsible information exposure. On the other hand, various fairness metrics have been proposed but they do not account for the user utility or do not measure it adequately. To address this problem, we propose a new metric called Fairness-Aware IR (FAIR). By unifying standard IR metrics and fairness measures into an integrated metric, this metric offers a new perspective for evaluating fairness-aware ranking results. Based on this metric, we developed an effective ranking algorithm that jointly optimized user utility and fairness. The experimental results showed that our FAIR metric could highlight results with good user utility and fair information exposure. We showed how FAIR related to existing metrics and demonstrated the effectiveness of our FAIR-based algorithm. We believe our work opens up a new direction of pursuing a computationally feasible metric for evaluating and implementing the fairness-aware IR systems.

Information Retrieval

A Programming Model for Hybrid Workflows: combining Task-based Workflows and Dataflows all-in-one

122 - Cristian Ramon-Cortes, Francesc Lordan, Jorge Ejarque 2020

This paper tries to reduce the effort of learning, deploying, and integrating several frameworks for the development of e-Science applications that combine simulations with High-Performance Data Analytics (HPDA). We propose a way to extend task-based management systems to support continuous input and output data to enable the combination of task-based workflows and dataflows (Hybrid Workflows from now on) using a single programming model. Hence, developers can build complex Data Science workflows with different approaches depending on the requirements. To illustrate the capabilities of Hybrid Workflows, we have built a Distributed Stream Library and a fully functional prototype extending COMPSs, a mature, general-purpose, task-based, parallel programming model. The library can be easily integrated with existing task-based frameworks to provide support for dataflows. Also, it provides a homogeneous, generic, and simple representation of object and file streams in both Java and Python; enabling complex workflows to handle any data type without dealing directly with the streaming back-end.

Distributed, Parallel, and Cluster Computing

Information Retrieval and Recommendation System for Astronomical Observatories

110 - Nikhil Mukund, Saurabh Thakur, Sheelu Abraham 2017

We present a machine-learning-based information retrieval system for astronomical observatories that tries to address user-defined queries related to an instrument. In the modern instrumentation scenario, where heterogeneous systems and talents are simultaneously at work, the ability to supply the right information helps speed up detector maintenance operations. Enhancing the detector uptime leads to increased coincidence observation and improves the likelihood of detecting astrophysical signals. Besides, such efforts will efficiently disseminate technical knowledge to a wider audience and will help the ongoing efforts to build upcoming detectors, such as LIGO-India, foresee possible challenges even at the design phase. The proposed method analyses existing documented efforts at the site to intelligently group together information related to a query and present it online to the user. The user can then follow interesting links and find already developed solutions or probable ways to address the present situation optimally. A web application that incorporates the above idea has been implemented and tested for the LIGO Livingston, LIGO Hanford and Virgo observatories.

Instrumentation and Methods for Astrophysics

Adapting Binary Information Retrieval Evaluation Metrics for Segment-based Retrieval Tasks

362 - Robin Aly, Maria Eskevich, Roeland Ordelman 2013

This report describes metrics for evaluating the effectiveness of segment-based retrieval, based on existing binary information retrieval metrics. These metrics are described in the context of a task for the hyperlinking of video segments. This evaluation approach re-uses existing evaluation measures from the standard Cranfield evaluation paradigm. Our adaptation approach can in principle be used with any kind of effectiveness measure that uses binary relevance, and for other segment-based retrieval tasks. In our video hyperlinking setting, we use precision at a cut-off rank n and mean average precision.
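As a point of reference for the two measures named above, the following is a minimal Python sketch of the standard binary-relevance definitions of precision at a cut-off rank n and mean average precision. It does not reproduce the report's segment-based adaptation; the function names and the toy segment identifiers are assumptions made for illustration only.

```python
from typing import Hashable, List, Sequence, Set, Tuple


def precision_at_n(ranked: Sequence[Hashable], relevant: Set[Hashable], n: int) -> float:
    """Fraction of the top-n ranked items that are relevant (binary relevance)."""
    if n <= 0:
        raise ValueError("n must be positive")
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / n


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Mean of the precision values at each rank where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Normalise by the number of relevant items, so missed items count as zero.
    return total / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[Hashable], Set[Hashable]]]) -> float:
    """Average of average_precision over several (ranking, relevant-set) pairs."""
    if not runs:
        raise ValueError("runs must not be empty")
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: two queries with ranked segment identifiers.
example_runs = [
    (["s1", "s2", "s3", "s4"], {"s1", "s3"}),
    (["a", "b"], {"b"}),
]
```

In this toy data, `precision_at_n(["s1", "s2", "s3", "s4"], {"s1", "s3"}, 2)` counts one relevant item in the top two positions, and `mean_average_precision(example_runs)` averages the per-query average precision values.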

Information Retrieval

SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval

87 - Liang Pang, Jun Xu, Qingyao Ai 2019

In learning-to-rank for information retrieval, a ranking model is automatically learned from the data and then used to rank the sets of retrieved documents. Therefore, an ideal ranking model would be a mapping from a document set to a permutation on the set, and should satisfy two critical requirements: (1) it should be able to model cross-document interactions so as to capture local context information in a query; (2) it should be permutation-invariant, which means that any permutation of the input documents would not change the output ranking. Previous studies on learning-to-rank either design univariate scoring functions that score each document separately, and thus fail to model the cross-document interactions, or construct multivariate scoring functions that score documents sequentially, which inevitably sacrifices the permutation invariance requirement. In this paper, we propose a neural learning-to-rank model called SetRank, which directly learns a permutation-invariant ranking model defined on document sets of any size. SetRank employs a stack of (induced) multi-head self-attention blocks as its key component for learning the embeddings of all the retrieved documents jointly. The self-attention mechanism not only helps SetRank capture the local context information from cross-document interactions, but also lets it learn permutation-equivariant representations for the input documents, thereby achieving a permutation-invariant ranking model. Experimental results on three large-scale benchmarks showed that SetRank significantly outperformed the baselines, including traditional learning-to-rank models and state-of-the-art neural IR models.
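The link between permutation-equivariant representations and a permutation-invariant ranking can be illustrated with a small Python sketch. This is not the SetRank model: it assumes a single self-attention step with no learned projections, no multiple heads, and no induced blocks, and the scoring weights `w`, the document vectors, and all function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def self_attention(docs: List[Vector]) -> List[Vector]:
    """One scaled dot-product self-attention step using the raw vectors as
    queries, keys and values (an assumption; real models learn projections).
    Each output depends on the whole set, so reordering the inputs only
    reorders the outputs (permutation equivariance)."""
    d = len(docs[0])
    out = []
    for q in docs:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in docs])
        out.append([sum(wt * v[j] for wt, v in zip(weights, docs)) for j in range(d)])
    return out


def rank(doc_ids: List[str], docs: List[Vector], w: Vector) -> List[str]:
    """Score each contextualised document with a linear layer w, then sort.
    Sorting equivariant scores gives a ranking independent of input order;
    ties are broken by document id to keep the result deterministic."""
    scores = [dot(vec, w) for vec in self_attention(docs)]
    order = sorted(range(len(doc_ids)), key=lambda i: (-scores[i], doc_ids[i]))
    return [doc_ids[i] for i in order]


# Hypothetical toy input: three documents with 2-dimensional features.
ids = ["d1", "d2", "d3"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w = [1.0, -0.5]
```

Calling `rank(ids, vectors, w)` and then calling it again on any reordering of the same documents should return the same ranked list, which is the invariance property the abstract describes.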

Information Retrieval
