ExCode-Mixed: Explainable Approaches towards Sentiment Analysis on Code-Mixed Data using BERT models



Published by Aman Priyanshu

Publication date: 2021

Research field: Informatics Engineering

Research language: English

Authors: Aman Priyanshu - Aleti Vardhan - Sudarshan Sivakumar

Artificial Intelligence, Computation and Language, Machine Learning


Abstract

The increasing use of social media sites in countries like India has given rise to large volumes of code-mixed data. Sentiment analysis of this data can provide valuable insights into people's perspectives and opinions. Developing robust explainability techniques that explain why models make their predictions therefore becomes essential. In this paper, we propose an adequate methodology to integrate explainable approaches into code-mixed sentiment analysis.


Andrew J Sedgewick, Joseph D. Ramsey, Peter Spirtes, 2017

Graphical causal models are an important tool for knowledge discovery because they can represent both the causal relations between variables and the multivariate probability distributions over the data. Once learned, causal graphs can be used for classification, feature selection and hypothesis generation, while revealing the underlying causal network structure and thus allowing for arbitrary likelihood queries over the data. However, current algorithms for learning sparse directed graphs are generally designed to handle only one type of data (continuous-only or discrete-only), which limits their applicability to a large class of multi-modal biological datasets that include mixed type variables. To address this issue, we developed new methods that modify and combine existing methods for finding undirected graphs with methods for finding directed graphs. These hybrid methods are not only faster, but also perform better than the directed graph estimation methods alone for a variety of parameter settings and data set sizes. Here, we describe a new conditional independence test for learning directed graphs over mixed data types and we compare performances of different graph learning strategies on synthetic data.

Artificial Intelligence, Machine Learning

Towards French Smart Building Code: Compliance Checking Based on Semantic Rules

Nicolas Bus, 2019

Manually checking models for compliance against building regulation is a time-consuming task for architects and construction engineers. There is thus a need for algorithms that process information from construction projects and report non-compliant elements. Still, automated code-compliance checking raises several obstacles. Building regulations are usually published as human-readable texts and their content is often ambiguous or incomplete. Also, the vocabulary used for expressing such regulations is very different from the vocabularies used to express Building Information Models (BIM). Furthermore, the high level of detail associated with BIM-contained geometries induces complex calculations. Finally, the level of complexity of the IFC standard also hinders the automation of IFC processing tasks. An approach based on a model chart, formal rules and pre-processors allows construction regulations to be translated into semantic queries. We further demonstrate the usefulness of this approach through several use cases. We argue our approach is a step forward in bridging the gap between regulation texts and automated checking algorithms. Finally, with the recent building ontology BOT recommended by the W3C Linked Building Data Community Group, we identify perspectives for standardizing and extending our approach.

Artificial Intelligence, Computation and Language, Logic in Computer Science

DravidianCodeMix: Sentiment Analysis and Offensive Language Identification Dataset for Dravidian Languages in Code-Mixed Text

Bharathi Raja Chakravarthi, Ruba Priyadharshini, Vigneshwarann Muralidaran, 2021

This paper describes the development of a multilingual, manually annotated dataset for three under-resourced Dravidian languages generated from social media comments. The dataset was annotated for sentiment analysis and offensive language identification for a total of more than 60,000 YouTube comments. The dataset consists of around 44,000 comments in Tamil-English, around 7,000 comments in Kannada-English, and around 20,000 comments in Malayalam-English. The data was manually annotated by volunteer annotators and has high inter-annotator agreement as measured by Krippendorff's alpha. The dataset contains all types of code-mixing phenomena since it comprises user-generated content from a multilingual country. We also present baseline experiments to establish benchmarks on the dataset using machine learning methods. The dataset is available on GitHub (https://github.com/bharathichezhiyan/DravidianCodeMix-Dataset) and Zenodo (https://zenodo.org/record/4750858#.YJtw0SYo_0M).

Computation and Language

On the Relationship Between KR Approaches for Explainable Planning

Stylianos Loukas Vasileiou, William Yeoh, Tran Cao Son, 2020

In this paper, we build upon notions from knowledge representation and reasoning (KR) to expand a preliminary logic-based framework that characterizes the model reconciliation problem for explainable planning. We also provide a detailed exposition on the relationship between similar KR techniques, such as abductive explanations and belief change, and their applicability to explainable planning.

Artificial Intelligence

Towards Automated Fatigue Assessment using Wearable Sensing and Mixed-Effects Models

Yang Bai, Yu Guan, Jian Qing Shi, 2021

Fatigue is a broad, multifactorial concept that includes the subjective perception of reduced physical and mental energy levels. It is also one of the key factors that strongly affect patients' health-related quality of life. To date, most fatigue assessment methods have been based on self-reporting, which may suffer from factors such as recall bias. To address this issue, in this work, we recorded multi-modal physiological data (including ECG, accelerometer, skin temperature and respiratory rate, as well as demographic information such as age and BMI) in free-living environments and developed automated fatigue assessment models. Specifically, we extracted features from each modality and employed random forest-based mixed-effects models, which can take advantage of the demographic information for improved performance. We conducted experiments on our collected dataset, and very promising preliminary results were achieved. Our results suggested ECG played an important role in the fatigue assessment tasks.

Human-Computer Interaction