
A Simple Voting Mechanism for Online Sexist Content Identification

 Added by Chao Feng
 Publication date 2021
Research language: English
 Authors Chao Feng





This paper presents the participation of the MiniTrue team in the EXIST 2021 Challenge on sexism detection in social media for English and Spanish. Our approach combines language models with a simple voting mechanism for sexist label prediction: three BERT-based models and a voting function are used. Experimental results show that our final model with the voting function achieved the best results among our four models, which indicates that the voting mechanism brings an extra benefit to our system. We also observe that our system is robust to data sources and languages.
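The abstract does not specify the voting function, so the following is only a minimal sketch that assumes a plain majority vote over the labels predicted by three classifiers. The function names, the label strings, and the tie-breaking rule (fall back to the first model's label) are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the earliest model's label.

    Assumption: a plain majority vote. The paper's actual voting function
    may differ.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    # Scan in model order so a tie resolves to the earliest model's label.
    for label in predictions:
        if counts[label] == top:
            return label


def vote_over_batch(per_model_predictions):
    """Combine per-model label lists (one list per model) item by item."""
    return [majority_vote(list(labels)) for labels in zip(*per_model_predictions)]


# Hypothetical outputs of three BERT-based classifiers on three posts.
model_a = ["sexist", "non-sexist", "sexist"]
model_b = ["sexist", "sexist", "non-sexist"]
model_c = ["non-sexist", "non-sexist", "sexist"]

final = vote_over_batch([model_a, model_b, model_c])
print(final)
```

With three voters and two labels a tie cannot occur, so the tie-breaking rule only matters if more labels or an even number of models are used.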




Read More

Online petitions are a cost-effective way for citizens to collectively engage with policy-makers in a democracy. Predicting the popularity of a petition, commonly measured by its signature count, based on its textual content has utility for policy-makers as well as those posting the petition. In this work, we model this task using CNN regression with an auxiliary ordinal regression objective. We demonstrate the effectiveness of our proposed approach using UK and US government petition datasets.
Z. Akbar, L.T. Handoko, 2008
Focused web-harvesting is deployed to build automated, comprehensive index databases as an alternative approach to virtual topical data integration. The web-harvesting has been implemented and extended by not only specifying the targeted URLs, but also predefining human-edited harvesting parameters to improve speed and accuracy. The harvesting parameter set comprises three main components. First, the depth-scale: how many levels below the first page at the targeted URL the final pages containing the desired information are found. Second, the focus-point number, which determines the exact box containing relevant information. Lastly, the combination of keywords used to recognize hyperlinks to relevant images or full texts embedded in those final pages. All parameters are accessible and fully customizable for each target by the administrators of participating institutions through an integrated web interface. A real implementation for the Indonesian Scientific Index, which covers all scientific information across Indonesia, is also briefly introduced.
The proliferation of harmful content on online social media platforms has necessitated empirical understandings of experiences of harm online and the development of practices for harm mitigation. Both understandings of harm and approaches to mitigating that harm, often through content moderation, have implicitly embedded frameworks of prioritization - what forms of harm should be researched, how policy on harmful content should be implemented, and how harmful content should be moderated. To aid efforts of better understanding the variety of online harms, how they relate to one another, and how to prioritize harms relevant to research, policy, and practice, we present a theoretical framework of severity for harmful online content. By employing a grounded theory approach, we developed a framework of severity based on interviews and card-sorting activities conducted with 52 participants over the course of ten months. Through our analysis, we identified four Types of Harm (physical, emotional, relational, and financial) and eight Dimensions along which the severity of harm can be understood (perspectives, intent, agency, experience, scale, urgency, vulnerability, sphere). We describe how our framework can be applied to both research and policy settings towards deeper understandings of specific forms of harm (e.g., harassment) and prioritization frameworks when implementing policies encompassing many forms of harm.
System combination is an important technique for combining the hypotheses of different machine translation systems to improve translation performance. Although early statistical approaches to system combination have been proven effective in analyzing the consensus between hypotheses, they suffer from the error propagation problem due to the use of pipelines. While this problem has been alleviated by end-to-end training of multi-source sequence-to-sequence models recently, these neural models do not explicitly analyze the relations between hypotheses and fail to capture their agreement because the attention to a word in a hypothesis is calculated independently, ignoring the fact that the word might occur in multiple hypotheses. In this work, we propose an approach to modeling voting for system combination in machine translation. The basic idea is to enable words in hypotheses from different systems to vote on words that are representative and should get involved in the generation process. This can be done by quantifying the influence of each voter and its preference for each candidate. Our approach combines the advantages of statistical and neural methods since it can not only analyze the relations between hypotheses but also allow for end-to-end training. Experiments show that our approach is capable of better taking advantage of the consensus between hypotheses and achieves significant improvements over state-of-the-art baselines on Chinese-English and English-German machine translation tasks.
Recently, several universal methods have been proposed for online convex optimization, which attain minimax rates for multiple types of convex functions simultaneously. However, they need to design and optimize one surrogate loss for each type of function, which makes it difficult to exploit the structure of the problem and utilize the vast amount of existing algorithms. In this paper, we propose a simple strategy for universal online convex optimization, which avoids these limitations. The key idea is to construct a set of experts to process the original online functions, and deploy a meta-algorithm over the linearized losses to aggregate predictions from experts. Specifically, we choose Adapt-ML-Prod to track the best expert, because it has a second-order bound and can be used to leverage strong convexity and exponential concavity. In this way, we can plug in off-the-shelf online solvers as black-box experts to deliver problem-dependent regret bounds. Furthermore, our strategy inherits the theoretical guarantee of any expert designed for strongly convex functions and exponentially concave functions, up to a double logarithmic factor. For general convex functions, it maintains the minimax optimality and also achieves a small-loss bound.
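The expert-plus-meta-algorithm structure described above can be illustrated with a small sketch. Note the substitution: this example uses the simpler Hedge (exponential weights) update as the meta-algorithm, not Adapt-ML-Prod, so it does not reproduce the second-order guarantees from the abstract. The experts are projected online gradient descent with different step sizes on a one-dimensional domain; the loss function, step sizes, and `meta_lr` are all assumptions chosen for illustration.

```python
import math


def project(x, radius=1.0):
    """Project a scalar onto the interval [-radius, radius]."""
    return max(-radius, min(radius, x))


def combine(weights, experts):
    """Weighted average of the experts' current points."""
    return sum(w * e for w, e in zip(weights, experts))


def run_universal(grad, step_sizes, rounds, meta_lr=0.5):
    """Aggregate gradient-descent experts with Hedge on linearized losses.

    Assumption: Hedge stands in for Adapt-ML-Prod to keep the sketch short.
    """
    experts = [0.0 for _ in step_sizes]
    weights = [1.0 / len(experts)] * len(experts)
    for t in range(rounds):
        x = combine(weights, experts)
        g = grad(x, t)
        # Linearized loss of expert i at the played point x: g * (e_i - x).
        lin = [g * (e - x) for e in experts]
        weights = [w * math.exp(-meta_lr * l) for w, l in zip(weights, lin)]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Each expert processes the original function at its own point.
        experts = [project(e - eta * grad(e, t)) for e, eta in zip(experts, step_sizes)]
    return combine(weights, experts), weights


def quadratic_grad(x, t):
    """Gradient of the hypothetical loss f_t(x) = (x - 0.5)^2."""
    return 2.0 * (x - 0.5)


final_x, final_weights = run_universal(quadratic_grad, [0.01, 0.1, 0.5], rounds=200)
print(final_x, final_weights)
```

Each expert here is a black box from the meta-algorithm's point of view: only its current point enters the linearized loss, which is what lets off-the-shelf solvers be plugged in.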