Targeted evaluations have found that machine translation systems often output incorrect gender in translations, even when the gender is clear from context. Furthermore, these incorrectly gendered translations have the potential to reflect or amplify
social biases. We propose gender-filtered self-training (GFST) to improve gender translation accuracy on unambiguously gendered inputs. Our GFST approach uses a source monolingual corpus and an initial model to generate gender-specific pseudo-parallel corpora, which are then filtered and added to the training data. We evaluate GFST on translation from English into five languages, finding that it improves gender accuracy without damaging generic quality. We also show the viability of GFST in several experimental settings, including re-training from scratch, fine-tuning, controlling the gender balance of the data, forward translation, and back-translation.
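The generate-then-filter idea can be sketched roughly as follows. This is only an illustrative toy, not the authors' implementation: the word lists, the `gender_of` heuristic, the `gfst_pseudo_parallel` function, and the `fake_model` stand-in for the initial translation model are all hypothetical, and the abstract does not say how the filtering is actually performed.

```python
# Hypothetical, tiny word lists; real gender detection would be far richer.
SRC_FEM = {"she", "her", "hers"}
SRC_MASC = {"he", "him", "his"}
TGT_FEM = {"ella", "doctora"}
TGT_MASC = {"él", "doctor"}


def gender_of(sentence, fem, masc):
    """Return "F" or "M" if only one gender is marked, else None."""
    words = set(sentence.lower().split())
    has_f, has_m = bool(words & fem), bool(words & masc)
    if has_f and not has_m:
        return "F"
    if has_m and not has_f:
        return "M"
    return None


def gfst_pseudo_parallel(monolingual, translate):
    """Translate gendered source sentences and keep only the pairs
    whose translation carries the same gender as the source."""
    kept = {"F": [], "M": []}
    for src in monolingual:
        gender = gender_of(src, SRC_FEM, SRC_MASC)
        if gender is None:
            continue  # ambiguous or ungendered source: not used
        hyp = translate(src)
        if gender_of(hyp, TGT_FEM, TGT_MASC) == gender:
            kept[gender].append((src, hyp))
    return kept


# Stand-in for the initial model: a lookup table of canned outputs.
fake_model = {
    "she is a doctor": "él es doctor",                   # wrong gender
    "she is a good doctor": "ella es una buena doctora",
    "he is a doctor": "él es doctor",
    "the doctor is here": "el doctor está aquí",         # ungendered source
}
kept = gfst_pseudo_parallel(list(fake_model), fake_model.get)
```

In this toy run the mistranslated feminine sentence is filtered out, so only gender-consistent pairs would be added to the training data.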
Terminological consistency is an essential requirement for industrial translation. High-quality, hand-crafted terminologies contain entries in their nominal forms. Integrating such a terminology into machine translation is not a trivial task. The MT
system must be able to disambiguate homographs on the source side and choose the correct wordform on the target side. In this work, we propose a simple but effective method for homograph disambiguation and a method of wordform selection based on multi-choice lexical constraints. We also propose a metric to measure the terminological consistency of the translation. Our results show a significant improvement over the current state of the art (SOTA) in terminological consistency without any loss in BLEU score. All the code used in this work will be published as open source.
Classical information retrieval systems such as BM25 rely on exact lexical match and can carry out search efficiently with an inverted list index. Recent neural IR models shift towards soft matching of all query and document terms, but they lose the computational efficiency of exact match systems. This paper presents COIL, a contextualized exact match retrieval architecture, where scoring is based on the contextualized representations of overlapping query and document tokens. The new architecture stores contextualized token representations in inverted lists, bringing together the efficiency of exact match and the representation power of deep language models. Our experimental results show that COIL outperforms classical lexical retrievers and state-of-the-art deep LM retrievers with similar or smaller latency.
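A minimal sketch of contextualized exact-match scoring over an inverted index might look like the following. It rests on assumptions not stated in the abstract: the two-dimensional vectors are made up, the helper names `build_index` and `score` are hypothetical, and the choice to take the maximum dot product per query token before summing is one plausible aggregation rather than a confirmed detail.

```python
from collections import defaultdict


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def build_index(docs):
    """docs: {doc_id: [(token, vector), ...]}
    Returns an inverted index {token: [(doc_id, vector), ...]}."""
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for token, vec in tokens:
            index[token].append((doc_id, vec))
    return index


def score(query, index):
    """query: [(token, vector), ...]. Only documents that share an exact
    token with the query are touched; similarity comes from the vectors."""
    totals = defaultdict(float)
    for token, q_vec in query:
        best = {}  # best match per document for this query token (assumed max)
        for doc_id, d_vec in index.get(token, []):
            s = dot(q_vec, d_vec)
            if doc_id not in best or s > best[doc_id]:
                best[doc_id] = s
        for doc_id, s in best.items():
            totals[doc_id] += s
    return dict(totals)


# Toy vectors: axis 0 ~ "river" sense, axis 1 ~ "finance" sense.
docs = {
    "d1": [("bank", [1.0, 0.0]), ("river", [1.0, 0.0])],
    "d2": [("bank", [0.0, 1.0]), ("loan", [0.0, 1.0])],
    "d3": [("tree", [1.0, 0.0])],
}
index = build_index(docs)
query = [("bank", [0.25, 0.75]), ("loan", [0.0, 1.0])]
scores = score(query, index)
```

Both `d1` and `d2` contain the token "bank", but the finance-sense document scores higher because its stored vector is closer to the query's; `d3` shares no token with the query and is never scored.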
Lipids, like most flour components, undergo a range of changes during storage, which is reflected in the physical properties of the dough and in the quality of the resulting bread.
The aim of the research was to use flours with different extraction
rates (70%, 80%, and 90%) and then to study the specifications of
these flours and of the dough and bread produced from them.
The efficacy of essential oil from dried leaves of English lavender
(Lavandula angustifolia L.) as a fumigant against 12- to 14-day-old
larvae of Tribolium castaneum was examined under laboratory conditions.
Five concentrations of the essential oil were tested.
The precise point positioning (PPP) technique uses recursive algorithms to solve the
navigation problem. The traditional least-squares method does not meet the requirements
for processing speed and quality in various geodetic and surveying applications, owing to
the large volume of data provided by global navigation satellite systems.
The extended Kalman filter (EKF) is considered an optimal approach to the navigation
problem. This filter requires knowledge of the measurements, their observational models,
and the physical state of the estimation problem (receiver dynamics, characteristics of
the received signals, and a suitable estimate of the initial conditions).
This research presents a mathematical modification that reduces the negative effect of
the EKF initial conditions on convergence time. The work also shows how position
estimation accuracy is affected by the suggested modification of the EKF in PPP, and
supports the use of this modification in position estimation despite the increased
processing time.
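To illustrate why EKF initial conditions matter for convergence, here is a minimal sketch of the standard EKF predict/update cycle on a toy one-dimensional ranging problem. It is not the modification proposed in this work, whose details the abstract does not give; the static state model, the beacon geometry, the noise values, and the function names are all assumptions chosen for illustration.

```python
import math

BEACON_HEIGHT = 10.0  # assumed geometry: beacon directly above the origin


def h(x):
    """Nonlinear measurement model: range from position x to the beacon."""
    return math.hypot(x, BEACON_HEIGHT)


def h_jacobian(x):
    """Derivative of h with respect to x."""
    return x / math.hypot(x, BEACON_HEIGHT)


def ekf_step(x, P, z, Q, R):
    """One predict/update cycle of a scalar EKF with a static state model."""
    # Predict: the state is assumed constant; uncertainty grows by Q.
    x_pred, P_pred = x, P + Q
    # Update: linearize the measurement model around the prediction.
    H = h_jacobian(x_pred)
    S = H * P_pred * H + R            # innovation variance
    K = P_pred * H / S                # Kalman gain
    x_new = x_pred + K * (z - h(x_pred))
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new


def run(x0, P0, true_x=30.0, steps=20):
    """Filter noise-free range measurements starting from guess x0, variance P0."""
    x, P = x0, P0
    z = h(true_x)
    for _ in range(steps):
        x, P = ekf_step(x, P, z, Q=0.0, R=1.0)
    return x, P


# Same wrong initial guess, two different initial variances.
x_loose, _ = run(x0=20.0, P0=100.0)  # admits the guess is uncertain
x_tight, _ = run(x0=20.0, P0=0.01)   # overconfident in the wrong guess
err_loose = abs(x_loose - 30.0)
err_tight = abs(x_tight - 30.0)
```

With the same number of measurements, the overconfident initialization remains much farther from the true position, which is the kind of initial-condition sensitivity that a convergence-time modification targets.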
The quantitative and qualitative properties of wheat gluten are among the most important quality indicators for flour, as they are, to a large extent, the determining factor for the flour's end use.
A random sample of the kinds of bread consumed in the coastal region during the
years of research was taken, and the percentages of fiber and protein were calculated. The
effect of the flour mixture approved annually for use in mills on the fiber and protein
content of purveyance flour was also studied.
The study showed the important role of the mixture: the percentage of fiber
increased from 1.06% in 2009 to 1.61% in 2010, and the percentage of protein
increased from 11.36% in 2010 to 13.90% in 2012. The results show that
some, but not all, government mills add soft bran fiber and protein-rich flour,
taking into consideration the impact of the technological processes applied throughout the
stages of bread manufacturing, in particular fermentation and baking.
Ziziphora canescens is an important medicinal plant species in Syria owing
to its antibiotic properties and its use as a flavoring and spice in various
foods. Given the importance of this plant, especially in folk medicine in some
areas (Kalamoon), on the one hand, and the decline of its distribution, which
may eventually lead to its extinction, on the other, a protocol for rapid
micropropagation has been developed using lateral and apical buds on MS
nutrient medium supplemented with different types and concentrations of
plant growth regulators.