ﻻ يوجد ملخص باللغة العربية
Violence is an epidemic in Brazil and a problem on the rise world-wide. Mobile devices provide communication technologies which can be used to monitor and alert about violent situations. However, current solutions, like panic buttons or safe words, might increase the loss of life in violent situations. We propose an embedded artificial intelligence solution, using natural language and speech processing technology, to silently alert someone who can help in this situation. The corpus used contains 400 positive phrases and 800 negative phrases, totaling 1,200 sentences which are classified using two well-known extraction methods for natural language processing tasks: bag-of-words and word embeddings and classified with a support vector machine. We describe the proof-of-concept product in development with promising results, indicating a path towards a commercial product. More importantly we show that model improvements via word embeddings and data augmentation techniques provide an intrinsically robust model. The final embedded solution also has a small footprint of less than 10 MB.
Visual speech recognition (VSR) is the task of recognizing spoken language from video input only, without any audio. VSR has many applications as an assistive technology, especially if it could be deployed in mobile devices and embedded systems. The
Language is inherent and compulsory for human communication. Whether expressed in a written or spoken way, it ensures understanding between people of the same and different regions. With the growing awareness and effort to include more low-resourced
Language modeling (LM) for automatic speech recognition (ASR) does not usually incorporate utterance level contextual information. For some domains like voice assistants, however, additional context, such as the time at which an utterance was spoken,
End-to-end automatic speech recognition (ASR) systems are increasingly popular due to their relative architectural simplicity and competitive performance. However, even though the average accuracy of these systems may be high, the performance on rare
The recent developments in technology have re-warded us with amazing audio synthesis models like TACOTRON and WAVENETS. On the other side, it poses greater threats such as speech clones and deep fakes, that may go undetected. To tackle these alarming