ﻻ يوجد ملخص باللغة العربية
The goal of this paper is the automatic identification of characters in TV and feature film material. In contrast to standard approaches to this task, which rely on the weak supervision afforded by transcripts and subtitles, we propose a new method requiring only a cast list. This list is used to obtain images of actors from freely available sources on the web, providing a form of partial supervision for this task. In using images of actors to recognize characters, we make the following three contributions: (i) We demonstrate that an automated semi-supervised learning approach is able to adapt from the actors face to the characters face, including the face context of the hair; (ii) By building voice models for every character, we provide a bridge between frontal faces (for which there is plenty of actor-level supervision) and profile (for which there is very little or none); and (iii) by combining face context and speaker identification, we are able to identify characters with partially occluded faces and extreme facial poses. Results are presented on the TV series Sherlock and the feature film Casablanca. We achieve the state-of-the-art on the Casablanca benchmark, surpassing previous methods that have used the stronger supervision available from transcripts.
We describe a novel line-level script identification method. Previous work repurposed an OCR model generating per-character script codes, counted to obtain line-level script identification. This has two shortcomings. First, as a sequence-to-sequence
Handwritten character recognition (HCR) is a challenging learning problem in pattern recognition, mainly due to similarity in structure of characters, different handwriting styles, noisy datasets and a large variety of languages and scripts. HCR prob
Todays popular TV series tend to develop continuous, complex plots spanning several seasons, but are often viewed in controlled and discontinuous conditions. Consequently, most viewers need to be re-immersed in the story before watching a new season.
Multi-lingual script identification is a difficult task consisting of different language with complex backgrounds in scene text images. According to the current research scenario, deep neural networks are employed as teacher models to train a smaller
For over a decade, TV series have been drawing increasing interest, both from the audience and from various academic fields. But while most viewers are hooked on the continuous plots of TV serials, the few annotated datasets available to researchers