GAVIN: Gaze-Assisted Voice-Based Implicit Note-taking


Published by Anam Ahmad Khan

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Anam Ahmad Khan - Joshua Newn - Ryan Kelly

Human-Computer Interaction


Abstract

Annotation is an effective reading strategy people often undertake while interacting with digital text. It involves highlighting pieces of text and making notes about them. Annotating while reading in a desktop environment is considered trivial but, in a mobile setting where people read while hand-holding devices, the task of highlighting and typing notes on a mobile display is challenging. In this paper, we introduce GAVIN, a gaze-assisted voice note-taking application, which enables readers to seamlessly take voice notes on digital documents by implicitly anchoring them to text passages. We first conducted a contextual enquiry focusing on participants' note-taking practices on digital documents. Using these findings, we propose a method which leverages eye-tracking and machine learning techniques to annotate voice notes with reference text passages. To evaluate our approach, we recruited 32 participants to perform voice note-taking. We then trained a classifier on the collected data to predict the text passage about which participants made voice notes. Lastly, we employed the classifier to build GAVIN and conducted a user study to demonstrate the feasibility of the system. This research demonstrates the feasibility of using gaze as a resource for implicit anchoring of voice notes, enabling the design of systems that allow users to record voice notes with minimal effort and high accuracy.


101 - Xucong Zhang , Yusuke Sugano , Andreas Bulling 2019

Appearance-based gaze estimation methods that only require an off-the-shelf camera have significantly improved but they are still not yet widely used in the human-computer interaction (HCI) community. This is partly because it remains unclear how they perform compared to model-based approaches as well as dominant, special-purpose eye tracking equipment. To address this limitation, we evaluate the performance of state-of-the-art appearance-based gaze estimation for interaction scenarios with and without personal calibration, indoors and outdoors, for different sensing distances, as well as for users with and without glasses. We discuss the obtained findings and their implications for the most important gaze-based applications, namely explicit eye input, attentive user interfaces, gaze-based user modelling, and passive eye monitoring. To democratise the use of appearance-based gaze estimation and interaction in HCI, we finally present OpenGaze (www.opengaze.org), the first software toolkit for appearance-based gaze estimation and interaction.

Human-Computer Interaction

Gaze Estimation for Assisted Living Environments

82 - Philipe A. Dias , Damiano Malafronte , Henry Medeiros 2019

Effective assisted living environments must be able to perform inferences on how their occupants interact with one another as well as with surrounding objects. To accomplish this goal using a vision-based automated approach, multiple tasks such as pose estimation, object segmentation and gaze estimation must be addressed. Gaze direction in particular provides some of the strongest indications of how a person interacts with the environment. In this paper, we propose a simple neural network regressor that estimates the gaze direction of individuals in a multi-camera assisted living scenario, relying only on the relative positions of facial keypoints collected from a single pose estimation model. To handle cases of keypoint occlusion, our model exploits a novel confidence gated unit in its input layer. In addition to the gaze direction, our model also outputs an estimation of its own prediction uncertainty. Experimental results on a public benchmark demonstrate that our approach performs on par with a complex, dataset-specific baseline, while its uncertainty predictions are highly correlated to the actual angular error of corresponding estimations. Finally, experiments on images from a real assisted living environment demonstrate the higher suitability of our model for its final application.

Computer Vision and Pattern Recognition, Image and Video Processing

VoiceCoach: Interactive Evidence-based Training for Voice Modulation Skills in Public Speaking

56 - Xingbo Wang , Haipeng Zeng , Yong Wang 2020

The modulation of voice properties, such as pitch, volume, and speed, is crucial for delivering a successful public speech. However, it is challenging to master different voice modulation skills. Though many guidelines are available, they are often not practical enough to be applied in different public speaking situations, especially for novice speakers. We present VoiceCoach, an interactive evidence-based approach to facilitate the effective training of voice modulation skills. Specifically, we have analyzed the voice modulation skills from 2623 high-quality speeches (i.e., TED Talks) and use them as the benchmark dataset. Given a voice input, VoiceCoach automatically recommends good voice modulation examples from the dataset based on the similarity of both sentence structures and voice modulation skills. Immediate and quantitative visual feedback is provided to guide further improvement. The expert interviews and the user study provide support for the effectiveness and usability of VoiceCoach.

Human-Computer Interaction, Computation and Language, Information Retrieval

LookAtChat: Visualizing Gaze Awareness for Remote Small-Group Conversations

93 - Zhenyi He , Ruofei Du , Ken Perlin 2021

Video conferences play a vital role in our daily lives. However, many nonverbal cues are missing, including gaze and spatial information. We introduce LookAtChat, a web-based video conferencing system, which empowers remote users to identify gaze awareness and spatial relationships in small-group conversations. Leveraging real-time eye-tracking technology available with ordinary webcams, LookAtChat tracks each user's gaze direction, identifies who is looking at whom, and provides corresponding spatial cues. Informed by formative interviews with 5 participants who regularly use videoconferencing software, we explored the design space of gaze visualization in both 2D and 3D layouts. We further conducted an exploratory user study (N=20) to evaluate LookAtChat in three conditions: baseline layout, 2D directional layout, and 3D perspective layout. Our findings demonstrate how LookAtChat engages participants in small-group conversations, how gaze and spatial information improve conversation quality, and the potential benefits and challenges to incorporating gaze awareness visualization into existing videoconferencing systems.

Human-Computer Interaction

Gaze-Contingent Retinal Speckle Suppression for Perceptually-Matched Foveated Holographic Displays

89 - Praneeth Chakravarthula , Zhan Zhang , Okan Tursun 2021

Computer-generated holographic (CGH) displays show great potential and are emerging as the next-generation displays for augmented and virtual reality, and automotive heads-up displays. One of the critical problems harming the wide adoption of such displays is the presence of speckle noise inherent to holography, which compromises its quality by introducing perceptible artifacts. Although speckle noise suppression has been an active research area, previous works have not considered the perceptual characteristics of the Human Visual System (HVS), which receives the final displayed imagery. However, it is well studied that the sensitivity of the HVS is not uniform across the visual field, which has led to gaze-contingent rendering schemes for maximizing the perceptual quality in various computer-generated imagery. Inspired by this, we present the first method that reduces the perceived speckle noise by integrating foveal and peripheral vision characteristics of the HVS, along with the retinal point spread function, into the phase hologram computation. Specifically, we introduce the anatomical and statistical retinal receptor distribution into our computational hologram optimization, which places a higher priority on reducing the perceived foveal speckle noise while being adaptable to any individual's optical aberration on the retina. Our method demonstrates superior perceptual quality on our emulated holographic display. Our evaluations with objective measurements and subjective studies demonstrate a significant reduction of the human perceived noise.

Human-Computer Interaction, Multimedia, Image and Video Processing
