ترغب بنشر مسار تعليمي؟ اضغط هنا

Counting Everyday Objects in Everyday Scenes

76   0   0.0 ( 0 )
 تاريخ النشر 2016
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

We are interested in counting the number of instances of object classes in natural, everyday images. Previous counting approaches tackle the problem in restricted domains such as counting pedestrians in surveillance videos. Counts can also be estimated from outputs of other vision tasks like object detection. In this work, we build dedicated models for counting designed to tackle the large variance in counts, appearances, and scales of objects found in natural scenes. Our approach is inspired by the phenomenon of subitizing - the ability of humans to make quick assessments of counts given a perceptual signal, for small count values. Given a natural scene, we employ a divide and conquer strategy while incorporating context across the scene to adapt the subitizing idea to counting. Our approach offers consistent improvements over numerous baseline approaches for counting on the PASCAL VOC 2007 and COCO datasets. Subsequently, we study how counting can be used to improve object detection. We then show a proof of concept application of our counting methods to the task of Visual Question Answering, by studying the `how many? questions in the VQA and COCO-QA datasets.



قيم البحث

اقرأ أيضاً

Designing future IoT ecosystems requires new approaches and perspectives to understand everyday practices. While researchers recognize the importance of understanding social aspects of everyday objects, limited studies have explored the possibilities of combining data-driven patterns with human interpretations to investigate emergent relationships among objects. This work presents Thing Constellation Visualizer (thingCV), a novel interactive tool for visualizing the social network of objects based on their co-occurrence as computed from a large collection of photos. ThingCV enables perspective-changing design explorations over the network of objects with scalable links. Two exploratory workshops were conducted to investigate how designers navigate and make sense of a network of objects through thingCV. The results of eight participants showed that designers were actively engaged in identifying interesting objects and their associated clusters of related objects. The designers projected social qualities onto the identified objects and their communities. Furthermore, the designers changed their perspectives to revisit familiar contexts and to generate new insights through the exploration process. This work contributes a novel approach to combining data-driven models with designerly interpretations of thing constellation towards More-Than Human-Centred Design of IoT ecosystems.
The multiple and pervasive forms of exclusion remain understudied in the emergent everyday segregation literature mainly centered on a single social dimension from a single-city focus. From mobility surveys compiled together (385,000 respondents and 1,711,000 trips) and covering 60% of Frances population, we explore mismatch in hourly population profiles in 2,572 districts with an intersectional point of view. It is especially in areas with strong increase or decrease of population during the day that hourly profiles are found not only to combine the largest dissimilarities within gender, age and educational subgroups but also to be widely more synchronous among dominates (men, middle-age and high educated groups) than among subordinates subgroups (women, elderly and low educated groups). These intersectional space-time patterns provide empirical keys to broaden the scope of exclusion and segregation literature to the times and places when and where peers are well-placed to join forces.
The interaction between the vestibular and ocular system has primarily been studied in controlled environments. Consequently, off-the shelf tools for categorization of gaze events (e.g. fixations, pursuits, saccade) fail when head movements are allow ed. Our approach was to collect a novel, naturalistic, and multimodal dataset of eye+head movements when subjects performed everyday tasks while wearing a mobile eye tracker equipped with an inertial measurement unit and a 3D stereo camera. This Gaze-in-the-Wild dataset (GW) includes eye+head rotational velocities (deg/s), infrared eye images and scene imagery (RGB+D). A portion was labelled by coders into gaze motion events with a mutual agreement of 0.72 sample based Cohens $kappa$. This labelled data was used to train and evaluate two machine learning algorithms, Random Forest and a Recurrent Neural Network model, for gaze event classification. Assessment involved the application of established and novel event based performance metrics. Classifiers achieve $sim$90$%$ human performance in detecting fixations and saccades but fall short (60$%$) on detecting pursuit movements. Moreover, pursuit classification is far worse in the absence of head movement information. A subsequent analysis of feature significance in our best-performing model revealed a reliance upon absolute eye and head velocity, indicating that classification does not require spatial alignment of the head and eye tracking coordinate systems. The GW dataset, trained classifiers and evaluation metrics will be made publicly available with the intention of facilitating growth in the emerging area of head-free gaze event classification.
Since stress contributes to a broad range of mental and physical health problems, the objective assessment of stress is essential for behavioral and physiological studies. Although several studies have evaluated stress levels in controlled settings, objective stress assessment in everyday settings is still largely under-explored due to challenges arising from confounding contextual factors and limited adherence for self-reports. In this paper, we explore the objective prediction of stress levels in everyday settings based on heart rate (HR) and heart rate variability (HRV) captured via low-cost and easy-to-wear photoplethysmography (PPG) sensors that are widely available on newer smart wearable devices. We present a layered system architecture for personalized stress monitoring that supports a tunable collection of data samples for labeling, and present a method for selecting informative samples from the stream of real-time data for labeling. We captured the stress levels of fourteen volunteers through self-reported questionnaires over periods of between 1-3 months, and explored binary stress detection based on HR and HRV using Machine Learning Methods. We observe promising preliminary results given that the dataset is collected in the challenging environments of everyday settings. The binary stress detector is fairly accurate and can detect stressful vs non-stressful samples with a macro-F1 score of up to %76. Our study lays the groundwork for more sophisticated labeling strategies that generate context-aware, personalized models that will empower health professionals to provide personalized interventions.
The learning of a new language remains to this date a cognitive task that requires considerable diligence and willpower, recent advances and tools notwithstanding. In this paper, we propose Broccoli, a new paradigm aimed at reducing the required effo rt by seamlessly embedding vocabulary learning into users everyday information diets. This is achieved by inconspicuously switching chosen words encountered by the user for their translation in the target language. Thus, by seeing words in context, the user can assimilate new vocabulary without much conscious effort. We validate our approach in a careful user study, finding that the efficacy of the lightweight Broccoli approach is competitive with traditional, memorization-based vocabulary learning. The low cognitive overhead is manifested in a pronounced decrease in learners usage of mnemonic learning strategies, as compared to traditional learning. Finally, we establish that language patterns in typical information diets are compatible with spaced-repetition strategies, thus enabling an efficient use of the Broccoli paradigm. Overall, our work establishes the feasibility of a novel and powerful install-and-forget approach for embedded language acquisition.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا