No Arabic-language abstract is available.
This paper presents a computational model of concept learning that uses Bayesian inference over a grammatically structured hypothesis space, and tests the model on multisensory (visual and haptic) recognition of 3D objects. The study uses a set of artificially generated 3D objects known as fribbles, which are complex, multipart objects with categorical structure. The goal of this work is to develop a working multisensory representational model that integrates major themes on concepts and concept learning from the cognitive science literature. The model combines the representational power of a probabilistic generative grammar with the inferential power of Bayesian induction.
In this paper, we introduce a new problem, named audio-visual video parsing, which aims to parse a video into temporal event segments and label them as either audible, visible, or both. Such a problem is essential for a complete understanding of the
Learning to understand and infer object functionalities is an important step towards robust visual intelligence. Significant research efforts have recently focused on segmenting the object parts that enable specific types of human-object interaction,
The study of object representations in computer vision has primarily focused on developing representations that are useful for image classification, object detection, or semantic segmentation as downstream tasks. In this work we aim to learn object r
Autonomous vehicles and mobile robotic systems are typically equipped with multiple sensors to provide redundancy. By integrating the observations from different sensors, these mobile agents are able to perceive the environment and estimate system st
This paper introduces a spiking hierarchical model for object recognition which utilizes the precise timing information inherently present in the output of biologically inspired asynchronous Address Event Representation (AER) vision sensors. The asyn