No Arabic abstract
Fashion is the way we present ourselves to the world and has become one of the worlds largest industries. Fashion, mainly conveyed by vision, has thus attracted much attention from computer vision researchers in recent years. Given the rapid development, this paper provides a comprehensive survey of more than 200 major fashion-related works covering four main aspects for enabling intelligent fashion: (1) Fashion detection includes landmark detection, fashion parsing, and item retrieval, (2) Fashion analysis contains attribute recognition, style learning, and popularity prediction, (3) Fashion synthesis involves style transfer, pose transformation, and physical simulation, and (4) Fashion recommendation comprises fashion compatibility, outfit matching, and hairstyle suggestion. For each task, the benchmark datasets and the evaluation protocols are summarized. Furthermore, we highlight promising directions for future research.
Polar ice cores play a central role in studies of the earths climate system through natural archives. A pressing issue is the analysis of the oldest, highly thinned ice core sections, where the identification of paleoclimate signals is particularly challenging. For this, state-of-the-art imaging by laser-ablation inductively-coupled plasma mass spectrometry (LA-ICP-MS) has the potential to be revolutionary due to its combination of micron-scale 2D chemical information with visual features. However, the quantitative study of record preservation in chemical images raises new questions that call for the expertise of the computer vision community. To illustrate this new inter-disciplinary frontier, we describe a selected set of key questions. One critical task is to assess the paleoclimate significance of single line profiles along the main core axis, which we show is a scale-dependent problem for which advanced image analysis methods are critical. Another important issue is the evaluation of post-depositional layer changes, for which the chemical images provide rich information. Accordingly, the time is ripe to begin an intensified exchange among the two scientific communities of computer vision and ice core science. The collaborative building of a new framework for investigating high-resolution chemical images with automated image analysis techniques will also benefit the already wide-spread application of LA-ICP-MS chemical imaging in the geosciences.
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
Computer Vision, either alone or combined with other technologies such as radar or Lidar, is one of the key technologies used in Advanced Driver Assistance Systems (ADAS). Its role understanding and analysing the driving scene is of great importance as it can be noted by the number of ADAS applications that use this technology. However, porting a vision algorithm to an embedded automotive system is still very challenging, as there must be a trade-off between several design requisites. Furthermore, there is not a standard implementation platform, so different alternatives have been proposed by both the scientific community and the industry. This paper aims to review the requisites and the different embedded implementation platforms that can be used for Computer Vision-based ADAS, with a critical analysis and an outlook to future trends.
The digital Michelangelo project was a seminal computer vision project in the early 2000s that pushed the capabilities of acquisition systems and involved multiple people from diverse fields, many of whom are now leaders in industry and academia. Reviewing this project with modern eyes provides us with the opportunity to reflect on several issues, relevant now as then to the field of computer vision and research in general, that go beyond the technical aspects of the work. This article was written in the context of a reading group competition at the week-long International Computer Vision Summer School 2017 (ICVSS) on Sicily, Italy. To deepen the participants understanding of computer vision and to foster a sense of community, various reading groups were tasked to highlight important lessons which may be learned from provided literature, going beyond the contents of the paper. This report is the winning entry of this guided discourse (Fig. 1). The authors closely examined the origins, fruits and most importantly lessons about research in general which may be distilled from the digital Michelangelo project. Discussions leading to this report were held within the group as well as with Hao Li, the group mentor.
The representation of images in the brain is known to be sparse. That is, as neural activity is recorded in a visual area ---for instance the primary visual cortex of primates--- only a few neurons are active at a given time with respect to the whole population. It is believed that such a property reflects the efficient match of the representation with the statistics of natural scenes. Applying such a paradigm to computer vision therefore seems a promising approach towards more biomimetic algorithms. Herein, we will describe a biologically-inspired approach to this problem. First, we will describe an unsupervised learning paradigm which is particularly adapted to the efficient coding of image patches. Then, we will outline a complete multi-scale framework ---SparseLets--- implementing a biologically inspired sparse representation of natural images. Finally, we will propose novel methods for integrating prior information into these algorithms and provide some preliminary experimental results. We will conclude by giving some perspective on applying such algorithms to computer vision. More specifically, we will propose that bio-inspired approaches may be applied to computer vision using predictive coding schemes, sparse models being one simple and efficient instance of such schemes.