ترغب بنشر مسار تعليمي؟ اضغط هنا

Confusing Deep Convolution Networks by Relabelling

163   0   0.0 ( 0 )
 نشر من قبل Leigh Robinson
 تاريخ النشر 2015
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Deep convolutional neural networks have become the gold standard for image recognition tasks, demonstrating many current state-of-the-art results and even achieving near-human level performance on some tasks. Despite this fact it has been shown that their strong generalisation qualities can be fooled to misclassify previously correctly classified natural images and give erroneous high confidence classifications to nonsense synthetic images. In this paper we extend that work, by presenting a straightforward way to perturb an image in such a way as to cause it to acquire any other label from within the dataset while leaving this perturbed image visually indistinguishable from the original.

قيم البحث

اقرأ أيضاً

Deep learning has outperformed other machine learning algorithms in a variety of tasks, and as a result, it has become more and more popular and used. However, as other machine learning algorithms, deep learning, and convolutional neural networks (CN Ns) in particular, perform worse when the data sets present label noise. Therefore, it is important to develop algorithms that help the training of deep networks and their generalization to noise-free test sets. In this paper, we propose a robust training strategy against label noise, called RAFNI, that can be used with any CNN. This algorithm filters and relabels instances of the training set based on the predictions and their probabilities made by the backbone neural network during the training process. That way, this algorithm improves the generalization ability of the CNN on its own. RAFNI consists of three mechanisms: two mechanisms that filter instances and one mechanism that relabels instances. In addition, it does not suppose that the noise rate is known nor does it need to be estimated. We evaluated our algorithm using different data sets of several sizes and characteristics. We also compared it with state-of-the-art models using the CIFAR10 and CIFAR100 benchmarks under different types and rates of label noise and found that RAFNI achieves better results in most cases.
324 - Audrey Chung , Paul Fieguth , 2019
Evolutionary deep intelligence has recently shown great promise for producing small, powerful deep neural network models via the synthesis of increasingly efficient architectures over successive generations. Despite recent research showing the effica cy of multi-parent evolutionary synthesis, little has been done to directly assess architectural similarity between networks during the synthesis process for improved parent network selection. In this work, we present a preliminary study into quantifying architectural similarity via the percentage overlap of architectural clusters. Results show that networks synthesized using architectural alignment (via gene tagging) maintain higher architectural similarities within each generation, potentially restricting the search space of highly efficient network architectures.
A recent paper by Gatys et al. describes a method for rendering an image in the style of another image. First, they use convolutional neural network features to build a statistical model for the style of an image. Then they create a new image with th e content of one image but the style statistics of another image. Here, we extend this method to render a movie in a given artistic style. The naive solution that independently renders each frame produces poor results because the features of the style move substantially from one frame to the next. The other naive method that initializes the optimization for the next frame using the rendered version of the previous frame also produces poor results because the features of the texture stay fixed relative to the frame of the movie instead of moving with objects in the scene. The main contribution of this paper is to use optical flow to initialize the style transfer optimization so that the texture features move with the objects in the video. Finally, we suggest a method to incorporate optical flow explicitly into the cost function.
Deep convolutional neural networks (ConvNets) of 3-dimensional kernels allow joint modeling of spatiotemporal features. These networks have improved performance of video and volumetric image analysis, but have been limited in size due to the low memo ry ceiling of GPU hardware. Existing CPU implementations overcome this constraint but are impractically slow. Here we extend and optimize the faster Winograd-class of convolutional algorithms to the $N$-dimensional case and specifically for CPU hardware. First, we remove the need to manually hand-craft algorithms by exploiting the relaxed constraints and cheap sparse access of CPU memory. Second, we maximize CPU utilization and multicore scalability by transforming data matrices to be cache-aware, integer multiples of AVX vector widths. Treating 2-dimensional ConvNets as a special (and the least beneficial) case of our approach, we demonstrate a 5 to 25-fold improvement in throughput compared to previous state-of-the-art.
Convolutional neural networks (CNNs) have made great breakthroughs in 2D computer vision. However, the irregular structure of meshes makes it hard to exploit the power of CNNs directly. A subdivision surface provides a hierarchical multi-resolution s tructure, and each face in a closed 2-manifold triangle mesh is exactly adjacent to three faces. Motivated by these two properties, this paper introduces a novel and flexible CNN framework, named SubdivNet, for 3D triangle meshes with Loop subdivision sequence connectivity. Making an analogy between mesh faces and pixels in a 2D image allows us to present a mesh convolution operator to aggregate local features from adjacent faces. By exploiting face neighborhoods, this convolution can support standard 2D convolutional network concepts, e.g. variable kernel size, stride, and dilation. Based on the multi-resolution hierarchy, we propose a spatial uniform pooling layer which merges four faces into one and an upsampling method which splits one face into four. As a result, many popular 2D CNN architectures can be readily adapted to processing 3D meshes. Meshes with arbitrary connectivity can be remeshed to hold Loop subdivision sequence connectivity via self-parameterization, making SubdivNet a general approach. Experiments on mesh classification, segmentation, correspondence, and retrieval from the real-world demonstrate the effectiveness and efficiency of SubdivNet.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا