ترغب بنشر مسار تعليمي؟ اضغط هنا

There has been significant amount of research work on human activity classification relying either on Inertial Measurement Unit (IMU) data or data from static cameras providing a third-person view. Using only IMU data limits the variety and complexit y of the activities that can be detected. For instance, the sitting activity can be detected by IMU data, but it cannot be determined whether the subject has sat on a chair or a sofa, or where the subject is. To perform fine-grained activity classification from egocentric videos, and to distinguish between activities that cannot be differentiated by only IMU data, we present an autonomous and robust method using data from both ego-vision cameras and IMUs. In contrast to convolutional neural network-based approaches, we propose to employ capsule networks to obtain features from egocentric video data. Moreover, Convolutional Long Short Term Memory framework is employed both on egocentric videos and IMU data to capture temporal aspect of actions. We also propose a genetic algorithm-based approach to autonomously and systematically set various network parameters, rather than using manual settings. Experiments have been performed to perform 9- and 26-label activity classification, and the proposed method, using autonomously set network parameters, has provided very promising results, achieving overall accuracies of 86.6% and 77.2%, respectively. The proposed approach combining both modalities also provides increased accuracy compared to using only egovision data and only IMU data.
Neural networks are known to be vulnerable to carefully crafted adversarial examples, and these malicious samples often transfer, i.e., they maintain their effectiveness even against other models. With great efforts delved into the transferability of adversarial examples, surprisingly, less attention has been paid to its impact on real-world deep learning deployment. In this paper, we investigate the transferability of adversarial examples across a wide range of real-world computer vision tasks, including image classification, explicit content detection, optical character recognition (OCR), and object detection. It represents the cybercriminals situation where an ensemble of different detection mechanisms need to be evaded all at once. We propose practical attack that overcomes existing attacks limitation of requiring task-specific loss functions by targeting on the `dispersion of internal feature map. We report evaluation on four different computer vision tasks provided by Google Cloud Vision APIs to show how our approach outperforms existing attacks by degrading performance of multiple CV tasks by a large margin with only modest perturbations.
The choice of parameters, and the design of the network architecture are important factors affecting the performance of deep neural networks. Genetic Algorithms (GA) have been used before to determine parameters of a network. Yet, GAs perform a finit e search over a discrete set of pre-defined candidates, and cannot, in general, generate unseen configurations. In this paper, to move from exploration to exploitation, we propose a novel and systematic method that autonomously and simultaneously optimizes multiple parameters of any deep neural network by using a GA aided by a bi-generative adversarial network (Bi-GAN). The proposed Bi-GAN allows the autonomous exploitation and choice of the number of neurons, for fully-connected layers, and number of filters, for convolutional layers, from a large range of values. Our proposed Bi-GAN involves two generators, and two different models compete and improve each other progressively with a GAN-based strategy to optimize the networks during GA evolution. Our proposed approach can be used to autonomously refine the number of convolutional layers and dense layers, number and size of kernels, and the number of neurons for the dense layers; choose the type of the activation function; and decide whether to use dropout and batch normalization or not, to improve the accuracy of different deep neural network architectures. Without loss of generality, the proposed method has been tested with the ModelNet database, and compared with the 3D Shapenets and two GA-only methods. The results show that the presented approach can simultaneously and successfully optimize multiple neural network parameters, and achieve higher accuracy even with shallower networks.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا