No Arabic abstract
Point patterns are sets or multi-sets of unordered elements that can be found in numerous data sources. However, in data analysis tasks such as classification and novelty detection, appropriate statistical models for point pattern data have not received much attention. This paper proposes the modelling of point pattern data via random finite sets (RFS). In particular, we propose appropriate likelihood functions, and a maximum likelihood estimator for learning a tractable family of RFS models. In novelty detection, we propose novel ranking functions based on RFS models, which substantially improve performance.
Clustering is one of the most common unsupervised learning tasks in machine learning and data mining. Clustering algorithms have been used in a plethora of applications across several scientific fields. However, there has been limited research in the clustering of point patterns - sets or multi-sets of unordered elements - that are found in numerous applications and data sources. In this paper, we propose two approaches for clustering point patterns. The first is a non-parametric method based on novel distances for sets. The second is a model-based approach, formulated via random finite set theory, and solved by the Expectation-Maximization algorithm. Numerical experiments show that the proposed methods perform well on both simulated and real data.
We propose a new method for novelty detection that can tolerate high corruption of the training points, whereas previous works assumed either no or very low corruption. Our method trains a robust variational autoencoder (VAE), which aims to generate a model for the uncorrupted training points. To gain robustness to high corruption, we incorporate the following four changes to the common VAE: 1. Extracting crucial features of the latent code by a carefully designed dimension reduction component for distributions; 2. Modeling the latent distribution as a mixture of Gaussian low-rank inliers and full-rank outliers, where the testing only uses the inlier model; 3. Applying the Wasserstein-1 metric for regularization, instead of the Kullback-Leibler (KL) divergence; and 4. Using a least absolute deviation error for reconstruction. We establish both robustness to outliers and suitability to low-rank modeling of the Wasserstein metric as opposed to the KL divergence. We illustrate state-of-the-art results on standard benchmarks for novelty detection.
Novelty detection in discrete sequences is a challenging task, since deviations from the process generating the normal data are often small or intentionally hidden. Novelties can be detected by modeling normal sequences and measuring the deviations of a new sequence from the model predictions. However, in many applications data is generated by several distinct processes so that models trained on all the data tend to over-generalize and novelties remain undetected. We propose to approach this challenge through decomposition: by clustering the data we break down the problem, obtaining simpler modeling task in each cluster which can be modeled more accurately. However, this comes at a trade-off, since the amount of training data per cluster is reduced. This is a particular problem for discrete sequences where state-of-the-art models are data-hungry. The success of this approach thus depends on the quality of the clustering, i.e., whether the individual learning problems are sufficiently simpler than the joint problem. While clustering discrete sequences automatically is a challenging and domain-specific task, it is often easy for human domain experts, given the right tools. In this paper, we adapt a state-of-the-art visual analytics tool for discrete sequence clustering to obtain informed clusters from domain experts and use LSTMs to model each cluster individually. Our extensive empirical evaluation indicates that this informed clustering outperforms automatic ones and that our approach outperforms state-of-the-art novelty detection methods for discrete sequences in three real-world application scenarios. In particular, decomposition outperforms a global model despite less training data on each individual cluster.
Research on automated, image based identification of clothing categories and fashion landmarks has recently gained significant interest due to its potential impact on areas such as robotic clothing manipulation, automated clothes sorting and recycling, and online shopping. Several public and annotated fashion datasets have been created to facilitate research advances in this direction. In this work, we make the first step towards leveraging the data and techniques developed for fashion image analysis in vision-based robotic clothing manipulation tasks. We focus on techniques that can generalize from large-scale fashion datasets to less structured, small datasets collected in a robotic lab. Specifically, we propose training data augmentation methods such as elastic warping, and model adjustments such as rotation invariant convolutions to make the model generalize better. Our experiments demonstrate that our approach outperforms stateof-the art models with respect to clothing category classification and fashion landmark detection when tested on previously unseen datasets. Furthermore, we present experimental results on a new dataset composed of images where a robot holds different garments, collected in our lab.
Machine Learning has become very famous currently which assist in identifying the patterns from the raw data. Technological advancement has led to substantial improvement in Machine Learning which, thus helping to improve prediction. Current Machine Learning models are based on Classical Theory, which can be replaced by Quantum Theory to improve the effectiveness of the model. In the previous work, we developed binary classifier inspired by Quantum Detection Theory. In this extended abstract, our main goal is to develop multi-class classifier. We generally use the terminology multinomial classification or multi-class classification when we have a classification problem for classifying observations or instances into one of three or more classes.