Distributed dual vigilance fuzzy adaptive resonance theory learns online, retrieves arbitrarily-shaped clusters, and mitigates order dependence

53 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Leonardo Enzo Brito Da Silva

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Leonardo Enzo Brito da Silva - Islam Elnabarawy - Donald C. Wunsch II

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper presents a novel adaptive resonance theory (ART)-based modular architecture for unsupervised learning, namely the distributed dual vigilance fuzzy ART (DDVFA). DDVFA consists of a global ART system whose nodes are local fuzzy ART modules. It is equipped with the distinctive features of distributed higher-order activation and match functions, using dual vigilance parameters responsible for cluster similarity and data quantization. Together, these allow DDVFA to perform unsupervised modularization, create multi-prototype clustering representations, retrieve arbitrarily-shaped clusters, and control its compactness. Another important contribution is the reduction of order-dependence, an issue that affects any agglomerative clustering method. This paper demonstrates two approaches for mitigating order-dependence: preprocessing using visual assessment of cluster tendency (VAT) or postprocessing using a novel Merge ART module. The former is suitable for batch processing, whereas the latter can be used in online learning. Experimental results in the online learning mode carried out on 30 benchmark data sets show that DDVFA cascaded with Merge ART statistically outperformed the best other ART-based systems when samples were randomly presented. Conversely, they were found to be statistically equivalent in the offline mode when samples were pre-processed using VAT. Remarkably, performance comparisons to non-ART-based clustering algorithms show that DDVFA (which learns incrementally) was also statistically equivalent to the non-incremental (offline) methods of DBSCAN, single linkage hierarchical agglomerative clustering (HAC), and k-means, while retaining the appealing properties of ART. Links to the source code and data are provided. Considering the algorithms simplicity, online learning capability, and performance, it is an ideal choice for many agglomerative clustering applications.

قيم البحث

318 - Leonardo Enzo Brito da Silva , Islam Elnabarawy , Donald C. Wunsch II 2019

This survey samples from the ever-growing family of adaptive resonance theory (ART) neural network models used to perform the three primary machine learning modalities, namely, unsupervised, supervised and reinforcement learning. It comprises a repre sentative list from classic to modern ART models, thereby painting a general picture of the architectures developed by researchers over the past 30 years. The learning dynamics of these ART models are briefly described, and their distinctive characteristics such as code representation, long-term memory and corresponding geometric interpretation are discussed. Useful engineering properties of ART (speed, configurability, explainability, parallelization and hardware implementation) are examined along with current challenges. Finally, a compilation of online software libraries is provided. It is expected that this overview will be helpful to new and seasoned ART researchers.

الحوسبة العصبية والتطورية التعلم الآلي التعلم الالي

Variable Binding for Sparse Distributed Representations: Theory and Applications

68 - E. Paxon Frady , Denis Kleyko , Friedrich T. Sommer 2020

Symbolic reasoning and neural networks are often considered incompatible approaches. Connectionist models known as Vector Symbolic Architectures (VSAs) can potentially bridge this gap. However, classical VSAs and neural networks are still considered incompatible. VSAs encode symbols by dense pseudo-random vectors, where information is distributed throughout the entire neuron population. Neural networks encode features locally, often forming sparse vectors of neural activation. Following Rachkovskij (2001); Laiho et al. (2015), we explore symbolic reasoning with sparse distributed representations. The core operations in VSAs are dyadic operations between vectors to express variable binding and the representation of sets. Thus, algebraic manipulations enable VSAs to represent and process data structures in a vector space of fixed dimensionality. Using techniques from compressed sensing, we first show that variable binding between dense vectors in VSAs is mathematically equivalent to tensor product binding between sparse vectors, an operation which increases dimensionality. This result implies that dimensionality-preserving binding for general sparse vectors must include a reduction of the tensor matrix into a single sparse vector. Two options for sparsity-preserving variable binding are investigated. One binding method for general sparse vectors extends earlier proposals to reduce the tensor product into a vector, such as circular convolution. The other method is only defined for sparse block-codes, block-wise circular convolution. Our experiments reveal that variable binding for block-codes has ideal properties, whereas binding for general sparse vectors also works, but is lossy, similar to previous proposals. We demonstrate a VSA with sparse block-codes in example applications, cognitive reasoning and classification, and discuss its relevance for neuroscience and neural networks.

الحوسبة العصبية والتطورية التعلم الآلي

Distributed Adaptive Nearest Neighbor Classifier: Algorithm and Theory

98 - Ruiqi Liu , Ganggang Xu , Zuofeng Shang 2021

When data is of an extraordinarily large size or physically stored in different locations, the distributed nearest neighbor (NN) classifier is an attractive tool for classification. We propose a novel distributed adaptive NN classifier for which the number of nearest neighbors is a tuning parameter stochastically chosen by a data-driven criterion. An early stopping rule is proposed when searching for the optimal tuning parameter, which not only speeds up the computation but also improves the finite sample performance of the proposed Algorithm. Convergence rate of excess risk of the distributed adaptive NN classifier is investigated under various sub-sample size compositions. In particular, we show that when the sub-sample sizes are sufficiently large, the proposed classifier achieves the nearly optimal convergence rate. Effectiveness of the proposed approach is demonstrated through simulation studies as well as an empirical application to a real-world dataset.

التعلم الالي التعلم الآلي المنهجية

Message Passing Adaptive Resonance Theory for Online Active Semi-supervised Learning

96 - Taehyeong Kim , Injune Hwang , Hyundo Lee 2020

Active learning is widely used to reduce labeling effort and training time by repeatedly querying only the most beneficial samples from unlabeled data. In real-world problems where data cannot be stored indefinitely due to limited storage or privacy issues, the query selection and the model update should be performed as soon as a new data sample is observed. Various online active learning methods have been studied to deal with these challenges; however, there are difficulties in selecting representative query samples and updating the model efficiently without forgetting. In this study, we propose Message Passing Adaptive Resonance Theory (MPART) that learns the distribution and topology of input data online. Through message passing on the topological graph, MPART actively queries informative and representative samples, and continuously improves the classification performance using both labeled and unlabeled data. We evaluate our model in stream-based selective sampling scenarios with comparable query selection strategies, showing that MPART significantly outperforms competitive models.

التعلم الآلي الذكاء الاصطناعي

Learning Various Length Dependence by Dual Recurrent Neural Networks

75 - Chenpeng Zhang 2020

Recurrent neural networks (RNNs) are widely used as a memory model for sequence-related problems. Many variants of RNN have been proposed to solve the gradient problems of training RNNs and process long sequences. Although some classical models have been proposed, capturing long-term dependence while responding to short-term changes remains a challenge. To this problem, we propose a new model named Dual Recurrent Neural Networks (DuRNN). The DuRNN consists of two parts to learn the short-term dependence and progressively learn the long-term dependence. The first part is a recurrent neural network with constrained full recurrent connections to deal with short-term dependence in sequence and generate short-term memory. Another part is a recurrent neural network with independent recurrent connections which helps to learn long-term dependence and generate long-term memory. A selection mechanism is added between two parts to help the needed long-term information transfer to the independent neurons. Multiple modules can be stacked to form a multi-layer model for better performance. Our contributions are: 1) a new recurrent model developed based on the divide-and-conquer strategy to learn long and short-term dependence separately, and 2) a selection mechanism to enhance the separating and learning of different temporal scales of dependence. Both theoretical analysis and extensive experiments are conducted to validate the performance of our model, and we also conduct simple visualization experiments and ablation analyses for the model interpretability. Experimental results indicate that the proposed DuRNN model can handle not only very long sequences (over 5000 time steps), but also short sequences very well. Compared with many state-of-the-art RNN models, our model has demonstrated efficient and better performance.

الحوسبة العصبية والتطورية الحساب واللغة الرؤية الحاسوبية وتمييز الأنماط