103 - M. Andrecut 2021
We discuss a diffusion-based implementation of the self-organizing map on the unit hypersphere. We show that this approach can be implemented efficiently using only linear-algebra methods, we give a Python/NumPy implementation, and we illustrate the approach using the well-known MNIST dataset.
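As a rough illustration of a self-organizing map whose prototypes live on the unit hypersphere, the sketch below uses cosine-similarity matching and renormalizes the weights after every update. This is only a minimal sketch, not the paper's diffusion scheme; the function name, grid topology (a 1-D ring of m units), and the learning schedule are all illustrative choices.

```python
import numpy as np

def spherical_som(X, m=16, epochs=10, sigma0=2.0, seed=0):
    """Toy SOM with unit-norm prototypes on a 1-D grid (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(m, X.shape[1]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)     # start on the sphere
    grid = np.arange(m)
    for t in range(epochs):
        sigma = sigma0 * 0.5 ** t                     # shrinking neighborhood
        for x in X:
            x = x / np.linalg.norm(x)                 # project sample to the sphere
            b = np.argmax(W @ x)                      # best match by cosine similarity
            h = np.exp(-(grid - b) ** 2 / (2 * sigma ** 2))  # neighborhood kernel
            W += h[:, None] * (x - W)                 # pull neighbors toward x
            W /= np.linalg.norm(W, axis=1, keepdims=True)    # back to the sphere
    return W
```

Because every update is a rank-one correction followed by row normalization, the whole procedure is indeed expressible with plain linear-algebra primitives, which is the spirit of the abstract.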
69 - M. Andrecut 2020
We combine K-means clustering with the least-squares kernel classification method. K-means clustering is used to extract a set of representative vectors for each class. The least-squares kernel method uses these representative vectors as a training set for the classification task. We show that this combination of unsupervised and supervised learning algorithms performs very well, and we illustrate this approach using the MNIST dataset.
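The pipeline described above can be sketched in a few NumPy functions: per-class K-means to get representative vectors, then a regularized least-squares fit in an RBF kernel space over those representatives. The function names, the Gaussian kernel, and all parameter values are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns k centroids of X."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        lab = d.argmin(1)
        for j in range(k):
            if np.any(lab == j):
                C[j] = X[lab == j].mean(0)
    return C

def rbf(A, B, gamma):
    return np.exp(-gamma * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))

def fit_ls_kernel(centers, y_onehot, gamma=0.5, reg=1e-6):
    """Least-squares fit of kernel weights on the representative vectors."""
    K = rbf(centers, centers, gamma)
    return np.linalg.solve(K + reg * np.eye(len(K)), y_onehot)

def predict(X, centers, A, gamma=0.5):
    return (rbf(X, centers, gamma) @ A).argmax(1)
```

The attraction of this split is that the expensive kernel solve runs only on the small set of K-means representatives rather than on the full training set.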
23 - M. Andrecut 2018
In this paper we explore the vector-semantics problem from the perspective of the almost-orthogonality property of high-dimensional random vectors. We show that this intriguing property can be used to memorize random vectors by simply adding them, and we provide an efficient probabilistic solution to the set membership problem. We also discuss several applications: word context-vector embeddings, document sentence similarity, and spam filtering.
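The core observation is easy to demonstrate: random unit vectors in high dimension are nearly orthogonal, so their plain sum acts as a memory, and a dot product with the sum answers the membership question probabilistically. The snippet below is a minimal demonstration with assumed dimensions and a ±1/√d construction; the threshold 0.5 is an illustrative choice.

```python
import numpy as np

d, n = 10_000, 100
rng = np.random.default_rng(0)

# n random unit-norm vectors: dot products between distinct vectors are ~ N(0, 1/d)
V = rng.choice([-1.0, 1.0], size=(n, d)) / np.sqrt(d)

m = V.sum(axis=0)                 # "memory" = plain sum of all stored vectors

# membership test: dot with the memory is ~1 for stored vectors, ~0 otherwise
score_stored = V[0] @ m
score_novel = (rng.choice([-1.0, 1.0], size=d) / np.sqrt(d)) @ m
is_member = score_stored > 0.5    # simple threshold decision
```

For a stored vector the score concentrates around 1 with noise of standard deviation roughly √(n−1)/d ≈ 0.1 here, so a fixed threshold separates members from non-members with high probability.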
50 - M. Andrecut 2018
Inductive inference is the process of extracting general rules from specific observations. This problem also arises in the analysis of biological networks, such as genetic regulatory networks, where the interactions are complex and the observations are incomplete. A typical task in these problems is to extract general interaction rules, as combinations of Boolean covariates, that explain a measured response variable. The inductive inference process can be considered as an incompletely specified Boolean function synthesis problem. This incompleteness of the problem will also generate spurious inferences, which are a serious threat to valid inductive inference rules. Using random Boolean data as a null model, here we attempt to measure the competition between valid and spurious inductive inference rules from a given data set. We formulate two greedy search algorithms, which synthesize a given Boolean response variable in a sparse disjunctive normal form and a sparse generalized algebraic normal form, respectively, of the variables from the observation data, and we evaluate their performance numerically.
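To make the synthesis task concrete, here is a toy greedy search for a sparse disjunctive normal form only (the algebraic-normal-form variant is omitted). It repeatedly picks the two-literal AND-term that covers the most uncovered positive rows without firing on any negative row. The restriction to two-literal terms, the function name, and the parameters are simplifying assumptions for the sketch, not the paper's algorithm.

```python
import itertools
import numpy as np

def greedy_dnf(X, y, max_terms=5):
    """Greedy cover of the positive rows of (X, y) by pure two-literal AND-terms.
    Each term is ((j1, v1), (j2, v2)), meaning x[j1] == v1 AND x[j2] == v2."""
    n, d = X.shape
    covered = np.zeros(n, dtype=bool)
    terms = []
    lits = [(j, v) for j in range(d) for v in (0, 1)]
    for _ in range(max_terms):
        best, best_gain = None, 0
        for a, b in itertools.combinations(lits, 2):
            mask = (X[:, a[0]] == a[1]) & (X[:, b[0]] == b[1])
            if np.any(mask & (y == 0)):
                continue                      # term must not hit a negative example
            gain = np.count_nonzero(mask & (y == 1) & ~covered)
            if gain > best_gain:
                best, best_gain = (a, b), gain
        if best is None:
            break                             # no pure term adds coverage
        terms.append(best)
        a, b = best
        covered |= (X[:, a[0]] == a[1]) & (X[:, b[0]] == b[1])
    return terms
```

On fully specified data this recovers a small DNF when one exists; on incomplete or random data, the same greedy procedure happily produces spurious terms, which is exactly the competition the abstract sets out to measure.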
59 - M. Andrecut 2017
Reservoir Computing (RC) refers to a Recurrent Neural Network (RNN) framework frequently used for sequence learning and time-series prediction. The RC system consists of a random fixed-weight RNN (the input-hidden reservoir layer) and a classifier (the hidden-output readout layer). Here we focus on the sequence learning problem, and we explore a different approach to RC. More specifically, we remove the non-linear neural activation function, and we consider an orthogonal reservoir acting on normalized states on the unit hypersphere. Surprisingly, our numerical results show that the system's memory capacity exceeds the dimensionality of the reservoir, which is the upper bound for the typical RC approach based on Echo State Networks (ESNs). We also show how the proposed system can be applied to symmetric cryptography problems, and we include a numerical implementation.
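The reservoir described above is simple to state in code: a fixed random orthogonal matrix drives a linear state update, and the state is renormalized to the unit hypersphere at every step. The sketch below shows only this forward pass (the readout and the cryptographic application are omitted); the input-coupling vector and all sizes are illustrative assumptions.

```python
import numpy as np

def orthogonal_reservoir(n, seed=0):
    """Random n x n orthogonal matrix via QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return Q

def run_reservoir(W, u, v_in):
    """Drive the linear reservoir with scalar sequence u through input vector v_in,
    renormalizing the state onto the unit hypersphere after every step."""
    x = np.zeros(W.shape[0])
    states = []
    for s in u:
        x = W @ x + v_in * s          # linear update: no activation function
        nrm = np.linalg.norm(x)
        if nrm > 0:
            x = x / nrm               # project the state back onto the sphere
        states.append(x.copy())
    return np.array(states)
```

Since the orthogonal map preserves norms, all the nonlinearity in the system comes from the renormalization step alone.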
108 - M. Andrecut 2017
We discuss the systemic risk implied by the interbank exposures reconstructed with the maximum entropy method. The maximum entropy method severely underestimates the risk of interbank contagion by assuming a fully connected network, while in reality the interbank network is sparse. Here, we formulate an algorithm for sparse network reconstruction, and we show numerically that it provides a more reliable estimation of the systemic risk.
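For context, the maximum-entropy benchmark that the abstract criticizes is easy to reproduce: given each bank's total interbank assets a and liabilities l, it spreads exposures as X_ij ∝ a_i l_j, then repairs the zeroed diagonal with iterative proportional fitting (RAS) so the margins still hold. This sketch shows that dense benchmark only, not the paper's sparse reconstruction algorithm; the iteration count is an arbitrary choice.

```python
import numpy as np

def max_entropy_matrix(a, l, iters=500):
    """Dense maximum-entropy reconstruction of an exposure matrix with zero diagonal.
    a: total interbank assets per bank (row sums), l: liabilities (column sums);
    a.sum() must equal l.sum()."""
    X = np.outer(a, l) / l.sum()      # max-entropy solution: fully connected
    np.fill_diagonal(X, 0.0)          # a bank does not lend to itself
    for _ in range(iters):            # RAS iterations restore the margins
        X *= (a / X.sum(1))[:, None]  # rescale rows to match assets
        X *= (l / X.sum(0))[None, :]  # rescale columns to match liabilities
    return X
```

Every off-diagonal entry of the result is strictly positive, i.e. the reconstructed network is fully connected, which is precisely why this benchmark understates contagion risk on sparse real networks.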
101 - M. Andrecut 2016
The statistical mechanics approach to wealth distribution is based on the conservative kinetic multi-agent model for money exchange, where the local interaction rule between the agents is analogous to the elastic particle scattering process. Here, we discuss the role of a class of conservative local operators, and we show that, depending on the values of their parameters, they can be used to generate all the relevant distributions. We also show numerically that in order to generate the power-law tail a heterogeneous risk-aversion model is required. By changing the parameters of these operators one can also fine-tune the resulting distributions in order to provide support for the emergence of a more egalitarian wealth distribution.
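A minimal version of the conservative kinetic exchange at the base of this literature is sketched below, with a single saving-propensity parameter lam in the style of the Chakraborti-Chakrabarti model; the paper's operator class is more general, so treat the parameterization here as an assumption. Setting lam = 0 recovers the Boltzmann-Gibbs exponential distribution.

```python
import numpy as np

def kinetic_exchange(n=1000, steps=200_000, lam=0.0, seed=0):
    """Conservative pairwise money exchange with saving propensity lam in [0, 1).
    Each trade redistributes the non-saved wealth of a random pair; total wealth
    is conserved exactly."""
    rng = np.random.default_rng(seed)
    w = np.ones(n)                             # everyone starts with one unit
    for _ in range(steps):
        i, j = rng.integers(n, size=2)
        if i == j:
            continue
        eps = rng.random()                     # random split of the traded pool
        pool = (1 - lam) * (w[i] + w[j])
        w[i] = lam * w[i] + eps * pool
        w[j] = lam * w[j] + (1 - eps) * pool
    return w
```

Making lam heterogeneous across agents (drawn from a distribution rather than shared) is the kind of modification the abstract refers to as necessary for a power-law tail.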
106 - M. Andrecut 2009
We consider the problem of sparse signal recovery from a small number of random projections (measurements). This is a well-known NP-hard combinatorial optimization problem. A frequently used approach is based on greedy iterative procedures, such as the Matching Pursuit (MP) algorithm. Here, we discuss a fast GPU implementation of the MP algorithm, based on the recently released NVIDIA CUDA API and CUBLAS library. The results show that the GPU version is substantially faster (up to 31 times) than the highly optimized CPU version based on CBLAS (GNU Scientific Library).
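For reference, the Matching Pursuit iteration that the GPU version accelerates is a short greedy loop: pick the dictionary atom most correlated with the current residual, add its coefficient, subtract its contribution, repeat. The NumPy sketch below assumes unit-norm columns in D; it is a CPU illustration of the algorithm, not the paper's CUDA/CUBLAS implementation.

```python
import numpy as np

def matching_pursuit(D, y, n_iter=50, tol=1e-6):
    """Greedy MP: D has unit-norm columns (atoms); returns sparse coefficients x
    such that D @ x approximates y."""
    x = np.zeros(D.shape[1])
    r = y.astype(float).copy()
    for _ in range(n_iter):
        if np.linalg.norm(r) < tol:
            break                     # residual small enough: stop early
        c = D.T @ r                   # correlations of all atoms with the residual
        k = np.argmax(np.abs(c))      # best-matching atom
        x[k] += c[k]                  # accumulate its coefficient
        r -= c[k] * D[:, k]           # remove its contribution from the residual
    return x
```

The dominant costs per iteration are the matrix-vector product D.T @ r and the argmax reduction, both of which map directly onto BLAS level-2 calls, which is what makes the algorithm a natural fit for CUBLAS.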
366 - M. Andrecut 2008
Principal component analysis (PCA) is a key statistical technique for multivariate data analysis. For large data sets the common approach to PCA computation is based on the standard NIPALS-PCA algorithm, which unfortunately suffers from loss of orthogonality, and therefore its applicability is usually limited to the estimation of the first few components. Here we present an algorithm based on Gram-Schmidt orthogonalization (called GS-PCA), which eliminates this shortcoming of NIPALS-PCA. Also, we discuss the GPU (Graphics Processing Unit) parallel implementation of both NIPALS-PCA and GS-PCA algorithms. The numerical results show that the GPU parallel implementations substantially outperform the optimized CPU implementations.
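The fix the abstract describes can be sketched compactly: run NIPALS-style power iterations, but re-orthogonalize each candidate loading vector against the previously extracted loadings (Gram-Schmidt) before normalizing, then deflate. This is a small NumPy sketch of the idea, not the paper's GPU implementation; the initialization and iteration count are illustrative choices.

```python
import numpy as np

def gs_pca(X, k, iters=100):
    """Extract k principal components of X (rows = samples) with NIPALS-style
    power iterations plus Gram-Schmidt re-orthogonalization of the loadings."""
    X = X - X.mean(0)                          # center the data
    n, d = X.shape
    P = np.zeros((d, k))                       # loadings
    T = np.zeros((n, k))                       # scores
    R = X.copy()                               # deflated residual matrix
    for j in range(k):
        t = R[:, np.argmax(R.var(0))].copy()   # start from the highest-variance column
        for _ in range(iters):
            p = R.T @ t
            p -= P[:, :j] @ (P[:, :j].T @ p)   # Gram-Schmidt vs. earlier loadings
            p /= np.linalg.norm(p)
            t = R @ p
        P[:, j], T[:, j] = p, t
        R = R - np.outer(t, p)                 # deflate the extracted component
    return T, P
```

Without the Gram-Schmidt step, rounding errors in the deflation slowly destroy the orthogonality of the loadings as more components are extracted, which is the NIPALS shortcoming the abstract points to.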