
We introduce a deep, generative autoencoder capable of learning hierarchies of distributed representations from data. Successive deep stochastic hidden layers are equipped with autoregressive connections, which enable the model to be sampled from quickly and exactly via ancestral sampling. We derive an efficient approximate parameter estimation method based on the minimum description length (MDL) principle, which can be seen as maximising a variational lower bound on the log-likelihood, with a feedforward neural network implementing approximate inference. We demonstrate state-of-the-art generative performance on a number of classic data sets: several UCI data sets, MNIST and Atari 2600 games.
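The two sampling properties mentioned above (autoregressive connections within each stochastic layer, and fast exact ancestral sampling) can be illustrated with a small numerical sketch. The layer sizes, weights, and logistic units below are hypothetical placeholders, not the architecture or trained parameters from the paper; the sketch only shows how each binary unit is sampled conditioned on the layer above and on the units already drawn within its own layer.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_layer(rng, W_top, W_ar, b, h_above):
    # Ancestral sampling of one binary stochastic layer, unit by unit.
    # W_top: top-down weights from the layer above, shape (n, m).
    # W_ar: autoregressive weights within the layer, strictly lower triangular, shape (n, n).
    n = b.shape[0]
    h = np.zeros(n)
    for i in range(n):
        logit = b[i] + W_top[i] @ h_above + W_ar[i, :i] @ h[:i]
        h[i] = float(rng.random() < sigmoid(logit))
    return h

rng = np.random.default_rng(0)
sizes = [8, 16, 32]              # hypothetical layer widths, from the top layer down
h = np.zeros(0)                  # the top layer has no layer above it
for n in sizes:
    m = h.shape[0]
    W_top = 0.1 * rng.standard_normal((n, m))
    W_ar = np.tril(0.1 * rng.standard_normal((n, n)), k=-1)
    h = sample_layer(rng, W_top, W_ar, np.zeros(n), h)
print(h)                         # one exact sample of the bottom (visible) layer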
63 - Karol Gregor, Yann LeCun 2011
We give an algorithm that learns a representation of data through compression. The algorithm 1) predicts bits sequentially from those previously seen and 2) has a structure and a number of computations similar to an autoencoder. The likelihood under the model can be calculated exactly, and arithmetic coding can be used directly for compression. When trained on digits, the algorithm learns filters similar to those of restricted Boltzmann machines and denoising autoencoders. Independent samples can be drawn from the model by a single sweep through the pixels. The algorithm has good compression performance when compared to other methods that work under random ordering of pixels.
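The link between sequential bit prediction and compression can be made concrete: if every bit is assigned a predicted probability given the bits already seen, an arithmetic coder can compress the sequence to essentially the sum of the negative log2-probabilities. The sketch below uses an untrained logistic predictor with hypothetical weights purely to show how that code length is computed; it is not the paper's model.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def code_length_bits(x, W, b):
    # x: binary vector (e.g. flattened image pixels in a fixed order).
    # Each bit is predicted from the bits already seen; the total negative
    # log2-probability is the length an arithmetic coder can approach.
    total = 0.0
    for i in range(len(x)):
        p = sigmoid(b[i] + W[i, :i] @ x[:i])    # P(x_i = 1 | x_<i)
        total += -np.log2(p if x[i] == 1 else 1.0 - p)
    return total

rng = np.random.default_rng(0)
x = (rng.random(64) < 0.5).astype(float)        # toy 8x8 binary "image"
W = np.tril(0.05 * rng.standard_normal((64, 64)), k=-1)
b = np.zeros(64)
print(code_length_bits(x, W, b))                # close to 64 bits for this untrained predictor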
153 - Karol Gregor, Yann LeCun 2011
We propose a simple and efficient algorithm for learning sparse invariant representations from unlabeled data with fast inference. When trained on short movie sequences, the learned features are selective to a range of orientations and spatial frequencies, but robust to a wide range of positions, similar to complex cells in the primary visual cortex. We also present a hierarchical version of the algorithm and give guarantees of fast convergence under certain conditions.
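One generic way to obtain complex-cell-like invariance in a sparse model (not necessarily the algorithm of this paper) is to apply the sparsity penalty to pooled groups of code units rather than to individual units, so that activity can move within a group without being penalized. The dictionary, codes, and group structure below are arbitrary placeholders used only to evaluate such a group-sparse objective.

import numpy as np

def group_sparse_loss(X, D, Z, groups, lam=0.1):
    # X: data (d, n), D: dictionary (d, k), Z: codes (k, n).
    # Each group of code units is pooled like a complex cell; sparsity is
    # imposed on the pooled magnitude, so the pooled response is invariant
    # to which unit inside the group carries the activity.
    recon = 0.5 * np.sum((X - D @ Z) ** 2)
    pooled = sum(np.sum(np.sqrt(np.sum(Z[g] ** 2, axis=0) + 1e-8)) for g in groups)
    return recon + lam * pooled

rng = np.random.default_rng(0)
d, k, n = 16, 8, 5
X = rng.standard_normal((d, n))
D = rng.standard_normal((d, k))
Z = rng.standard_normal((k, n))
groups = [np.arange(0, 4), np.arange(4, 8)]     # two hypothetical pools of code units
print(group_sparse_loss(X, D, Z, groups))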
76 - Karol Gregor, Yann LeCun 2010
We introduce a new neural architecture and an unsupervised algorithm for learning invariant representations from temporal sequences of images. The system uses two groups of complex cells whose outputs are combined multiplicatively: one that represents the content of the image, constrained to be constant over several consecutive frames, and one that represents the precise location of features, which is allowed to vary over time but constrained to be sparse. The architecture uses an encoder to extract features and a decoder to reconstruct the input from the features. The method was applied to patches extracted from consecutive movie frames, yielding orientation- and frequency-selective units analogous to complex cells in V1. An extension of the method is proposed to train a network composed of units with local receptive fields spread over a large image of arbitrary size. A layer of complex cells, subject to sparsity constraints, pools feature units over overlapping local neighborhoods, which causes the feature units to organize themselves into pinwheel patterns of orientation-selective receptive fields, similar to those observed in the mammalian visual cortex. A feed-forward encoder efficiently computes the feature representation of full images.
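The multiplicative combination of a content ("what") code, held constant over a few frames, with a sparse location ("where") code that changes from frame to frame can be sketched numerically as follows. The code sizes, the crude sparsity mask, and the three-way decoder tensor are illustrative placeholders rather than the architecture described above.

import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 64                   # frames in a short sequence, pixels per frame (8x8 patch)
kc, kl = 6, 10                 # "what" (content) units and "where" (location) units

content = rng.random(kc)                                       # one content code shared by all T frames
location = rng.random((T, kl)) * (rng.random((T, kl)) < 0.2)   # sparse, varies per frame

# Toy decoder: each (content, location) pair gates a pixel basis vector, so the
# reconstruction is driven by the product of the two codes (multiplicative combination).
D = rng.standard_normal((kc, kl, d))
frames = np.einsum('c,tl,cld->td', content, location, D)
print(frames.shape)                                            # (T, d): one reconstructed frame per time step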
We apply deep belief networks of restricted Boltzmann machines to bags of words of SIFT features obtained from the 13 Scenes, 15 Scenes and Caltech 256 databases and study their behavior and performance experimentally. We find that the final performance in the supervised phase is reached much faster if the system is pre-trained. Pre-training the system on a larger dataset while keeping the supervised dataset fixed improves the performance (for the 13 Scenes case). After the unsupervised pre-training, neurons arise that form approximate explicit representations for several categories (meaning they are mostly active for that category). The last three facts suggest that unsupervised training really discovers structure in these data. Pre-training can be done on a completely different dataset (we use the Corel dataset), and we find that the supervised phase performs just as well (on the 15 Scenes dataset). This leads us to conjecture that one can pre-train the system once (e.g. in a factory) and subsequently apply it to many supervised problems, which then learn much faster. The best performance is obtained with a single-hidden-layer system, suggesting that the histogram of SIFT features does not have much high-level structure. The overall performance is almost equal to, but slightly worse than, that of the support vector machine with spatial pyramid matching.
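A minimal version of this pipeline, unsupervised RBM pre-training on bag-of-visual-words histograms followed by a supervised classifier, can be sketched with scikit-learn. The random histograms, the 13 dummy classes, and the layer size are placeholders; the sketch uses a single RBM plus logistic regression (in line with the observation above that one hidden layer worked best) rather than the exact deep belief network and fine-tuning procedure studied in the paper.

import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
# Stand-ins for bag-of-visual-words histograms (e.g. of SIFT descriptors),
# scaled to [0, 1] so they can be treated as Bernoulli visible units.
X = rng.random((200, 500))
y = rng.integers(0, 13, size=200)               # 13 hypothetical scene categories

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=256, learning_rate=0.01, n_iter=10, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)     # unsupervised RBM feature learning, then supervised training on its outputs
print(model.score(X, y))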
We study effects of nonmagnetic impurities in a spin-1/2 frustrated triangular antiferromagnet with the aim of understanding the observed broadening of $^{13}$C NMR lines in the organic spin liquid material $\kappa$-(ET)$_2$Cu$_2$(CN)$_3$. For high temperatures down to $J/3$, we calculate the local susceptibility near a nonmagnetic impurity and near a grain boundary for the nearest-neighbor Heisenberg model using a high-temperature series expansion. We find that the local susceptibility decays to the uniform one within a few lattice spacings, so a low density of impurities cannot explain the line broadening that is already present in the experiments at elevated temperatures. At low temperatures, we assume a gapless spin liquid with a Fermi surface of spinons. We calculate the local susceptibility at the mean-field level and also go beyond mean field by Gutzwiller projection. The zero-temperature local susceptibility decays as a power law and oscillates at $2 k_F$. As in the high-temperature analysis, we find that a low density of impurities is not able to explain the observed broadening of the lines. We are thus led to conclude that there is more disorder in the system. We find that a large density of point-like disorder gives broadening that is consistent with the experiment down to about 5 K, but that below this temperature an additional mechanism is likely needed.
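The low-temperature statement above, that the impurity-induced part of the local susceptibility decays as a power law and oscillates at $2 k_F$, corresponds schematically to a Friedel-like form; the exponent $\alpha$ and phase $\phi$ are left generic here since their values depend on the spinon state and on the Gutzwiller projection, and are not specified in this summary.

\[
\chi_{\mathrm{loc}}(r) \;=\; \chi_{\mathrm{uniform}} + \delta\chi(r),
\qquad
\delta\chi(r) \;\propto\; \frac{\cos\!\left(2 k_F\, r + \phi\right)}{r^{\alpha}}
\quad (T \to 0),
\]

where $r$ is the distance from the nonmagnetic impurity. A spread of $\chi_{\mathrm{loc}}$ across lattice sites translates into a spread of $^{13}$C Knight shifts, which is what broadens the NMR line.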