This paper studies operators represented on Fock spaces whose behavior on a given level depends only on its two neighboring levels. Our initial objective was to generalize (via a common framework) the results of arXiv:math/0702158, arXiv:0709.4334, arXiv:0812.0895, and arXiv:1003.2998, whose constructions exhibit this behavior. We extend a number of results from these papers to our more general setting. These include the quadratic relation satisfied by the free cumulant generating function (actually by a variant of it), the resolvent form of the generating function for the Wick polynomials, and classification results for the case when the vacuum state on the operator algebra is tracial. We are able to handle the generating functions in infinitely many variables by considering their matrix-valued versions.
A triplet comparison oracle on a set $S$ takes an object $x \in S$ and for any pair $\{y, z\} \subset S \setminus \{x\}$ declares which of $y$ and $z$ is more similar to $x$. Partitioned Local Depth (PaLD) supplies a principled non-parametric partitioning of $S$ under such triplet comparisons but needs $O(n^2 \log n)$ oracle calls and $O(n^3)$ post-processing steps. We introduce Partitioned Nearest Neighbors Local Depth (PaNNLD), a computationally tractable variant of PaLD leveraging the $K$-nearest neighbors digraph on $S$. PaNNLD needs only $O(n K \log n)$ oracle calls, by replacing an oracle call by a coin flip when neither $y$ nor $z$ is adjacent to $x$ in the undirected version of the $K$-nearest neighbors digraph. By averaging over randomizations, PaNNLD subsequently requires (at best) only $O(n K^2)$ post-processing steps. Concentration of measure shows that the probability of randomization-induced error $\delta$ in PaNNLD is no more than $2 e^{-\delta^2 K^2}$.
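The oracle-saving mechanism of PaNNLD can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the `neighbors` adjacency map, and the oracle interface are hypothetical stand-ins.

```python
import random

def triplet_compare(x, y, z, neighbors, oracle, rng=random.Random(0)):
    """Return whichever of y, z is judged more similar to x.

    The (expensive) triplet oracle is consulted only when y or z is
    adjacent to x in the undirected K-nearest-neighbors graph; otherwise
    the comparison is resolved by a fair coin flip, as in PaNNLD.
    """
    if y in neighbors[x] or z in neighbors[x]:
        return oracle(x, y, z)              # genuine triplet comparison
    return y if rng.random() < 0.5 else z   # coin flip: neither is a neighbor
```

Averaging over many such randomized runs, as the abstract notes, is what keeps the coin flips from distorting the final depths by more than $\delta$ with high probability.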
It is well known that for linear Gaussian channels, a nearest neighbor decoding rule, which seeks the minimum Euclidean distance between a codeword and the received channel output vector, is the maximum likelihood solution and hence capacity-achieving. Nearest neighbor decoding remains a convenient yet mismatched solution for general channels, and the key message of this paper is that the performance of nearest neighbor decoding can be improved by generalizing its decoding metric to incorporate channel-state-dependent output processing and codeword scaling. Using the generalized mutual information, which is a lower bound on the mismatched capacity under an independent and identically distributed codebook ensemble, as the performance measure, this paper establishes the optimal generalized nearest neighbor decoding rule under Gaussian channel inputs. Several suboptimal but reduced-complexity generalized nearest neighbor decoding rules are also derived and compared with existing solutions. The results are illustrated through several case studies for channels with nonlinear effects, and for fading channels with receiver channel state information or with pilot-assisted training.
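The shape of the generalized metric can be illustrated with a toy sketch. The function below is not the paper's optimal rule: the processing map `process` and the scalar scaling `g` are simplified, hypothetical stand-ins for the channel-state-dependent output processing and codeword scaling described above.

```python
import numpy as np

def generalized_nn_decode(y, codebook, g=1.0, process=None):
    """Return the index of the codeword x minimizing ||f(y) - g * x||^2.

    process (f) stands in for channel-state-dependent output processing
    and g for codeword scaling; with f = identity and g = 1 this reduces
    to classical minimum-Euclidean-distance (nearest neighbor) decoding.
    """
    fy = y if process is None else process(y)
    dists = np.sum((fy[None, :] - g * np.asarray(codebook)) ** 2, axis=1)
    return int(np.argmin(dists))
```

The point of the generalization is visible in the signature: the decoder still computes a nearest neighbor, but in coordinates adapted to the channel rather than in the raw output space.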
Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a foundation of many multimedia retrieval systems. Because it offers low response times, Product Quantization (PQ) is a popular solution. PQ compresses high-dimensional vectors into short codes using several sub-quantizers, which enables in-RAM storage of large databases. This allows fast answers to NN queries, without accessing the SSD or HDD. The key feature of PQ is that it can compute distances between short codes and high-dimensional vectors using cache-resident lookup tables. The efficiency of this technique, named Asymmetric Distance Computation (ADC), remains limited because it performs many cache accesses. In this paper, we introduce Quick ADC, a novel technique that achieves a 3 to 6 times speedup over ADC by exploiting Single Instruction Multiple Data (SIMD) units available in current CPUs. Efficiently exploiting SIMD requires algorithmic changes to the ADC procedure. Namely, Quick ADC relies on two key modifications of ADC: (i) the use of 4-bit sub-quantizers instead of the standard 8-bit sub-quantizers and (ii) the quantization of floating-point distances. This allows Quick ADC to exceed the performance of state-of-the-art systems, e.g., it achieves a Recall@100 of 0.94 in 3.4 ms on 1 billion SIFT descriptors (128-bit codes).
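The two modifications can be sketched in scalar NumPy: 4-bit sub-quantizers mean 16 centroids per sub-quantizer, and distance tables are quantized to `uint8` so that each 16-entry table would fit in a SIMD register. This sketch shows only the arithmetic; the actual Quick ADC evaluates the tables with in-register SIMD shuffles, and the function names are illustrative.

```python
import numpy as np

def build_quantized_luts(query, centroids, qmax=255):
    """Per-sub-quantizer lookup tables of query-to-centroid distances.

    centroids: (M, 16, d_sub) array -- M sub-quantizers with 16 centroids
    each (4-bit codes). Distances are quantized to uint8 so each 16-entry
    table could live in a SIMD register, as in Quick ADC.
    """
    M, K, d_sub = centroids.shape
    sub_queries = query.reshape(M, d_sub)
    luts = ((centroids - sub_queries[:, None, :]) ** 2).sum(axis=2)  # (M, 16)
    scale = luts.max() or 1.0
    return np.round(luts / scale * qmax).astype(np.uint8), scale

def adc_distances(codes, luts):
    """Approximate distances: sum one table entry per sub-quantizer.

    codes: (N, M) array of 4-bit centroid indices (values 0..15).
    """
    M = luts.shape[0]
    return luts[np.arange(M), codes].sum(axis=1)
```

The trade-off the abstract alludes to is visible here: smaller tables and coarser (quantized) distances cost some accuracy per sub-quantizer, which is why halving the code width matters for recall figures like the Recall@100 quoted above.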
The noncommutative Gurarij space $\mathbb{NG}$, initially defined by Oikhberg, is a canonical object in the theory of operator spaces. As the Fraïssé limit of the class of finite-dimensional nuclear operator spaces, it can be seen as the noncommutative analogue of the classical Gurarij Banach space. In this paper, we prove that the automorphism group of $\mathbb{NG}$ is extremely amenable, i.e. any of its actions on compact spaces has a fixed point. The proof relies on the Dual Ramsey Theorem and a version of the Kechris--Pestov--Todorcevic correspondence in the setting of operator spaces. Recent work of Davidson and Kennedy, building on previous work of Arveson, Effros, Farenick, Webster, and Winkler, among others, shows that nuclear operator systems can be seen as the noncommutative analogue of Choquet simplices. The analogue of the Poulsen simplex in this context is the matrix state space $\mathbb{NP}$ of the Fraïssé limit $A(\mathbb{NP})$ of the class of finite-dimensional nuclear operator systems. We show that the canonical action of the automorphism group of $\mathbb{NP}$ on the compact set $\mathbb{NP}_1$ of unital linear functionals on $A(\mathbb{NP})$ is minimal and factors onto any minimal action, thereby providing a description of the universal minimal flow of $\mathrm{Aut}(\mathbb{NP})$.
Though nearest neighbor Machine Translation ($k$NN-MT) \cite{khandelwal2020nearest} has been shown to introduce significant performance boosts over standard neural MT systems, it is prohibitively slow since it uses the entire reference corpus as the datastore for the nearest neighbor search. This means each step for each beam in the beam search has to search over the entire reference corpus. $k$NN-MT is thus two orders of magnitude slower than vanilla MT models, making it hard to apply in real-world applications, especially online services. In this work, we propose Fast $k$NN-MT to address this issue. Fast $k$NN-MT constructs a significantly smaller datastore for the nearest neighbor search: for each word in a source sentence, Fast $k$NN-MT first selects its nearest token-level neighbors, restricted to tokens identical to the query token. Then at each decoding step, in contrast to using the entire corpus as the datastore, the search space is limited to target tokens corresponding to the previously selected reference source tokens. This strategy avoids searching the whole datastore for nearest neighbors and drastically improves decoding efficiency. Without loss of performance, Fast $k$NN-MT is two orders of magnitude faster than $k$NN-MT, and is only two times slower than the standard NMT model. Fast $k$NN-MT enables the practical use of $k$NN-MT systems in real-world MT applications.\footnote{Code is available at \url{https://github.com/ShannonAI/fast-knn-nmt}.}
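The datastore-restriction idea can be sketched with a toy token-level datastore. This is a deliberately simplified stand-in for the real token-aligned datastore: the data layout and function name are hypothetical, and truncating to the first $k$ entries stands in for the actual nearest-neighbor selection among same-token entries.

```python
from collections import defaultdict

def build_restricted_search_space(source_sentence, datastore, k):
    """Token-level restriction sketched from the Fast kNN-MT idea.

    datastore: list of (source_token, target_record) pairs -- a toy
    stand-in for the token-aligned reference datastore. For each source
    token we keep at most k candidate records whose source token is
    identical to it, so decoding never searches the full datastore.
    """
    by_token = defaultdict(list)
    for src_tok, record in datastore:
        by_token[src_tok].append(record)
    return {tok: by_token[tok][:k] for tok in set(source_sentence)}
```

Because each decoding step then searches only the candidates attached to the current source sentence's tokens, the search cost is decoupled from the size of the full reference corpus, which is the source of the claimed speedup.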