Do you want to publish a course? Click here

The time-adaptive statistical testing for random number generators

86   0   0.0 ( 0 )
 Added by Boris Ryabko
 Publication date 2020
and research's language is English
 Authors Boris Ryabko




Ask ChatGPT about the research

The problem of constructing effective statistical tests for random number generators (RNG) is considered. Currently, there are hundreds of RNG statistical tests that are often combined into so-called batteries, each containing from a dozen to more than one hundred tests. When a battery test is used, it is applied to a sequence generated by the RNG, and the calculation time is determined by the length of the sequence and the number of tests. Generally speaking, the longer the sequence, the smaller deviations from randomness can be found by a specific test. So, when a battery is applied, on the one hand, the better tests are in the battery, the more chances to reject a bad RNG. On the other hand, the larger the battery, the less time can be spent on each test and, therefore, the shorter the test sequence. In turn, this reduces the ability to find small deviations from randomness. To reduce this trade-off, we propose an adaptive way to use batteries (and other sets) of tests, which requires less time but, in a certain sense, preserves the power of the original battery. We call this method time-adaptive battery of tests.



rate research

Read More

58 - Boris Ryabko 2019
The problem of constructing effective statistical tests for random number generators (RNG) is considered. Currently, statistical tests for RNGs are a mandatory part of cryptographic information protection systems, but their effectiveness is mainly estimated based on experiments with various RNGs. We find an asymptotic estimate for the p-value of an optimal test in the case where the alternative hypothesis is a known stationary ergodic source, and then describe a family of tests each of which has the same asymptotic estimate of the p-value for any (unknown) stationary ergodic source.
Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test if two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as realizations of Variable Length Markov Chains (VLMC) and we use the context trees as a signature of each protein family. Our approach is based on a Kolmogorov--Smirnov-type goodness-of-fit test proposed by Balding et al. [Limit theorems for sequences of random trees (2008), DOI: 10.1007/s11749-008-0092-z]. The test statistic is a supremum over the space of trees of a function of the two samples; its computation grows, in principle, exponentially fast with the maximal number of nodes of the potential trees. We show how to transform this problem into a max-flow over a related graph which can be solved using a Ford--Fulkerson algorithm in polynomial time on that number. We apply the test to 10 randomly chosen protein domain families from the seed of Pfam-A database (high quality, manually curated families). The test shows that the distributions of context trees coming from different families are significantly different. We emphasize that this is a novel mathematical approach to validate the automatic clustering of sequences in any context. We also study the performance of the test via simulations on Galton--Watson related processes.
124 - Boris Ryabko 2021
Currently, statistical tests for random number generators (RNGs) are widely used in practice, and some of them are even included in information security standards. But despite the popularity of RNGs, consistent tests are known only for stationary ergodic deviations of randomness (a test is consistent if it detects any deviations from a given class when the sample size goes to $ infty $). However, the model of a stationary ergodic source is too narrow for some RNGs, in particular, for generators based on physical effects. In this article, we propose computable consistent tests for some classes of deviations more general than stationary ergodic and describe some general properties of statistical tests. The proposed approach and the resulting test are based on the ideas and methods of information theory.
We generalize standard credal set models for imprecise probabilities to include higher order credal sets -- confidences about confidences. In doing so, we specify how an agents higher order confidences (credal sets) update upon observing an event. Our model begins to address standard issues with imprecise probability models, like Dilation and Belief Inertia. We conjecture that when higher order credal sets contain all possible probability functions, then in the limiting case the highest order confidences converge to form a uniform distribution over the first order credal set, where we define uniformity in terms of the statistical distance metric (total variation distance). Finite simulation supports the conjecture. We further suggest that this convergence presents the total-variation-uniform distribution as a natural, privileged prior for statistical hypothesis testing.
121 - C.T.J. Dodson 2009
The information geometry of the 2-manifold of gamma probability density functions provides a framework in which pseudorandom number generators may be evaluated using a neighbourhood of the curve of exponential density functions. The process is illustrated using the pseudorandom number generator in Mathematica. This methodology may be useful to add to the current family of test procedures in real applications to finite sampling data.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا