Dependence of exponents on text length versus finite-size scaling for word-frequency distributions

Published by Alvaro Corral
Publication date: 2018
Research field: Physics
Research language: English





Some authors have recently argued that a finite-size scaling law for the text-length dependence of word-frequency distributions cannot be conceptually valid. Here we give solid quantitative evidence for the validity of such a scaling law, using both careful statistical tests and analytical arguments based on the generalized central-limit theorem applied to the moments of the distribution (obtaining a novel derivation of Heaps' law as a by-product). We also find that the picture of word-frequency distributions with power-law exponents that decrease with text length [Yan and Minnhagen, Physica A 444, 828 (2016)] does not stand up to rigorous statistical analysis. Instead, we show that the distributions are perfectly described by power-law tails with stable exponents, whose values are close to 2, in agreement with the classical Zipf's law. Some misconceptions about scaling are also clarified.
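As a rough illustration of what "a power-law tail with exponent close to 2" means in practice, the sketch below estimates a tail exponent by maximum likelihood on synthetic data. It is not the statistical procedure of the paper: it assumes a continuous power law above a known cutoff (word counts are in fact discrete), and the names `hill_exponent`, `sample_power_law`, and `n_min` are hypothetical, introduced only for this example.

```python
import math
import random


def hill_exponent(frequencies, n_min):
    """Maximum-likelihood exponent of a continuous power-law tail.

    Assumes f(n) proportional to n**(-gamma) for n >= n_min, which gives
    gamma_hat = 1 + k / sum(log(n_i / n_min)) over the k tail values.
    """
    tail = [n for n in frequencies if n >= n_min]
    log_sum = sum(math.log(n / n_min) for n in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one value strictly above n_min")
    return 1.0 + len(tail) / log_sum


def sample_power_law(gamma, n_min, size, rng):
    """Draw synthetic values from a continuous power law by inverse transform."""
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and the power is finite.
    return [n_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0))
            for _ in range(size)]


# Synthetic data with a true exponent of 2 (hypothetical, not real word counts).
rng = random.Random(0)
synthetic = sample_power_law(2.0, 1.0, 20000, rng)
estimate = hill_exponent(synthetic, 1.0)
print(f"estimated exponent: {estimate:.3f}")
```

To test for stability of the exponent, one would repeat the estimate on texts of different lengths and compare the values within their statistical uncertainty. A real analysis would also need to choose the cutoff from the data and check goodness of fit.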


Read also

We investigate the use of matrix product states (MPS) to approximate ground states of critical quantum spin chains with periodic boundary conditions (PBC). We identify two regimes in the (N,D) parameter plane, where N is the size of the spin chain and D is the dimension of the MPS matrices. In the first regime MPS can be used to perform finite size scaling (FSS). In the complementary regime the MPS simulations show instead the clear signature of finite entanglement scaling (FES). In the thermodynamic limit (or large N limit), only MPS in the FSS regime maintain a finite overlap with the exact ground state. This observation has implications on how to correctly perform FSS with MPS, as well as on the performance of recent MPS algorithms for systems with PBC. It also gives clear evidence that critical models can actually be simulated very well with MPS by using the right scaling relations; in the appendix, we give an alternative derivation of the result of Pollmann et al. [Phys. Rev. Lett. 102, 255701 (2009)] relating the bond dimension of the MPS to an effective correlation length.
The in situ measurement of the particle size distribution (PSD) of a suspension of particles presents huge challenges. Various effects from the process could introduce noise to the data from which the PSD is estimated. This in turn could lead to the occurrence of artificial peaks in the estimated PSD. Limitations in the models used in the PSD estimation could also lead to the occurrence of these artificial peaks. This could pose a significant challenge to in situ monitoring of particulate processes, as there will be no independent estimate of the PSD to allow a discrimination of the artificial peaks to be carried out. Here, we present an algorithm which is capable of discriminating between artificial and true peaks in PSD estimates based on fusion of multiple data streams. In this case, chord length distribution and laser diffraction data have been used. The data fusion is done by means of multi-objective optimisation using the weighted sum approach. The algorithm is applied to two different particle suspensions. The estimated PSDs from the algorithm are compared with offline estimates of PSD from the Malvern Mastersizer and Morphologi G3. The results show that the algorithm is capable of eliminating an artificial peak in a PSD estimate when this artificial peak is sufficiently displaced from the true peak. However, when the artificial peak is too close to the true peak, it is only suppressed but not completely eliminated.
Anna Carbone, Ken Kiyono (2016)
The Detrending Moving Average (DMA) algorithm has been widely used in its several variants for characterizing long-range correlations of random signals and sets (one-dimensional sequences or high-dimensional arrays) either over time or space. In this paper, mainly based on analytical arguments, the scaling performances of the centered DMA, including higher-order ones, are investigated by means of a continuous time approximation and a frequency response approach. Our results are also confirmed by numerical tests. The study is carried out for higher-order DMA operating with moving average polynomials of different degree. In particular, detrending power degree, frequency response, asymptotic scaling, upper limit of the detectable scaling exponent and finite scale range behavior will be discussed.
We present an unbiased and robust analysis method for power-law blinking statistics in the photoluminescence of single nano-emitters, allowing us to extract both the bright- and dark-state power-law exponents from the emitters' intensity autocorrelation functions. As opposed to the widely-used threshold method, our technique therefore does not require discriminating the emission levels of bright and dark states in the experimental intensity timetraces. We rely on the simultaneous recording of 450 emission timetraces of single CdSe/CdS core/shell quantum dots at a frame rate of 250 Hz with single photon sensitivity. Under these conditions, our approach can determine ON and OFF power-law exponents with a precision of 3% from a comparison to numerical simulations, even for shot-noise-dominated emission signals with an average intensity below 1 photon per frame and per quantum dot. These capabilities pave the way for the unbiased, threshold-free determination of blinking power-law exponents at the microsecond timescale.
We explore the phase space spanned by the temperature and the chemical potential for 4-flavor lattice QCD using the Wilson-clover quark action. In order to determine the order of the phase transition, we apply finite size scaling analyses to gluonic and quark observables including plaquette, Polyakov loop and quark number density, and examine their susceptibility, skewness, kurtosis and Challa-Landau-Binder cumulant. Simulations were carried out on lattices of a temporal size fixed at $N_{\text{t}}=4$ and spatial sizes chosen from $6^3$ up to $10^3$. Configurations were generated using the phase reweighting approach, while the value of the phase of the quark determinant was carefully monitored. The $\mu$-parameter reweighting technique is employed to precisely locate the point of the phase transition. Among various approximation schemes for calculating the ratio of quark determinants needed for $\mu$-reweighting, we found the Taylor expansion of the logarithm of the quark determinant to be the most reliable. Our finite-size analyses show that the transition is first order at $(\beta, \kappa, \mu/T)=(1.58, 0.1385, 0.584\pm 0.008)$ where $(m_\pi/m_\rho, T/m_\rho)=(0.822, 0.154)$. It weakens considerably at $(\beta, \kappa, \mu/T)=(1.60, 0.1371, 0.821\pm 0.008)$ where $(m_\pi/m_\rho, T/m_\rho)=(0.839, 0.150)$, and a crossover rather than a first order phase transition cannot be ruled out.