Noisy data clusters are hollow


Abstract in English

A new vision in multidimensional statistics is proposed, with an impact on several areas of application. In these applications, a set of noisy measurements characterizing the repeatable response of a process is known as a realization and can be seen as a single point in $\mathbb{R}^N$. The projections of this point on the $N$ axes correspond to the $N$ measurements. The contemporary vision of a diffuse cloud of realizations distributed in $\mathbb{R}^N$ is replaced by a cloud in the shape of a shell surrounding a topological manifold. This manifold corresponds to the process's stabilized-response domain observed without the measurement noise. The measurement noise, which accumulates over several dimensions, distances each realization from the manifold. The probability density function (PDF) of the realization-to-manifold distance creates the shell. By the central limit theorem, as the number of dimensions increases, the PDF tends toward the normal distribution $N(\mu, \sigma^2)$, where $\mu$ fixes the location of the shell center and $\sigma$ fixes the shell thickness. In this vision, the likelihood of a realization is a function of the realization-to-shell distance rather than the realization-to-manifold distance. The demonstration begins with the work of Claude Shannon, followed by the introduction of the shell manifold, and ends with practical applications to equipment monitoring.
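The shell effect described in the abstract can be illustrated numerically. The following is a minimal sketch under simplifying assumptions that are not taken from the paper: the manifold is reduced to a single point (the origin), the noise is independent and Gaussian with the same standard deviation on every axis, and the names `shell_stats` and `shell_log_likelihood` are hypothetical. It estimates $\mu$ and $\sigma$ of the realization-to-manifold distance for several values of $N$, then scores a realization by its distance to the shell center using a normal log-density.

```python
import math
import random
import statistics


def shell_stats(n_dims, n_realizations=1000, noise_sigma=1.0, seed=0):
    """Estimate (mu, sigma) of the realization-to-manifold distance.

    Assumption (not from the paper): the manifold is a single point at the
    origin and the noise is i.i.d. Gaussian on each of the n_dims axes.
    """
    rng = random.Random(seed)
    distances = []
    for _ in range(n_realizations):
        squared = sum(rng.gauss(0.0, noise_sigma) ** 2 for _ in range(n_dims))
        distances.append(math.sqrt(squared))
    return statistics.fmean(distances), statistics.stdev(distances)


def shell_log_likelihood(distance, mu, sigma):
    """Log-density of N(mu, sigma^2) evaluated at a realization-to-manifold
    distance. The score depends only on (distance - mu), that is, on the
    distance between the realization and the shell center."""
    z = (distance - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        mu, sigma = shell_stats(n)
        print(f"N={n:5d}  mu={mu:7.3f}  sigma={sigma:5.3f}  "
              f"sigma/mu={sigma / mu:5.3f}")
```

Under these assumptions, the shell radius grows roughly as the square root of $N$ while its thickness stays nearly constant, so the ratio $\sigma / \mu$ shrinks and the cloud looks increasingly hollow. A realization lying exactly on the manifold (distance 0) then scores far lower than one lying on the shell.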
