Sequential online subsampling for thinning experimental designs


Abstract in English

We consider a design problem where experimental conditions (design points $X_i$) are presented in the form of a sequence of i.i.d. random variables, generated with an unknown probability measure $\mu$, and only a given proportion $\alpha\in(0,1)$ can be selected. The objective is to select good candidates $X_i$ on the fly and maximize a concave function $\Phi$ of the corresponding information matrix. The optimal solution corresponds to the construction of an optimal bounded design measure $\xi_\alpha^*\leq \mu/\alpha$, with the difficulty that $\mu$ is unknown and $\xi_\alpha^*$ must be constructed online. The construction proposed relies on the definition of a threshold $\tau$ on the directional derivative of $\Phi$ at the current information matrix, the value of $\tau$ being fixed by a certain quantile of the distribution of this directional derivative. Combination with recursive quantile estimation yields a nonlinear two-time-scale stochastic approximation method. It can be applied to very long design sequences since only the current information matrix and estimated quantile need to be stored. Convergence to an optimum design is proved. Various illustrative examples are presented.
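The selection rule described above can be illustrated with a minimal sketch. It is not the authors' algorithm as published; every specific choice is an assumption made for illustration: a scalar design point $x$ with regression vector $f(x)=(1,x)^\top$, the D-optimality criterion $\Phi(M)=\log\det M$ (whose directional derivative towards $f(x)f(x)^\top$ is $f(x)^\top M^{-1}f(x)-2$), a quantile step size $k^{-q}$ with $q\in(1/2,1)$, and a short warm-up phase that accepts the first points so that the information matrix is invertible. The function name `thin_stream` and its parameters are hypothetical.

```python
import random


def thin_stream(xs, alpha, q=0.8, warmup=20):
    """Select roughly a fraction `alpha` of the scalar points in `xs`, online.

    Illustrative assumptions: f(x) = (1, x), Phi = log det (D-optimality).
    Only the 2x2 information matrix and the threshold tau are carried along.
    """
    m11 = m12 = m22 = 0.0   # entries of the symmetric information matrix M
    n_sel = 0               # number of points selected so far
    tau = 0.0               # running estimate of the (1 - alpha)-quantile
    selected = []
    for k, x in enumerate(xs, start=1):
        if n_sel < warmup:
            # Warm-up (assumption): accept early points so M becomes invertible.
            take = True
        else:
            det = m11 * m22 - m12 * m12
            # Directional derivative f(x)^T M^{-1} f(x) - 2 for the 2x2 case.
            z = (m22 - 2.0 * m12 * x + m11 * x * x) / det - 2.0
            take = z >= tau
            # Recursive quantile estimation: move tau so that P(z >= tau)
            # approaches alpha. The step k^{-q} decays more slowly than the
            # 1/n_sel step used for M, giving two time scales.
            tau += (k ** -q) * ((1.0 if take else 0.0) - alpha)
        if take:
            n_sel += 1
            g = 1.0 / n_sel
            # M becomes the running average of f(x) f(x)^T over selected points.
            m11 += g * (1.0 - m11)
            m12 += g * (x - m12)
            m22 += g * (x * x - m22)
            selected.append(x)
    return selected, (m11, m12, m22), tau


if __name__ == "__main__":
    random.seed(0)
    stream = [random.gauss(0.0, 1.0) for _ in range(20000)]
    kept, M, tau = thin_stream(stream, alpha=0.25)
    print(len(kept) / len(stream), M, tau)
```

In this toy setting the rule keeps points far from the running mean of the selected points, which is what D-optimality favours for a straight-line model. The `selected` list is returned only for inspection; the state needed between steps is the matrix entries, the counter and `tau`.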
