One way to inject knowledge into clinically impactful forecasting is to use data assimilation, a nonlinear regression that projects data onto a mechanistic physiologic model rather than onto a generic set of functions such as a neural network. Such regressions have the advantage of remaining useful with particularly sparse, non-stationary clinical data. However, physiological models are often nonlinear and can have many parameters, leading to potential problems with parameter identifiability, that is, the ability to find a unique set of parameters that minimizes forecasting error. Identifiability problems can be reduced or eliminated by estimating fewer parameters, but doing so also reduces the flexibility of the model and hence increases forecasting error. We propose a method, the parameter Houlihan, that combines traditional machine learning techniques with data assimilation to select the set of model parameters that minimizes forecasting error while reducing identifiability problems. The method worked well: for our cohort, the data assimilation-based glucose forecasts and estimates obtained with the Houlihan-selected parameter sets generally had lower forecasting errors than those obtained with other parameter selection methods, such as by-hand parameter selection. Nevertheless, the forecast with the lowest forecast error does not always accurately represent physiology, although further advancements of the algorithm provide a path for improving physiologic fidelity as well. Our hope is that this methodology represents a first step toward combining machine learning with data assimilation and provides a lower-threshold entry point for using data assimilation with clinical data by helping select the right parameters to estimate.
Classical optimization techniques often formulate the feasibility of a problem as set, equality, or inequality constraints. However, explicitly designing these constraints is challenging for complex real-world applications, and overly strict constraints may even lead to intractable optimization problems. Moreover, it remains hard to incorporate data-dependent information into conventional numerical iterations. To partially address these limitations, and inspired by the leader-follower gaming perspective, this work first introduces a bilevel-type formulation to jointly investigate the feasibility and optimality of nonconvex and nonsmooth optimization problems. We then develop an algorithmic framework that couples forward-backward proximal computations to optimize the resulting bilevel leader-follower model. We prove its convergence and estimate the convergence rate. Furthermore, we develop a learning-based extension in which an unrolling strategy incorporates data-dependent network architectures into the iterations. We prove that, with some mild checking conditions, all of the original convergence results are preserved for this learnable extension. As a nontrivial byproduct, we demonstrate how to apply this ensemble-like methodology to different low-level vision tasks. Extensive experiments verify the theoretical results and show the advantages of our method over existing state-of-the-art approaches.
The aim of this paper is to investigate the use of an entropic projection method for the iterative regularization of linear ill-posed problems. We derive a closed-form solution for the iterates and analyze their convergence behaviour both for reconstructing general nonnegative unknowns and for recovering probability distributions. Moreover, we discuss several variants of the algorithm and relations to other methods in the literature. The effectiveness of the approach is studied numerically in several examples.
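To make the idea of an entropy-based iteration for a linear problem concrete, the following is a minimal sketch of the classical exponentiated-gradient (entropic mirror descent) update on the probability simplex. It is a swapped-in illustration and not the closed-form iterate derived in the paper. The function name `entropic_mirror_descent`, the step-size rule, and the random test problem are all assumptions made for this example.

```python
import numpy as np

def entropic_mirror_descent(A, y, n_iter=200):
    """Illustrative multiplicative update for min 0.5*||A x - y||^2 over the simplex.

    Hypothetical helper: a standard exponentiated-gradient scheme, used here
    only as an example of an entropy-based iteration.
    """
    n = A.shape[1]
    x = np.full(n, 1.0 / n)                    # start from the uniform distribution
    # Step size 1/L, where L bounds the gradient's Lipschitz constant
    # with respect to the l1 norm (largest absolute entry of A^T A).
    tau = 1.0 / np.max(np.abs(A.T @ A))
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)               # gradient of the least-squares misfit
        x = x * np.exp(-tau * grad)            # multiplicative (entropic) step
        x /= x.sum()                           # renormalize onto the simplex
    return x

# Small synthetic problem (assumed data, for illustration only).
rng = np.random.default_rng(0)
A = rng.random((15, 8))
x_true = rng.random(8)
x_true /= x_true.sum()
y = A @ x_true

x_start = np.full(8, 1.0 / 8)
x_hat = entropic_mirror_descent(A, y)
res_start = np.linalg.norm(A @ x_start - y)
res_end = np.linalg.norm(A @ x_hat - y)
```

In an iterative regularization setting with noisy data, the number of iterations `n_iter` would play the role of the regularization parameter, so the loop would be stopped early rather than run to convergence.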
In multiple scientific and technological applications we face the problem of having low-dimensional data to be justified by a linear model defined in a high-dimensional parameter space. The difference in dimensionality makes the problem ill-defined: the model is consistent with the data for many values of its parameters. The objective is to find the probability distribution of parameter values consistent with the data, a problem that can be cast as the exploration of a high-dimensional convex polytope. In this work we introduce a novel algorithm to solve this problem efficiently. It provides results that are statistically indistinguishable from currently used numerical techniques while its running time scales linearly with the system size. We show that the algorithm performs robustly in many abstract and practical applications. As working examples we simulate the effects of restricting reaction fluxes on the space of feasible phenotypes of a genome-scale E. coli metabolic network and infer the traffic flow between origin and destination nodes in a real communication network.
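For orientation, exploring a convex polytope is commonly done with Monte Carlo samplers such as hit-and-run. The sketch below shows that baseline for a polytope written as {x : A x <= b}. It is not the algorithm introduced in this work, and the function name `hit_and_run`, the inequality form, and the unit-cube example are assumptions made for illustration.

```python
import numpy as np

def hit_and_run(A, b, x0, n_samples, rng=None):
    """Sample from the bounded polytope {x : A x <= b}, starting at interior point x0.

    Hypothetical helper illustrating the standard hit-and-run random walk.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_samples, x.size))
    for k in range(n_samples):
        d = rng.normal(size=x.size)
        d /= np.linalg.norm(d)                 # uniformly random direction
        slack = b - A @ x                      # distance to each facet (all >= 0)
        rate = A @ d                           # how fast each constraint tightens along d
        pos = rate > 1e-12
        neg = rate < -1e-12
        t_max = np.min(slack[pos] / rate[pos]) # farthest feasible step forward
        t_min = np.max(slack[neg] / rate[neg]) # farthest feasible step backward
        x = x + rng.uniform(t_min, t_max) * d  # uniform point on the feasible chord
        samples[k] = x
    return samples

# Example: the unit cube in 3D, written as x <= 1 and -x <= 0.
A_cube = np.vstack([np.eye(3), -np.eye(3)])
b_cube = np.concatenate([np.ones(3), np.zeros(3)])
samples = hit_and_run(A_cube, b_cube, np.full(3, 0.5), 500, np.random.default_rng(0))
```

Each step costs one matrix-vector product per constraint set, and many steps are typically needed before the samples decorrelate in high dimension, which is the kind of cost a faster method would aim to avoid.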
In this paper, we clarify the relations between the existing sets of regularity conditions for convergence rates of nonparametric indirect regression (NPIR) and nonparametric instrumental variables (NPIV) regression models. We establish minimax risk lower bounds in mean integrated squared error loss for the NPIR and the NPIV models under two basic regularity conditions that allow for both mildly ill-posed and severely ill-posed cases. We show that both a simple projection estimator for the NPIR model, and a sieve minimum distance estimator for the NPIV model, can achieve the minimax risk lower bounds, and are rate-optimal uniformly over a large class of structure functions, allowing for mildly ill-posed and severely ill-posed cases.
The analysis of linear ill-posed problems is often carried out in function spaces using tools from functional analysis. However, the numerical solution of these problems is typically computed by first discretizing the problem and then applying tools from (finite-dimensional) linear algebra. The present paper explores the feasibility of applying the Chebfun package to solve ill-posed problems. This approach allows a user to work with functions instead of matrices. The solution process is therefore much closer to the analysis of ill-posed problems than standard linear algebra-based solution methods.