
Extracting falsifiable predictions from sloppy models

Posted by: Ryan Gutenkunst
Publication date: 2007
Research field: Biology
Paper language: English





Successful predictions are among the most compelling validations of any model. Extracting falsifiable predictions from nonlinear multiparameter models is complicated by the fact that such models are commonly sloppy, possessing sensitivities to different parameter combinations that range over many decades. Here we discuss how sloppiness affects the sorts of data that best constrain model predictions, makes linear uncertainty approximations dangerous, and introduces computational difficulties in Monte-Carlo uncertainty analysis. We also present a useful test problem and suggest refinements to the standards by which models are communicated.
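To make the points above concrete, here is a small numerical sketch in Python (not the authors' code; the two-rate exponential model, noise level, and sampler settings are hypothetical). It computes the eigenvalues of a least-squares cost Hessian for a toy sloppy model, whose wide spread is what makes linearized uncertainty estimates dangerous, and then propagates a Metropolis Monte-Carlo parameter ensemble into a prediction at an unobserved time point.

# Minimal sketch (hypothetical toy model, not the paper's code): Hessian
# eigenvalue spread of a sloppy model and Monte-Carlo prediction uncertainty.
import numpy as np

rng = np.random.default_rng(0)

# Toy "sloppy" model: a sum of two exponentials with nearly redundant rates.
t_data = np.linspace(0.0, 3.0, 20)
theta_true = np.array([1.0, 1.2])          # hypothetical decay rates

def model(theta, t):
    return 0.5 * np.exp(-theta[0] * t) + 0.5 * np.exp(-theta[1] * t)

sigma = 0.01
y_data = model(theta_true, t_data) + sigma * rng.normal(size=t_data.size)

def cost(theta):
    r = (model(theta, t_data) - y_data) / sigma
    return 0.5 * np.dot(r, r)

# Finite-difference Hessian of the cost: in sloppy models its eigenvalues are
# spread over many decades, which makes linearized uncertainties fragile.
def hessian(f, x, h=1e-4):
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * h, np.eye(n)[j] * h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

print("Hessian eigenvalues:", np.linalg.eigvalsh(hessian(cost, theta_true)))

# Metropolis Monte-Carlo ensemble of parameters consistent with the data,
# propagated into a prediction at an unobserved time point.
theta, c, samples = theta_true.copy(), cost(theta_true), []
for _ in range(20000):
    prop = theta + 0.05 * rng.normal(size=2)
    c_prop = cost(prop)
    if rng.random() < np.exp(min(0.0, c - c_prop)):
        theta, c = prop, c_prop
    samples.append(theta.copy())
samples = np.array(samples[5000:])         # discard burn-in

pred = np.array([model(s, np.array([5.0]))[0] for s in samples])
print("prediction at t=5: mean %.4f +/- %.4f" % (pred.mean(), pred.std()))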


Read also

We present a learning-based method for extracting whistles of toothed whales (Odontoceti) in hydrophone recordings. Our method represents audio signals as time-frequency spectrograms and decomposes each spectrogram into a set of time-frequency patches. A deep neural network learns archetypical patterns (e.g., crossings, frequency-modulated sweeps) from the spectrogram patches and predicts time-frequency peaks that are associated with whistles. We also developed a comprehensive method to synthesize training samples from background environments and train the network with minimal human annotation effort. We applied the proposed learn-from-synthesis method to a subset of the public Detection, Classification, Localization, and Density Estimation (DCLDE) 2011 workshop data to extract whistle confidence maps, which we then processed with an existing contour extractor to produce whistle annotations. The F1-score of our best synthesis method was 0.158 greater than that of our baseline whistle extraction algorithm (~25% improvement) when applied to common dolphin (Delphinus spp.) and bottlenose dolphin (Tursiops truncatus) whistles.
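As a rough illustration of the spectrogram-patch representation described in the abstract above (this is not the DCLDE pipeline; the synthetic signal, sample rate, patch size, and stride are assumed), the following sketch computes a time-frequency spectrogram with SciPy and slices it into overlapping patches of the kind a network could score for whistle-like peaks.

# Minimal sketch (assumed parameters, not the paper's pipeline): represent audio
# as a spectrogram and decompose it into overlapping time-frequency patches.
import numpy as np
from scipy.signal import spectrogram

fs = 48_000                                   # assumed sample rate (Hz)
t = np.arange(0.0, 2.0, 1.0 / fs)
# Synthetic frequency-modulated "whistle" sweep buried in background noise.
audio = np.sin(2 * np.pi * (8_000 * t + 2_000 * t ** 2)) + 0.5 * np.random.randn(t.size)

freqs, times, Sxx = spectrogram(audio, fs=fs, nperseg=1024, noverlap=512)
log_spec = np.log1p(Sxx)                      # compress the dynamic range

def patches(spec, size=32, stride=16):
    """Slice a (freq, time) spectrogram into overlapping square patches."""
    out = []
    for i in range(0, spec.shape[0] - size + 1, stride):
        for j in range(0, spec.shape[1] - size + 1, stride):
            out.append(spec[i:i + size, j:j + size])
    return np.stack(out)

batch = patches(log_spec)
print(batch.shape)                            # (num_patches, 32, 32)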
The availability of a large number of assembled genomes opens the way to studying the evolution of syntenic characters within a phylogenetic context. The DeCo algorithm, recently introduced by Bérard et al., allows the computation of parsimonious evolutionary scenarios for gene adjacencies from pairs of reconciled gene trees. Following the approach pioneered by Sturmfels and Pachter, we describe how to modify the DeCo dynamic programming algorithm to identify classes of cost schemes that generate similar parsimonious evolutionary scenarios for gene adjacencies, as well as the robustness, to changes in the cost scheme of evolutionary events, of the presence or absence of specific ancestral gene adjacencies. We apply our method to six thousand mammalian gene families and show that computing the robustness to changes in cost schemes provides new and interesting insights into the evolution of gene adjacencies and the DeCo model.
Understanding international trade is a fundamental problem in economics -- one standard approach is via what is commonly called the gravity equation, which predicts the total amount of trade $F_{ij}$ between two countries $i$ and $j$ as $$ F_{ij} = G \frac{M_i M_j}{D_{ij}}, $$ where $G$ is a constant, $M_i, M_j$ denote the economic mass (often simply the gross domestic product) and $D_{ij}$ the distance between countries $i$ and $j$, where distance is a complex notion that includes geographical, historical, linguistic and sociological components. We take the \textit{inverse} route and ask to what extent it is possible to reconstruct meaningful information about countries simply from knowing the bilateral trade volumes $F_{ij}$: indeed, we show that a remarkable amount of geopolitical information can be extracted. The main tool is a spectral decomposition of the graph Laplacian, used to perform nonlinear dimensionality reduction. This may have further applications in economic analysis and provides a data-based approach to trade distance.
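The spectral decomposition mentioned above can be sketched in a few lines. The snippet below uses synthetic data rather than real trade figures: it symmetrizes a hypothetical bilateral trade matrix $F$, forms the graph Laplacian $L = D - W$, and takes the eigenvectors of the smallest nonzero eigenvalues as low-dimensional coordinates for the countries.

# Minimal sketch (synthetic data, not the paper's analysis): spectral embedding
# of countries from bilateral trade volumes via the graph Laplacian L = D - W.
import numpy as np

rng = np.random.default_rng(1)
n = 12                                   # hypothetical number of countries
F = rng.random((n, n)) * 100.0           # stand-in for bilateral trade volumes
np.fill_diagonal(F, 0.0)

W = 0.5 * (F + F.T)                      # symmetrize: undirected trade weights
D = np.diag(W.sum(axis=1))               # degree matrix
L = D - W                                # (unnormalized) graph Laplacian

eigvals, eigvecs = np.linalg.eigh(L)     # eigenvalues in ascending order
# Skip the trivial constant eigenvector (eigenvalue ~0); the next two
# eigenvectors give 2-D coordinates that reflect a data-based "trade distance".
embedding = eigvecs[:, 1:3]
print(embedding.shape)                   # (n, 2)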
Preterm infants are at high risk of developing brain injury in the first days of life as a consequence of poor cerebral oxygen delivery. Near-infrared spectroscopy (NIRS) is an established technology developed to monitor regional tissue oxygenation. Detailed waveform analysis of the cerebral NIRS signal could improve the clinical utility of this method in accurately predicting brain injury. Frequent transient cerebral oxygen desaturations are commonly observed in extremely preterm infants, yet their clinical significance remains unclear. The aim of this study was to examine and compare the performance of two distinct approaches in isolating and extracting transient deflections within NIRS signals. We optimized three different simultaneous low-pass filtering and total variation denoising (LPF_TVD) methods and compared their performance with a recently proposed method that uses singular-spectrum analysis and the discrete cosine transform (SSA_DCT). Parameters for the LPF_TVD methods were optimized over a grid search using synthetic NIRS-like signals. The SSA_DCT method was modified with a post-processing procedure to increase sparsity in the extracted components. Our analysis, using a synthetic NIRS-like dataset, showed that an LPF_TVD method outperformed the modified SSA_DCT method: the median mean-squared error of 0.97 (95% CI: 0.86 to 1.07) for the LPF_TVD method was lower than that of the modified SSA_DCT method, 1.48 (95% CI: 1.33 to 1.63), P<0.001. The dual low-pass filtering and total variation denoising methods are considerably more computationally efficient, by 3 to 4 orders of magnitude, than the SSA_DCT method. More research is needed to examine the efficacy of these methods in extracting oxygen desaturations in real NIRS signals.
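To illustrate the kind of processing the LPF_TVD methods perform (a minimal sketch with an assumed cut-off frequency and regularization weight, not the study's optimized implementation), the snippet below low-pass filters a synthetic NIRS-like trace with a Butterworth filter and then applies a simple projected-gradient solver for 1-D total variation denoising to isolate a transient, step-like deflection.

# Minimal sketch (assumed parameters): low-pass filtering plus 1-D total
# variation denoising to isolate a transient deflection in a NIRS-like signal.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 1.0                                       # assumed sampling rate (Hz)
t = np.arange(0, 600, 1.0 / fs)
nirs = 70 + 2 * np.sin(2 * np.pi * 0.01 * t) + np.random.randn(t.size)
nirs[300:340] -= 8.0                           # synthetic transient desaturation

# Step 1: low-pass filter to suppress high-frequency noise (cut-off assumed).
b, a = butter(4, 0.05, btype="low", fs=fs)
lowpassed = filtfilt(b, a, nirs)

# Step 2: total variation denoising, min_x 0.5*||x - y||^2 + lam*||diff(x)||_1,
# solved by projected gradient on the dual (step 0.25 <= 1 / ||D D^T||).
def tv_denoise(y, lam, n_iter=1000):
    z = np.zeros(y.size - 1)                   # one dual variable per difference
    for _ in range(n_iter):
        x = y - np.concatenate(([-z[0]], -np.diff(z), [z[-1]]))   # x = y - D^T z
        z = np.clip(z + 0.25 * np.diff(x), -lam, lam)
    return y - np.concatenate(([-z[0]], -np.diff(z), [z[-1]]))

denoised = tv_denoise(lowpassed, lam=2.0)      # piecewise-constant trend + deflection
print(denoised[280:360].min() - np.median(denoised))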
This paper introduces TwitterPaul, a system designed to make use of social media data to help predict game outcomes for the 2010 FIFA World Cup tournament. To this end, we extracted over 538K mentions of football games from a large sample of tweets posted during the World Cup and classified them into different types with a precision of up to 88%. The different mentions were aggregated in order to make predictions about the outcomes of the actual games. We attempt to learn which Twitter users are accurate predictors and explore several techniques to exploit this information to make more accurate predictions. We compared our results to strong baselines and against the betting line (prediction market) and found that the quality of extractions is more important than the quantity, suggesting that high-precision methods working on a medium-sized dataset are preferable to low-precision methods that use a larger amount of data. Finally, by aggregating some classes of predictions, the system's performance is close to that of the betting line. Furthermore, we believe that this domain-independent framework can help predict other sports, elections, product release dates and other future events that people talk about in social media.
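One simple way to exploit per-user reliability when aggregating predictions, in the spirit described above (an illustrative sketch only, not TwitterPaul's actual scheme; the user names and accuracies are made up), is to weight each user's vote by an estimate of their historical accuracy.

# Minimal sketch (hypothetical data, not TwitterPaul's method): aggregate
# per-user game predictions, weighting each vote by estimated user accuracy.
from collections import defaultdict

# Hypothetical extracted predictions: (user, predicted winner) for one match.
predictions = [("u1", "ESP"), ("u2", "NED"), ("u3", "ESP"), ("u4", "ESP")]

# Hypothetical per-user accuracy estimated from earlier matches.
user_accuracy = {"u1": 0.8, "u2": 0.55, "u3": 0.6, "u4": 0.5}

def aggregate(predictions, user_accuracy, default_acc=0.5):
    """Return the outcome with the largest accuracy-weighted vote mass."""
    scores = defaultdict(float)
    for user, outcome in predictions:
        scores[outcome] += user_accuracy.get(user, default_acc)
    return max(scores, key=scores.get)

print(aggregate(predictions, user_accuracy))   # -> "ESP"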