Prediction in ungauged regions with sparse flow duration curves and input-selection ensemble modeling

Published by: Chaopeng Shen

Publication date: 2020

Research field: Informatics Engineering

Paper language: English

Authors: Dapeng Feng, Kathryn Lawson, Chaopeng Shen

Machine learning, Artificial intelligence

Abstract (English)

While long short-term memory (LSTM) models have demonstrated stellar performance in streamflow prediction, applying them to contiguous regions with no gauges, the prediction in ungauged regions (PUR) problem, carries major risks. However, softer data such as the flow duration curve (FDC) may already be available from nearby stations, or may become available. Here we demonstrate that sparse FDC data can be migrated and assimilated by an LSTM-based network via an encoder. A stringent region-based holdout test showed a median Kling-Gupta efficiency (KGE) of 0.62 for a US dataset, substantially higher than previous state-of-the-art global-scale ungauged basin tests. The baseline model without FDC was already competitive (median KGE 0.56), but integrating FDCs had substantial value. Because of inaccurate representation of inputs, the baseline models might sometimes produce catastrophic results. However, model generalizability was further meaningfully improved by compiling an ensemble of models with different input selections.
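As a rough illustration of the two quantities the abstract relies on, the sketch below computes the Kling-Gupta efficiency in its commonly used form (correlation, variability ratio, bias ratio) and samples a sparse flow duration curve from a flow record. The function names, the Weibull plotting position, and the toy data are assumptions made for illustration only; this is not the authors' code and says nothing about how their encoder consumes the FDC.

```python
import math
from statistics import mean, pstdev


def kge(sim, obs):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    # Population covariance between simulated and observed flow
    cov = mean((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs))
    r = cov / (sd_s * sd_o)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def sparse_fdc(flows, exceedance_probs):
    """Flow exceeded with probability p, for a few chosen values of p.

    Assumption: Weibull plotting position, i.e. the flow of rank i
    (1-based, largest first) has exceedance probability i / (n + 1).
    """
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    curve = []
    for p in exceedance_probs:
        idx = min(n - 1, max(0, round(p * (n + 1)) - 1))
        curve.append(ranked[idx])
    return curve


# Toy daily flows (hypothetical values, arbitrary units)
observed = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(sparse_fdc(observed, [0.1, 0.5, 0.9]))
print(kge([2 * q for q in observed], observed))
```

A perfect simulation gives a KGE of 1; a simulation that doubles every flow keeps the correlation at 1 but is penalised on both the variability and bias terms.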

Liyue Chen, Leye Wang (2021)

In the big data and AI era, context is widely exploited as extra information which makes it easier to learn a more complex pattern in machine learning systems. However, most of the existing related studies seldom take context into account. The difficulty lies in the unknown generalization ability of both context and its modeling techniques across different scenarios. To fill the above gaps, we conduct a large-scale analytical and empirical study on the spatiotemporal crowd flow prediction (STCFP) problem, a widely studied and active research topic. We make three main contributions: (i) we develop a new taxonomy of both context features and context modeling techniques based on extensive investigation of prevailing STCFP research; (ii) we conduct extensive experiments on seven datasets with hundreds of millions of records to quantitatively evaluate the generalization ability of both distinct context features and context modeling techniques; (iii) we summarize some guidelines for researchers to conveniently utilize context in diverse applications.

Machine learning, Artificial intelligence

Model-Attentive Ensemble Learning for Sequence Modeling

Victor D. Bourgin, Ioana Bica, Mihaela van der Schaar (2021)

Medical time-series datasets have unique characteristics that make prediction tasks challenging. Most notably, patient trajectories often contain longitudinal variations in their input-output relationships, generally referred to as temporal conditional shift. Designing sequence models capable of adapting to such time-varying distributions remains a prevailing problem. To address this we present Model-Attentive Ensemble learning for Sequence modeling (MAES). MAES is a mixture of time-series experts which leverages an attention-based gating mechanism to specialize the experts on different sequence dynamics and adaptively weight their predictions. We demonstrate that MAES significantly outperforms popular sequence models on datasets subject to temporal shift.
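The gating idea described above, weighting several experts' predictions adaptively, can be sketched generically as a softmax over per-expert scores followed by a weighted sum. This is only a minimal illustration of gated mixing, not the MAES architecture: the function names are hypothetical, and the gate scores here are fixed numbers, whereas in an attention-based model they would be produced by a learned component at each time step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_prediction(expert_preds, gate_scores):
    """Combine expert predictions using softmax-normalised gate weights.

    Returns the combined prediction and the weights that produced it.
    """
    weights = softmax(gate_scores)
    combined = sum(w * p for w, p in zip(weights, expert_preds))
    return combined, weights


# Two hypothetical experts predicting the next value of a series
print(gated_prediction([1.0, 3.0], [0.0, 0.0]))   # equal scores: plain average
print(gated_prediction([1.0, 3.0], [50.0, 0.0]))  # first expert dominates
```

With equal scores the combination reduces to a simple average; as one score grows, the output approaches that expert's prediction, which is how a gate can specialise experts on different sequence dynamics.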

Machine learning, Artificial intelligence

Uncertainty Characteristics Curves: A Systematic Assessment of Prediction Intervals

Jiri Navratil, Benjamin Elder, Matthew Arnold (2021)

Accurate quantification of model uncertainty has long been recognized as a fundamental requirement for trusted AI. In regression tasks, uncertainty is typically quantified using prediction intervals calibrated to a specific operating point, making evaluation and comparison across different studies difficult. Our work leverages (1) the concept of operating characteristics curves and (2) the notion of a gain over a simple reference to derive a novel operating-point-agnostic assessment methodology for prediction intervals. The paper describes the corresponding algorithm, provides a theoretical analysis, and demonstrates its utility in multiple scenarios. We argue that the proposed method addresses the current need for comprehensive assessment of prediction intervals and thus represents a valuable addition to the uncertainty quantification toolbox.

Machine learning, Artificial intelligence

DeepExpress: Heterogeneous and Coupled Sequence Modeling for Express Delivery Prediction

Siyuan Ren, Bin Guo, Longbing Cao (2021)

The prediction of express delivery sequence, i.e., modeling and estimating the volumes of daily incoming and outgoing parcels for delivery, is critical for online business, logistics, and positive customer experience, and specifically for resource allocation optimization and promotional activity arrangement. A precise estimate of consumer delivery requests has to involve sequential factors such as shopping behaviors, weather conditions, events, business campaigns, and their couplings. Besides, conventional sequence prediction assumes a stable sequence evolution, failing to address complex nonlinear sequences and various feature effects in the above multi-source data. Although deep networks and attention mechanisms demonstrate the potential of complex sequence modeling, extant networks ignore the heterogeneous and coupling situation between features and sequences, resulting in weak prediction accuracy. To address these issues, we propose DeepExpress, a deep-learning-based express delivery sequence prediction model, which extends the classic seq2seq framework to learn the complex coupling between sequence and features. DeepExpress leverages express delivery seq2seq learning, a carefully designed heterogeneous feature representation, and a novel joint training attention mechanism to adaptively map heterogeneous data and capture sequence-feature coupling for precise estimation. Experimental results on real-world data demonstrate that the proposed method outperforms both shallow and deep baseline models.

Machine learning, Artificial intelligence

Spatial-Temporal Self-Attention Network for Flow Prediction

Haoxing Lin, Weijia Jia, Yiping Sun (2019)

Flow prediction (e.g., crowd flow, traffic flow) with spatial-temporal features is increasingly investigated in the AI research field. It is very challenging due to the complicated spatial dependencies between different locations and dynamic temporal dependencies among different time intervals. Although measurements of both dependencies are employed, existing methods suffer from the following two problems. First, the temporal dependencies are measured either uniformly or with a bias against long-term dependencies, which overlooks the distinctive impacts of short-term and long-term temporal dependencies. Second, the existing methods capture spatial and temporal dependencies independently, which wrongly assumes that the correlations between these dependencies are weak and ignores the complicated mutual influences between them. To address these issues, we propose a Spatial-Temporal Self-Attention Network (ST-SAN). As the path length for attending to long-term dependencies is shorter in the self-attention mechanism, the vanishing of long-term temporal dependencies is prevented. In addition, since our model relies solely on attention mechanisms, the spatial and temporal dependencies can be measured simultaneously. Experimental results on real-world data demonstrate that, in comparison with state-of-the-art methods, our model reduces the root mean square error by 9% in inflow prediction and 4% in outflow prediction on Taxi-NYC data, which is very significant compared to the previous improvement.

Machine learning, Artificial intelligence