DeepExpress: Heterogeneous and Coupled Sequence Modeling for Express Delivery Prediction

76 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Bin Guo

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Siyuan Ren - Bin Guo - Longbing Cao

التعلم الآلي الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The prediction of express delivery sequence, i.e., modeling and estimating the volumes of daily incoming and outgoing parcels for delivery, is critical for online business, logistics, and positive customer experience, and specifically for resource allocation optimization and promotional activity arrangement. A precise estimate of consumer delivery requests has to involve sequential factors such as shopping behaviors, weather conditions, events, business campaigns, and their couplings. Besides, conventional sequence prediction assumes a stable sequence evolution, failing to address complex nonlinear sequences and various feature effects in the above multi-source data. Although deep networks and attention mechanisms demonstrate the potential of complex sequence modeling, extant networks ignore the heterogeneous and coupling situation between features and sequences, resulting in weak prediction accuracy. To address these issues, we propose DeepExpress - a deep-learning based express delivery sequence prediction model, which extends the classic seq2seq framework to learning complex coupling between sequence and features. DeepExpress leverages an express delivery seq2seq learning, a carefully-designed heterogeneous feature representation, and a novel joint training attention mechanism to adaptively map heterogeneous data, and capture sequence-feature coupling for precise estimation. Experimental results on real-world data demonstrate that the proposed method outperforms both shallow and deep baseline models.

قيم البحث

76 - Haobing Liu , Yanmin Zhu , Tianzi Zang 2021

Prediction tasks about students have practical significance for both student and college. Making multiple predictions about students is an important part of a smart campus. For instance, predicting whether a student will fail to graduate can alert th e student affairs office to take predictive measures to help the student improve his/her academic performance. With the development of information technology in colleges, we can collect digital footprints which encode heterogeneous behaviors continuously. In this paper, we focus on modeling heterogeneous behaviors and making multiple predictions together, since some prediction tasks are related and learning the model for a specific task may have the data sparsity problem. To this end, we propose a variant of LSTM and a soft-attention mechanism. The proposed LSTM is able to learn the student profile-aware representation from heterogeneous behavior sequences. The proposed soft-attention mechanism can dynamically learn different importance degrees of different days for every student. In this way, heterogeneous behaviors can be well modeled. In order to model interactions among multiple prediction tasks, we propose a co-attention mechanism based unit. With the help of the stacked units, we can explicitly control the knowledge transfer among multiple tasks. We design three motivating behavior prediction tasks based on a real-world dataset collected from a college. Qualitative and quantitative experiments on the three prediction tasks have demonstrated the effectiveness of our model.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

Trellis Networks for Sequence Modeling

118 - Shaojie Bai , J. Zico Kolter , Vladlen Koltun 2018

We present trellis networks, a new architecture for sequence modeling. On the one hand, a trellis network is a temporal convolutional network with special structure, characterized by weight tying across depth and direct injection of the input into de ep layers. On the other hand, we show that truncated recurrent networks are equivalent to trellis networks with special sparsity structure in their weight matrices. Thus trellis networks with general weight matrices generalize truncated recurrent networks. We leverage these connections to design high-performing trellis networks that absorb structural and algorithmic elements from both recurrent and convolutional models. Experiments demonstrate that trellis networks outperform the current state of the art methods on a variety of challenging benchmarks, including word-level language modeling and character-level language modeling tasks, and stress tests designed to evaluate long-term memory retention. The code is available at https://github.com/locuslab/trellisnet .

التعلم الآلي الذكاء الاصطناعي الحساب واللغة

Foundations of Sequence-to-Sequence Modeling for Time Series

83 - Vitaly Kuznetsov , Zelda Mariet 2018

The availability of large amounts of time series data, paired with the performance of deep-learning algorithms on a broad class of problems, has recently led to significant interest in the use of sequence-to-sequence models for time series forecastin g. We provide the first theoretical analysis of this time series forecasting framework. We include a comparison of sequence-to-sequence modeling to classical time series models, and as such our theory can serve as a quantitative guide for practitioners choosing between different modeling methodologies.

التعلم الآلي الذكاء الاصطناعي التعلم الالي

Model-Attentive Ensemble Learning for Sequence Modeling

76 - Victor D. Bourgin , Ioana Bica , Mihaela van der Schaar 2021

Medical time-series datasets have unique characteristics that make prediction tasks challenging. Most notably, patient trajectories often contain longitudinal variations in their input-output relationships, generally referred to as temporal condition al shift. Designing sequence models capable of adapting to such time-varying distributions remains a prevailing problem. To address this we present Model-Attentive Ensemble learning for Sequence modeling (MAES). MAES is a mixture of time-series experts which leverages an attention-based gating mechanism to specialize the experts on different sequence dynamics and adaptively weight their predictions. We demonstrate that MAES significantly out-performs popular sequence models on datasets subject to temporal shift.

التعلم الآلي الذكاء الاصطناعي

Modeling Heterogeneous Relations across Multiple Modes for Potential Crowd Flow Prediction

89 - Qiang Zhou , Jingjing Gu , Xinjiang Lu 2021

Potential crowd flow prediction for new planned transportation sites is a fundamental task for urban planners and administrators. Intuitively, the potential crowd flow of the new coming site can be implied by exploring the nearby sites. However, the transportation modes of nearby sites (e.g. bus stations, bicycle stations) might be different from the target site (e.g. subway station), which results in severe data scarcity issues. To this end, we propose a data driven approach, named MOHER, to predict the potential crowd flow in a certain mode for a new planned site. Specifically, we first identify the neighbor regions of the target site by examining the geographical proximity as well as the urban function similarity. Then, to aggregate these heterogeneous relations, we devise a cross-mode relational GCN, a novel relation-specific transformation model, which can learn not only the correlations but also the differences between different transportation modes. Afterward, we design an aggregator for inductive potential flow representation. Finally, an LTSM module is used for sequential flow prediction. Extensive experiments on real-world data sets demonstrate the superiority of the MOHER framework compared with the state-of-the-art algorithms.

التعلم الآلي