
On the Memory Mechanism of Tensor-Power Recurrent Models

Posted by: Chao Li
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





The tensor-power (TP) recurrent model is a family of non-linear dynamical systems whose recurrence relation consists of a p-fold (a.k.a. degree-p) tensor product. Although this model frequently appears in advanced recurrent neural networks (RNNs), to date there has been little study of its memory property, a critical characteristic in sequence tasks. In this work, we conduct a thorough investigation of the memory mechanism of TP recurrent models. Theoretically, we prove that a large degree p is an essential condition for achieving the long memory effect, yet it leads to unstable dynamical behavior. Empirically, we tackle this issue by extending the degree p from the discrete to a differentiable domain, so that it can be efficiently learned from a variety of datasets. Taken together, the new model is expected to benefit from the long memory effect in a stable manner. We experimentally show that the proposed model achieves competitive performance compared to various advanced RNNs in both single-cell and seq2seq architectures.
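As a rough illustration of relaxing the degree p to a continuous, learnable quantity, the sketch below caricatures the p-fold tensor interaction with an elementwise signed power whose exponent is a trainable parameter. This is only a minimal sketch under our own assumptions, not the paper's actual parameterization; the class name and hyperparameters are hypothetical.

```python
import math
import torch
import torch.nn as nn

class TPCellSketch(nn.Module):
    """Illustrative recurrent cell with a differentiable degree p.
    The degree-p interaction is caricatured elementwise as
    sign(h) * |h|^p, with p a free, positive, learnable scalar."""
    def __init__(self, input_size, hidden_size, p_init=2.0):
        super().__init__()
        self.W = nn.Linear(hidden_size, hidden_size, bias=False)
        self.U = nn.Linear(input_size, hidden_size)
        # Inverse softplus so that softplus(p_raw) == p_init at start.
        self.p_raw = nn.Parameter(torch.tensor(math.log(math.exp(p_init) - 1.0)))

    def forward(self, x, h):
        p = torch.nn.functional.softplus(self.p_raw)  # continuous degree p > 0
        powered = torch.sign(h) * (h.abs() + 1e-6).pow(p)
        return torch.tanh(self.W(powered) + self.U(x))
```

Because p enters the forward pass as an ordinary tensor, gradient descent can adjust it jointly with the weights, which is the spirit of making the degree "efficiently learnable".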




Read also

This paper introduces recurrent equilibrium networks (RENs), a new class of nonlinear dynamical models for applications in machine learning, system identification and control. The new model class has built-in guarantees of stability and robustness: all models in the class are contracting - a strong form of nonlinear stability - and models can satisfy prescribed incremental integral quadratic constraints (IQCs), including Lipschitz bounds and incremental passivity. RENs are otherwise very flexible: they can represent all stable linear systems, all previously known sets of contracting recurrent neural networks and echo state networks, all deep feedforward neural networks, and all stable Wiener/Hammerstein models. RENs are parameterized directly by a vector in R^N, i.e. stability and robustness are ensured without parameter constraints, which simplifies learning since generic methods for unconstrained optimization can be used. The performance and robustness of the new model set are evaluated on benchmark nonlinear system identification problems, and the paper also presents applications in data-driven nonlinear observer design and control with stability guarantees.
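The distinguishing design choice is direct parameterization: every unconstrained parameter value maps to a stable model. The toy sketch below shows that principle for a plain linear recurrence by rescaling a free matrix so its spectral norm stays below 1; it is an assumed minimal analogue, not the REN construction itself, and all names are illustrative.

```python
import torch
import torch.nn as nn

class ContractingLinearSketch(nn.Module):
    """Toy direct parameterization: for ANY value of the free matrix W,
    the state map is a contraction, so generic unconstrained optimizers
    can be used. Not the actual REN construction."""
    def __init__(self, n_state, n_input):
        super().__init__()
        self.W = nn.Parameter(torch.randn(n_state, n_state))   # free parameter
        self.B = nn.Parameter(0.1 * torch.randn(n_state, n_input))

    def state_matrix(self):
        # ||W / (1 + ||W||_2)||_2 = ||W||_2 / (1 + ||W||_2) < 1 always holds.
        sigma = torch.linalg.matrix_norm(self.W, ord=2)
        return self.W / (1.0 + sigma)

    def forward(self, h, u):
        return h @ self.state_matrix().T + u @ self.B.T
```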
In electricity markets, locational marginal price (LMP) forecasting is particularly important for market participants in making reasonable bidding strategies, managing potential trading risks, and supporting efficient system planning and operation. Unlike existing methods that only consider LMPs' temporal features, this paper tailors a spectral graph convolutional network (GCN) to greatly improve the accuracy of short-term LMP forecasting. A three-branch network structure is then designed to match the composite structure of LMPs. Such a network can extract the spatial-temporal features of LMPs and provide fast, high-quality predictions for all nodes simultaneously. An attention mechanism is also implemented to assign varying importance weights between different nodes and time slots. Case studies based on the IEEE-118 test system and real-world data from PJM validate that the proposed model outperforms existing forecasting models in accuracy, and maintains robust performance by avoiding extreme errors.
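For concreteness, a generic first-order spectral graph convolution (in the Kipf-and-Welling sense) is sketched below as a stand-in for one branch of such a network; the paper's three-branch structure and attention weights are not reproduced, and the helper names are hypothetical.

```python
import torch
import torch.nn as nn

def normalized_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + torch.eye(adj.size(0))
    d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)

class SpectralGCNLayer(nn.Module):
    """One generic spectral graph-convolution layer: aggregate each node's
    neighborhood through the normalized adjacency, then apply a shared
    linear map, producing outputs for all nodes simultaneously."""
    def __init__(self, in_feats, out_feats):
        super().__init__()
        self.lin = nn.Linear(in_feats, out_feats)

    def forward(self, x, norm_adj):  # x: (num_nodes, in_feats)
        return torch.relu(self.lin(norm_adj @ x))
```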
Xinan Wang, Yishen Wang, Di Shi (2020)
Power transfer limits or transfer capability (TC) directly relate to system operation and control as well as electricity markets. As a consequence, their assessment has to comply with static constraints, such as line thermal limits, and dynamic constraints, such as transient stability limits, voltage stability limits and small-signal stability limits. Since load dynamics have substantial impacts on power system transient stability, load models are one critical factor that affects power transfer limits. Currently, multiple load models have been proposed and adopted in industry and academia, including the ZIP model, the ZIP plus induction motor composite model (ZIP + IM) and the WECC composite load model (WECC CLM). Each of them has its unique advantages, but their impacts on power transfer limits are not yet adequately addressed. One existing challenge is fitting high-order nonlinear models such as WECC CLM. In this study, we adopt a Double Deep Q-learning Network (DDQN) agent as a general load modeling tool in the dynamic assessment procedure and fit the same transient field measurements into different load models. A comprehensive evaluation is then conducted to quantify the load models' impacts on the power transfer limits. The simulation environment is the IEEE 39-bus system constructed in the Transient Security Assessment Tool (TSAT).
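The core DDQN mechanic, decoupling action selection (online network) from action evaluation (target network), fits in a few lines. The sketch below is the generic Double DQN target computation, not the paper's specific load-model fitting loop; the function and argument names are illustrative.

```python
import torch

def ddqn_targets(online_q, target_q, rewards, next_states, dones, gamma=0.99):
    """Double DQN target: y = r + gamma * Q_target(s', argmax_a Q_online(s', a)).
    Selecting the action with one network and evaluating it with the other
    reduces the overestimation bias of vanilla Q-learning."""
    with torch.no_grad():
        next_actions = online_q(next_states).argmax(dim=1, keepdim=True)
        next_values = target_q(next_states).gather(1, next_actions).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_values
```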
Yoav Levine, Or Sharir, Alon Ziv (2017)
A key attribute that drives the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks involving sequential data is their ability to model intricate long-term temporal dependencies. However, a well-established measure of RNNs' long-term memory capacity is lacking, and thus formal understanding of the effect of depth on their ability to correlate data throughout time is limited. Specifically, existing depth-efficiency results on convolutional networks do not suffice to account for the success of deep RNNs on data of varying lengths. To address this, we introduce a measure of the network's ability to support information flow across time, referred to as the Start-End separation rank, which reflects the distance of the function realized by the recurrent network from modeling no dependency between the beginning and end of the input sequence. We prove that deep recurrent networks support Start-End separation ranks which are combinatorially higher than those supported by their shallow counterparts. Thus, we establish that depth brings forth an overwhelming advantage in the ability of recurrent networks to model long-term dependencies, and provide an exemplar of quantifying this key attribute which may be readily extended to other RNN architectures of interest, e.g. variants of LSTM networks. We obtain our results by considering a class of recurrent networks referred to as Recurrent Arithmetic Circuits, which merge the hidden state with the input via the Multiplicative Integration operation, and empirically demonstrate the discussed phenomena on common RNNs. Finally, we employ the tool of quantum Tensor Networks to gain additional graphic insight into the complexity brought forth by depth in recurrent networks.
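The Multiplicative Integration operation mentioned at the end, merging the input and hidden projections by an elementwise product rather than a sum, admits a very small sketch; the cell below follows that operation in its basic form, with illustrative names.

```python
import torch
import torch.nn as nn

class MICellSketch(nn.Module):
    """Minimal Multiplicative Integration cell: tanh(Wx * Uh + b).
    Replacing the usual additive merge Wx + Uh with an elementwise
    product is what gives Recurrent Arithmetic Circuits their
    polynomial, tensor-like interactions across time steps."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.W = nn.Linear(input_size, hidden_size, bias=False)
        self.U = nn.Linear(hidden_size, hidden_size, bias=False)
        self.b = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x, h):
        return torch.tanh(self.W(x) * self.U(h) + self.b)
```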
This paper presents competitive algorithms for a novel class of online optimization problems with memory. We consider a setting where the learner seeks to minimize the sum of a hitting cost and a switching cost that depends on the previous $p$ decisions. This setting generalizes Smoothed Online Convex Optimization. The proposed approach, Optimistic Regularized Online Balanced Descent, achieves a constant, dimension-free competitive ratio. Further, we show a connection between online optimization with memory and online control with adversarial disturbances. This connection, in turn, leads to a new constant-competitive policy for a rich class of online control problems.
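To make the problem structure concrete, the sketch below runs one round with a scalar quadratic hitting cost and a switching cost tied to the mean of the last p decisions. It illustrates only the memory-p trade-off, not the Optimistic Regularized Online Balanced Descent algorithm itself; all names and the cost forms are assumptions for illustration.

```python
def online_round(theta_t, history, lam=1.0, p=3):
    """One round of online optimization with memory (illustrative only).
    Hitting cost revealed at round t: (x - theta_t)^2.
    Switching cost: lam * (x - mean of the previous p decisions)^2.
    The closed-form minimizer balances tracking against smoothness."""
    past = history[-p:] if history else [0.0]
    anchor = sum(past) / len(past)
    x_t = (theta_t + lam * anchor) / (1.0 + lam)
    history.append(x_t)
    return x_t

# Example: track a drifting target while penalizing abrupt moves.
history = []
for theta in [0.0, 1.0, 1.0, 2.0]:
    online_round(theta, history)
```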
