No Arabic abstract
As a tool for capturing irregular temporal dependencies (rather than resorting to binning temporal observations to construct time series), Hawkes processes with exponential decay have seen widespread adoption across many application domains, such as predicting the occurrence time of the next earthquake or stock market spike. However, practical applications of Hawkes processes face a noteworthy challenge: There is substantial and often unquantified variance in decay parameter estimations, especially in the case of a small number of observations or when the dynamics behind the observed data suddenly change. We empirically study the cause of these practical challenges and we develop an approach to surface and thereby mitigate them. In particular, our inspections of the Hawkes process likelihood function uncover the properties of the uncertainty when fitting the decay parameter. We thus propose to explicitly capture this uncertainty within a Bayesian framework. With a series of experiments with synthetic and real-world data from domains such as classical earthquake modeling or the manifestation of collective emotions on Twitter, we demonstrate that our proposed approach helps to quantify uncertainty and thereby to understand and fit Hawkes processes in practice.
There has recently been a concerted effort to derive mechanisms in vision and machine learning systems to offer uncertainty estimates of the predictions they make. Clearly, there are enormous benefits to a system that is not only accurate but also has a sense for when it is not sure. Existing proposals center around Bayesian interpretations of modern deep architectures -- these are effective but can often be computationally demanding. We show how classical ideas in the literature on exponential families on probabilistic networks provide an excellent starting point to derive uncertainty estimates in Gated Recurrent Units (GRU). Our proposal directly quantifies uncertainty deterministically, without the need for costly sampling-based estimation. We demonstrate how our model can be used to quantitatively and qualitatively measure uncertainty in unsupervised image sequence prediction. To our knowledge, this is the first result describing sampling-free uncertainty estimation for powerful sequential models such as GRUs.
Asynchronous events on the continuous time domain, e.g., social media actions and stock transactions, occur frequently in the world. The ability to recognize occurrence patterns of event sequences is crucial to predict which typeof events will happen next and when. A de facto standard mathematical framework to do this is the Hawkes process. In order to enhance expressivity of multivariate Hawkes processes, conventional statistical methods and deep recurrent networks have been employed to modify its intensity function. The former is highly interpretable and requires small size of training data but relies on correct model design while the latter has less dependency on prior knowledge and is more powerful in capturing complicated patterns. We leverage pros and cons of these models and propose a self-attentive Hawkes process(SAHP). The proposed method adapts self-attention to fit the intensity function of Hawkes processes. This design has two benefits:(1) compared with conventional statistical methods, the SAHP is more powerful to identify complicated dependency relationships between temporal events; (2)compared with deep recurrent networks, the self-attention mechanism is able to capture longer historical information, and is more interpretable because the learnt attention weight tensor shows contributions of each historical event. Experiments on four real-world datasets demonstrate the effectiveness of the proposed method.
This work builds a novel point process and tools to use the Hawkes process with interval-censored data. Such data records the aggregated counts of events solely during specific time intervals -- such as the number of patients admitted to the hospital or the volume of vehicles passing traffic loop detectors -- and not the exact occurrence time of the events. First, we establish the Mean Behavior Poisson (MBP) process, a novel Poisson process with a direct parameter correspondence to the popular self-exciting Hawkes process. The event intensity function of the MBP is the expected intensity over all possible Hawkes realizations with the same parameter set. We fit MBP in the interval-censored setting using an interval-censored Poisson log-likelihood (IC-LL). We use the parameter equivalence to uncover the parameters of the associated Hawkes process. Second, we introduce two novel exogenous functions to distinguish the exogenous from the endogenous events. We propose the multi-impulse exogenous function when the exogenous events are observed as event time and the latent homogeneous Poisson process exogenous function when the exogenous events are presented as interval-censored volumes. Third, we provide several approximation methods to estimate the intensity and compensator function of MBP when no analytical solution exists. Fourth and finally, we connect the interval-censored loss of MBP to a broader class of Bregman divergence-based functions. Using the connection, we show that the current state of the art in popularity estimation (Hawkes Intensity Process (HIP) (Rizoiu et al.,2017b)) is a particular case of the MBP process. We verify our models through empirical testing on synthetic data and real-world data. We find that on real-world datasets that our MBP process outperforms HIP for the task of popularity prediction.
We propose a novel framework for modeling multiple multivariate point processes, each with heterogeneous event types that share an underlying space and obey the same generative mechanism. Focusing on Hawkes processes and their variants that are associated with Granger causality graphs, our model leverages an uncountable event type space and samples the graphs with different sizes from a nonparametric model called {it graphon}. Given those graphs, we can generate the corresponding Hawkes processes and simulate event sequences. Learning this graphon-based Hawkes process model helps to 1) infer the underlying relations shared by different Hawkes processes; and 2) simulate event sequences with different event types but similar dynamics. We learn the proposed model by minimizing the hierarchical optimal transport distance between the generated event sequences and the observed ones, leading to a novel reward-augmented maximum likelihood estimation method. We analyze the properties of our model in-depth and demonstrate its rationality and effectiveness in both theory and experiments.
This paper studies nonparametric estimation of parameters of multivariate Hawkes processes. We consider the Bayesian setting and derive posterior concentration rates. First rates are derived for L1-metrics for stochastic intensities of the Hawkes process. We then deduce rates for the L1-norm of interactions functions of the process. Our results are exemplified by using priors based on piecewise constant functions, with regular or random partitions and priors based on mixtures of Betas distributions. Numerical illustrations are then proposed with in mind applications for inferring functional connec-tivity graphs of neurons.