Sparse Transformer Hawkes Process for Long Event Sequences
Document Type
Conference Proceeding
Publication Date
1-1-2023
Abstract
Large quantities of asynchronous event sequence data, such as crime records, emergency call logs, and financial transactions, are becoming increasingly available from various fields. These event sequences often exhibit both long-term and short-term temporal dependencies. Variations of neural-network-based temporal point processes have been widely used to model such asynchronous event sequences. However, many current architectures, including attention-based point processes, struggle with long event sequences due to computational inefficiency. To tackle this challenge, we propose an efficient sparse transformer Hawkes process (STHP), which has two components. In the first component, a transformer with a novel temporal sparse self-attention mechanism is applied to event sequences with arbitrary intervals, mainly capturing short-term dependencies. In the second component, a transformer is applied to the time series of aggregated event counts, primarily targeting the extraction of long-term periodic dependencies. The two components complement each other and are fused to model the conditional intensity function of a point process for future event forecasting. Experiments on real-world datasets show that the proposed STHP outperforms baselines and achieves a significant improvement in computational efficiency without sacrificing prediction performance for long sequences.
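To make the two-component design concrete, the following is a minimal sketch (in PyTorch) of how an event-level transformer branch and an aggregated-count transformer branch could be fused to parameterize a non-negative conditional intensity. All layer sizes, the embedding scheme, and the fusion rule are illustrative assumptions rather than the paper's exact architecture; in particular, the proposed temporal sparse self-attention mechanism is not reproduced here, and a standard dense transformer encoder layer stands in for it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchIntensity(nn.Module):
    """Hedged sketch: fuse a short-term event-level branch with a long-term
    count-series branch to parameterize a per-type conditional intensity.
    Sizes, embeddings, and the fusion head are illustrative assumptions."""
    def __init__(self, d_model=32, num_types=3):
        super().__init__()
        # Branch 1: encodes the irregularly spaced event sequence (short-term dependencies).
        self.event_encoder = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True)
        # Branch 2: encodes the aggregated event-count time series (long-term dependencies).
        self.count_encoder = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True)
        self.event_embed = nn.Linear(1 + num_types, d_model)  # (inter-event time, one-hot type)
        self.count_embed = nn.Linear(1, d_model)              # event counts per time bin
        # Fusion head mapping the two sequence summaries to per-type intensities.
        self.head = nn.Linear(2 * d_model, num_types)

    def forward(self, event_feats, count_series):
        # event_feats: (batch, num_events, 1 + num_types); count_series: (batch, num_bins, 1)
        h_event = self.event_encoder(self.event_embed(event_feats))[:, -1]
        h_count = self.count_encoder(self.count_embed(count_series))[:, -1]
        fused = torch.cat([h_event, h_count], dim=-1)
        # Softplus keeps the conditional intensity non-negative.
        return F.softplus(self.head(fused))

# Toy usage with random inputs.
model = TwoBranchIntensity()
events = torch.randn(2, 50, 4)   # 50 events; feature = time gap + 3-way type one-hot
counts = torch.randn(2, 24, 1)   # 24 aggregated count bins
print(model(events, counts).shape)  # torch.Size([2, 3]): intensity per event type
```

In this sketch the last hidden state of each branch serves as its summary; replacing the dense encoder layers with a sparse-attention variant is where the paper's efficiency gains for long sequences would come from.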
Publication Source (Journal or Book title)
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
First Page
172
Last Page
188
Recommended Citation
Li, Z., & Sun, M. (2023). Sparse Transformer Hawkes Process for Long Event Sequences. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 14173 LNAI, 172-188. https://doi.org/10.1007/978-3-031-43424-2_11