Doctor of Philosophy (PhD)


Computer Science

Document Type



Growing volumes and varieties of human event sequence data are available in many applications, such as recommender systems, social networks, medical diagnosis, and predictive policing. Human event sequence data is usually clustered and exhibits self-exciting properties. Machine learning models, especially deep neural network models, have shown great potential for improving the accuracy of future-event prediction. However, current approaches still suffer from several drawbacks, including a lack of model transparency, unfair predictions, and poor prediction accuracy caused by data sparsity and bias. Another issue in modeling human event data is that data collected from the real world is usually incomplete and even biased. Predictive modeling of sparse and biased data can lead to inaccurate models and unfair predictions.

In this work, we design new methodologies for human event sequence data that tackle the above challenges. (1) We propose using a linear mixed model to mimic the local behavior of any complex model on clustered event data, which also improves the fidelity of the explanation method to the complex model. (2) For incomplete and biased event data, we assume that events may be missing between any two observed events. We propose a novel multivariate Hawkes process with a revised likelihood function that integrates missing-window probabilities. We also incorporate event features to regularize the model parameters. We carry out experiments on several real-world datasets covering movie recommendation, medical record diagnosis, disaster rescue events, crime events, and user-item interaction events. The results demonstrate that our models outperform state-of-the-art methods.
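
For readers unfamiliar with the self-exciting property mentioned above, the following is a minimal sketch of the conditional intensity of a standard multivariate Hawkes process with an exponential kernel. It is the textbook form only, not the revised likelihood with missing-window probabilities proposed in this work. The function name `hawkes_intensity` and the parameters `mu`, `alpha`, and `beta` are assumptions chosen for illustration.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Textbook multivariate Hawkes intensity with an exponential kernel.

    t      : time at which to evaluate the intensity
    events : list of (time, dimension) pairs observed so far
    mu     : baseline rate for each dimension (hypothetical values)
    alpha  : alpha[i][j] is how strongly an event in dimension j
             excites dimension i (hypothetical values)
    beta   : decay rate of the exponential kernel (hypothetical value)
    """
    lam = list(mu)  # start from the baseline rates
    for t_k, j in events:
        if t_k < t:  # only strictly earlier events contribute
            decay = beta * math.exp(-beta * (t - t_k))
            for i in range(len(mu)):
                lam[i] += alpha[i][j] * decay
    return lam

# Illustrative values only: one past event in dimension 0 at time 1.0.
events = [(1.0, 0)]
mu = [0.1, 0.2]
alpha = [[0.5, 0.0], [0.3, 0.0]]
print(hawkes_intensity(2.0, events, mu, alpha, beta=1.0))
```

Each past event raises the intensity of the dimensions it excites, and that boost decays over time, which is what produces the clustered event patterns described above.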



Committee Chair

Sun, Mingxuan