Identifier
etd-05192015-103813
Degree
Master of Science in Electrical Engineering (MSEE)
Department
Electrical and Computer Engineering
Document Type
Thesis
Abstract
Facing the enormous volumes of data available nowadays, we try to extract useful information from the data by properly modeling and characterizing the data. In this thesis, we focus on one particular type of semantic data --- online movie reviews, which can be found on all major movie websites. Our objective is mining movie review data to seek quantifiable patterns between reviews on the same movie, or reviews from the same reviewer. A novel approach is presented in this thesis to achieve this goal. The key idea is converting a movie review text into a list of tuples, where each tuple contains four elements: feature word, category of feature word, opinion word and polarity of opinion word. Then we further convert each tuple into an 18-dimension vector. Given a multinomial distribution representing a movie review, we can systematically and consistently quantify the similarity and dependence between reviews made by the same or different reviewers using metrics including KL distance and distance correlation, respectively. Such comparisons allow us to find reviewers sharing similarity in generated multinomial distributions, or demonstrating correlation patterns to certain extent. Among the identified pairs of frequent reviewers, we further investigate the category-wise dependency relationships between two reviewers, which are further captured by our proposed ordinary least square estimation models. The proposed data processing approaches, as well as the corresponding modeling framework, could be further leveraged to develop classification, prediction, and common randomness extraction algorithms for semantic movie review data.
Date
2015
Document Availability at the Time of Submission
Release the entire work immediately for access worldwide.
Recommended Citation
Pu, Limeng, "Stochastic Modeling of Semantic Structures of Online Movie Reviews" (2015). LSU Master's Theses. 1153.
https://repository.lsu.edu/gradschool_theses/1153
Committee Chair
Wei, Shuangqing
DOI
10.31390/gradschool_theses.1153