Doctor of Philosophy (PhD)
Electrical and Computer Engineering
This dissertation seeks the optimal graphical tree model for low-dimensional representation of vector Gaussian distributions. As a special case, we assume that the population covariance matrix $\Sigma_x$ has an additional latent graphical constraint, namely a latent star topology. We have found the Constrained Minimum Determinant Factor Analysis (CMDFA) and Constrained Minimum Trace Factor Analysis (CMTFA) decompositions of this special $\Sigma_x$, together with the operational meanings of the respective solutions. According to the second interpretation of Wyner's common information, characterizing the CMDFA solution of the special $\Sigma_x$ is equivalent to solving a source coding problem: finding the minimum rate of information required to synthesize a vector whose distribution is arbitrarily close to that of the observed vector. To find an optimal solution to the common information problem for more general population covariance matrices, for which closed-form solutions do not exist, we have proposed a novel neural-network-based approach.
In the theoretical segment of this dissertation, we have shown that for this special $\Sigma_x$ both CMDFA and CMTFA can have either a rank $1$ or a rank $n-1$ solution and nothing in between. For both CMDFA and CMTFA, a rank $1$ solution corresponds to the case where a single latent variable captures all the dependencies among the observables, giving rise to a star topology. We found explicit conditions for both the rank $1$ and the rank $n-1$ solutions, for CMDFA as well as CMTFA. We have also analytically characterized the solution space that CMDFA and CMTFA share despite optimizing different objective functions.
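To make the rank $1$ case concrete, the following is a minimal numerical sketch, not the dissertation's method. It assumes the standard star model $X = aY + Z$ with a scalar latent $Y \sim N(0,1)$ and independent noise $Z \sim N(0, D)$, $D$ diagonal, so that $\Sigma_x = aa^T + D$. The loadings `a`, the function names, and the unit-variance normalization are all hypothetical choices made for illustration. Whether this particular decomposition is the CMDFA or CMTFA optimum depends on the explicit conditions derived in the dissertation, which the sketch does not check.

```python
import numpy as np

def star_covariance(a, d):
    """Covariance of X = a*Y + Z, with Y ~ N(0, 1) and Z ~ N(0, diag(d)) independent."""
    a = np.asarray(a, dtype=float)
    d = np.asarray(d, dtype=float)
    return np.outer(a, a) + np.diag(d)

def gaussian_mutual_information(sigma_x, d):
    """I(X; Y) = 0.5 * log(det(Sigma_x) / det(D)) in nats, for the model above."""
    _, logdet_x = np.linalg.slogdet(sigma_x)
    return 0.5 * (logdet_x - np.sum(np.log(d)))

# Hypothetical 3-dimensional example with unit-variance observables.
a = np.array([0.8, 0.6, 0.5])   # assumed loadings onto the single latent variable
d = 1.0 - a**2                  # diagonal noise variances so that diag(Sigma_x) = 1

sigma_x = star_covariance(a, d)
low_rank_part = sigma_x - np.diag(d)          # equals a a^T

rank = np.linalg.matrix_rank(low_rank_part)   # one latent variable explains all dependence
mi = gaussian_mutual_information(sigma_x, d)  # information the latent carries about X

print("rank of Sigma_x - D:", rank)
print("I(X; Y) in nats:", mi)
```

Subtracting the diagonal $D$ leaves a rank $1$ matrix, which is the algebraic signature of a star topology; the mutual information printed at the end is the quantity a common-information formulation would minimize over admissible decompositions.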
In the computational segment of this dissertation, we have proposed a novel variational approach to solving the common information problem for more general data, i.e., non-star Gaussian data or even non-Gaussian data. Our approach searches for a model that can capture the constraints of the common information problem. We studied the Variational Auto-encoder (VAE) framework as a candidate for capturing these constraints and established connections between the VAE structure and the common information problem. So far we have designed and implemented four different neural-network-based models, all of which incorporate the VAE framework in their structure. We have formulated a set of metrics to quantify how close the results obtained by these models are to the desired benchmarks. The theoretical CMDFA solution obtained for the special cases serves as the benchmark for testing the efficacy of the variational models we designed. For ease of analysis, our investigation so far has been limited to $3$-dimensional data. It has revealed insights into the trade-off between model capacity and the intricacy of the data distribution. Our next plan is to design a hybrid model that combines the useful properties of the different models. We will continue to pursue a variational model capable of finding an optimal common information solution for higher-dimensional data with arbitrary underlying structure.
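The link between a VAE and the common information constraints can be illustrated with a generic VAE objective. The sketch below is not one of the four models described above, whose architectures are not given here. It assumes a diagonal Gaussian encoder, a standard normal prior, and a linear decoder with independent Gaussian outputs; all function names and parameter values are hypothetical. The factorized decoder makes the observables conditionally independent given the latent, mirroring the conditional independence constraint, while the KL term, averaged over the data, upper-bounds the mutual information between data and latent under the encoder.

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) in nats."""
    mu = np.asarray(mu, dtype=float)
    logvar = np.asarray(logvar, dtype=float)
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def factorized_gaussian_nll(x, mean, var):
    """-log p(x | z) for a decoder whose outputs are independent given z."""
    x, mean, var = (np.asarray(v, dtype=float) for v in (x, mean, var))
    return 0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def negative_elbo(x, mu, logvar, a, d, rng):
    """One-sample negative ELBO with decoder mean a*z and noise variances d (assumed form)."""
    # Reparameterization: z = mu + sigma * eps, eps ~ N(0, I).
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(np.shape(mu))
    reconstruction = factorized_gaussian_nll(x, a * z, d)
    rate = kl_to_standard_normal(mu, logvar)
    return reconstruction + rate

# Hypothetical 3-dimensional observation with a 1-dimensional latent.
rng = np.random.default_rng(0)
a = np.array([0.8, 0.6, 0.5])      # assumed decoder loadings
d = 1.0 - a**2                     # assumed decoder noise variances
x = np.array([0.3, -0.1, 0.4])     # made-up data point
mu, logvar = np.array([0.2]), np.array([-0.5])  # made-up encoder outputs for x

loss = negative_elbo(x, mu, logvar, a, d, rng)
print("negative ELBO (one sample):", loss)
```

In a trained model the encoder outputs `mu` and `logvar` and the decoder parameters would be produced by neural networks and optimized jointly; they are fixed by hand here only to keep the example self-contained.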
Hasan, Md Mahmudul, "Efficient Low Dimensional Representation of Vector Gaussian Distributions" (2022). LSU Doctoral Dissertations. 5833.