Degree

Doctor of Philosophy (PhD)

Department

Mathematics

Document Type

Dissertation

Abstract

This research explores both theoretical and practical aspects of disentangled representation learning by extending the VAE framework. We address the core challenge of extracting independent generative factors from observed data while preserving high reconstruction fidelity. To this end, we propose two novel VAE variants: (i) the $\lambda\beta$-VAE, which incorporates an additional $\ell^2$-norm reconstruction loss to improve accuracy, and (ii) the $\gamma\beta$-VAE, which introduces a mutual information regularization term to encourage independence across latent dimensions.
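The two modifications described above can be sketched in code. This is a minimal illustration only: the exact objectives, weights, and the squared-error reconstruction term appear in the dissertation itself, and the function names, default values of $\beta$, $\lambda$, $\gamma$, and the `mi_estimate` placeholder below are assumptions.

```python
import numpy as np

def beta_vae_terms(x, x_hat, mu, logvar):
    """Standard beta-VAE ingredients: a squared-error reconstruction term and
    the closed-form KL divergence between N(mu, diag(exp(logvar))) and N(0, I)."""
    recon = np.sum((x - x_hat) ** 2)
    kl = 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    return recon, kl

def lambda_beta_vae_loss(x, x_hat, mu, logvar, beta=4.0, lam=0.5):
    """Illustrative lambda-beta-VAE objective (an assumption, not the exact
    dissertation formula): beta-VAE loss plus an additional l2-norm
    reconstruction penalty weighted by lambda."""
    recon, kl = beta_vae_terms(x, x_hat, mu, logvar)
    extra_l2 = lam * np.linalg.norm(x - x_hat) ** 2  # the added l2 term
    return recon + beta * kl + extra_l2

def gamma_beta_vae_loss(x, x_hat, mu, logvar, mi_estimate, beta=4.0, gamma=1.0):
    """Illustrative gamma-beta-VAE objective: beta-VAE loss plus a
    mutual-information regularizer encouraging independence across latent
    dimensions. `mi_estimate` is a stand-in for whatever MI estimate the
    model actually uses."""
    recon, kl = beta_vae_terms(x, x_hat, mu, logvar)
    return recon + beta * kl + gamma * mi_estimate
```

In both sketches the regularizer is simply added to the usual $\beta$-weighted ELBO; the dissertation's analysis concerns how these extra terms trade off reconstruction fidelity against latent independence.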

Our theoretical analysis is conducted in a linear Gaussian setting, where we derive optimal solutions for these VAE-based models. We further examine how varying levels of correlation among the ground-truth factors influence disentanglement performance, using established metrics such as SAP and MIG, along with a newly proposed mutual-information-based metric, $I_m$. Empirical evaluations on the dSprites dataset show that introducing explicit architectural constraints can effectively address key limitations of standard VAE approaches.
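Metrics like MIG and the proposed $I_m$ rest on estimates of mutual information between ground-truth factors and latent codes. The exact definition of $I_m$ is given in the dissertation; the following is only a generic plug-in (histogram-based) MI estimator over discretized labels, with all names chosen for illustration.

```python
import numpy as np

def discrete_mutual_information(a, b):
    """Plug-in estimate (in nats) of the mutual information between two
    arrays of non-negative integer labels, computed from their joint
    histogram. A generic estimator, not the dissertation's I_m metric."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Build the empirical joint distribution over (a, b) label pairs.
    joint = np.zeros((a.max() + 1, b.max() + 1))
    for i, j in zip(a, b):
        joint[i, j] += 1
    joint /= joint.sum()
    # Marginals, kept 2-D so the outer product broadcasts cleanly.
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    mask = joint > 0  # avoid log(0) on empty cells
    return float(np.sum(joint[mask] * np.log(joint[mask] / (pa @ pb)[mask])))
```

Identical label sequences give MI equal to the entropy of the labels, while independent sequences give MI near zero, which is the contrast disentanglement metrics exploit when scoring factor–latent correspondence.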

Date

4-15-2025

Committee Chair

Wan, Xiaoliang
