Citation Link: https://doi.org/10.25819/ubsi/10472
Optimizing the latent space of deep generative models
Alternate Title
Optimierung des latenten Raums von tiefen generativen Modellen
Source Type
Doctoral Thesis
Author
Saseendran, Amrutha
Institute
Issue Date
2023
Abstract
Deep generative models are powerful machine learning models used to model high-dimensional, complex data distributions. The rich and semantically expressive latent representations learned by these models are used for various downstream applications in computer vision and natural language processing. The effectiveness of generative techniques therefore depends heavily on the quality of the learned representations. Hence, in this dissertation, we focus on improving the desirable properties of the learned latent space of two popular deep generative models: Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Specifically, we focus on properties such as generalizability, controllability, smoothness, and adversarial robustness.
In the first technical contribution of this work, we focus on improving the controllability of latent representations in GANs to generate high-quality images. Specifically, we propose a method that controls the content of the generated images solely through a specified number of objects from multiple classes, and we introduce a state-of-the-art conditioned adversarial network. We also introduce a real-world count-based dataset called CityCount to validate our results in challenging scenarios.
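As a rough, hypothetical sketch of the count-conditioning idea (not the architecture proposed in the thesis), the snippet below feeds a per-class object-count vector into a GAN generator alongside the noise vector; all module names, layer sizes, and class counts are illustrative assumptions.

```python
# Illustrative sketch: conditioning a GAN generator on per-class object counts.
# The count vector is embedded and concatenated with the noise vector, so the
# generated content is steered by "how many objects of each class" rather than
# by a single class label. Sizes and names are placeholders, not the thesis model.
import torch
import torch.nn as nn

class CountConditionedGenerator(nn.Module):
    def __init__(self, noise_dim=128, num_classes=3, max_count=5, img_pixels=64 * 64):
        super().__init__()
        # Embed the integer counts per class into a dense conditioning vector.
        self.count_embedding = nn.Linear(num_classes, 32)
        self.net = nn.Sequential(
            nn.Linear(noise_dim + 32, 256),
            nn.ReLU(),
            nn.Linear(256, img_pixels),
            nn.Tanh(),  # images scaled to [-1, 1]
        )
        self.max_count = max_count

    def forward(self, noise, counts):
        # counts: (batch, num_classes) integer tensor, e.g. [[2, 0, 1], ...]
        cond = self.count_embedding(counts.float() / self.max_count)
        return self.net(torch.cat([noise, cond], dim=1))

# Usage: request two objects of class 0 and one of class 2 in every sample.
gen = CountConditionedGenerator()
z = torch.randn(4, 128)
counts = torch.tensor([[2, 0, 1]] * 4)
fake_images = gen(z, counts)  # (4, 4096); reshape to 64x64 for visualization
```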
Next, we explore the learned representations of VAEs and some of the practical limitations associated with them. To this end, we propose a simple, novel, end-to-end trainable deterministic autoencoding method that efficiently structures the latent space of the model during training and leverages the capacity of expressive multimodal latent distributions. We demonstrate the potential of the proposed method for modeling both continuous and discrete data structures. Finally, we investigate the adversarial robustness of the learned representations in VAEs. A major limitation of existing robust VAE models is the trade-off between the quality of image generation and the robustness achieved. We show that the proposed regularized deterministic autoencoders, combined with a comparatively cheap adversarial learning scheme, learn representations that exhibit superior robustness to adversarial attacks without compromising the quality of image generation.
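To make the general idea concrete, the following is a minimal, illustrative sketch assuming a deterministic encoder/decoder, a fixed Gaussian-mixture-style latent target, and a one-step (FGSM-style) adversarial perturbation of the input; it is not the specific regularizer or training scheme proposed in the thesis, and all sizes and loss weights are assumptions.

```python
# Illustrative sketch: a deterministic autoencoder whose latent codes are pulled
# toward a multimodal (mixture-of-centers) latent target, combined with a cheap
# one-step adversarial training step. Not the exact method from the thesis.
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim, input_dim = 8, 784

encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, input_dim))

# Fixed mixture centers acting as a simple multimodal latent target.
mixture_centers = torch.randn(4, latent_dim)

def latent_regularizer(z):
    # Pull each code toward its nearest mixture center (a crude stand-in for
    # matching the aggregate posterior to a multimodal prior).
    dists = torch.cdist(z, mixture_centers)          # (batch, num_centers)
    return dists.min(dim=1).values.pow(2).mean()

def fgsm_perturb(x, eps=0.05):
    # One-step adversarial perturbation of the input against the reconstruction loss.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.mse_loss(decoder(encoder(x_adv)), x)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
x = torch.rand(32, input_dim)                        # placeholder batch

for step in range(3):
    x_adv = fgsm_perturb(x)
    z = encoder(x_adv)                               # encode the perturbed input
    recon = decoder(z)
    loss = F.mse_loss(recon, x) + 0.1 * latent_regularizer(z)
    opt.zero_grad()
    loss.backward()
    opt.step()
```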
File(s)
Name
Dissertation_Saseendran_Amrutha.pdf
Size
13.11 MB
Format
Adobe PDF
Checksum
(MD5):2bb34cd494e45ebb8ef8d605f99ace18
Owning collection