On the confluence of machine learning and model-based energy minimization methods for computer vision

Dröge, Hannah

doi:10.25819/ubsi/10537

Citation Link: https://doi.org/10.25819/ubsi/10537

Alternate Title

Über das Zusammenwirken von maschinellem Lernen und modellbasierten Energieminimierungsmethoden für Computer Vision

Publication Type

Doctoral Thesis

Author

Dröge, Hannah

Institute

Department Elektrotechnik - Informatik

Subjects

Computer vision

Machine learning

Energy minimization

DDC

004 Computer science

GHBS-Classes

TVUC

TUH

Issue Date

2024

Abstract

Deep learning has achieved great success across a wide range of computer vision applications. However, learning-based methods still have several limitations, particularly in terms of interpretability and guarantees. In contrast, traditional model-based computer vision techniques, built on explicit models derived from an understanding of the specific problem domain, offer a different, interpretable approach to addressing these challenges.
In this work, we analyze and further develop hybrid approaches that combine model-based and learning-based methods in computer vision, introducing four different approaches. We compare the capabilities of model-based and learning-based methods, discuss the value of deep learning for underdetermined problems, present an extended approach for incorporating learning directly into the optimization process, and address problems where the challenge lies in the intrinsic formulation of the problem itself. In doing so, we cover several application areas within computer vision. We start by studying segmentation of a single color image, given only user input in the form of scribbles drawn on that image, and compare how well learning-based methods incorporate the scribble information against a carefully designed model-based approach. Further, we address reconstruction problems, focusing on underdetermined computed tomography reconstructions of lung scans. We integrate a learning-based regularizer into the reconstruction process and explore the space of possible data-consistent reconstructions corresponding to various degrees of pathological malignancy. To integrate neural networks into model-based approaches, we build on recent studies that aim to learn iterative descent directions for minimizing model-based cost functions. By applying Moreau-Yosida regularization, we introduce a method that avoids the need for differentiability. This is a significant improvement over previous approaches, which are limited to continuously differentiable cost functions. For solving matching and assignment problems, we introduce an approach that approximates large permutation matrices and reduces computation and memory costs through non-linear low-rank matrix factorization. We experimentally demonstrate its performance across various model-based and learning-based methods.
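As a small illustration of why Moreau-Yosida regularization removes the differentiability requirement mentioned above, the following sketch smooths the non-differentiable function f(x) = |x|. It is not the method of the thesis, which learns descent directions; it only shows the standard textbook construction. The choice of f, the function names, and the parameter `lam` are assumptions made for this example.

```python
import numpy as np

def prox_abs(x, lam):
    """Proximal operator of f(x) = |x| (soft-thresholding), elementwise."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def moreau_envelope_abs(x, lam):
    """Moreau envelope of |x|: min over y of |y| + (x - y)^2 / (2 * lam)."""
    y = prox_abs(x, lam)  # the minimizing y is the proximal point
    return np.abs(y) + (x - y) ** 2 / (2.0 * lam)

def moreau_grad_abs(x, lam):
    """Gradient of the envelope, (x - prox(x)) / lam; defined at every x."""
    return (x - prox_abs(x, lam)) / lam

def minimize_smoothed_abs(x0, lam=0.5, steps=50):
    """Plain gradient descent on the envelope with step size lam.

    With this step size each update equals one proximal step on |x|.
    """
    x = float(x0)
    for _ in range(steps):
        x -= lam * float(moreau_grad_abs(x, lam))
    return x
```

The envelope has the same minimizer as |x| but is continuously differentiable, so gradient-based schemes can be applied to it even though |x| itself has a kink at zero.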

DOI

10.25819/ubsi/10537

URN

nbn:de:hbz:467-27511

URI

https://dspace.ub.uni-siegen.de/handle/ubsi/2751

License

http://creativecommons.org/licenses/by/4.0/
