Auditory image understanding for the visually impaired based on a modular computer vision sonification model

Banf, Michael

Citation Link: https://nbn-resolving.org/urn:nbn:de:hbz:467-7716

Source Type

Doctoral Thesis

Author

Banf, Michael

Institute

Institut für Bildinformatik

Subjects

Computer Vision

Sonification

Human-Computer Interaction

Assistive Systems

DDC

004 Computer science

GHBS-Classes

TVVC

TWK

Issue Date

2013

Abstract

This thesis presents a system that strives to give visually impaired people direct perceptual access to images via an acoustic signal. The user actively explores the image on a touch screen or touch pad and receives auditory feedback about the image content at the current position. The design of such a system involves two major challenges: which image information is most useful and relevant, and how can as much of that information as possible be captured in an audio signal? We address these problems with a Modular Computer Vision Sonification Model, which we propose as a general framework for the acquisition, exploration, and sonification of visual information to support visually impaired people. General approaches are presented that combine low-level information, such as color, edges, and roughness, with mid- and high-level information obtained from Machine Learning algorithms. This includes object recognition and the classification of regions into the categories "man-made" versus "natural" based on a novel type of discriminative graphical model. We argue that this multi-level approach gives users direct access to the identity and location of objects and structures in the image, yet it still exploits the potential of recent developments in Computer Vision and Machine Learning. During exploration, the user can use detected man-made structures or specific natural regions as reference points to classify other natural regions by their individual location, color, and texture. We show that congenitally blind participants employ this strategy successfully to interpret and understand whole scenes.

URN

nbn:de:hbz:467-7716

URI

https://dspace.ub.uni-siegen.de/handle/ubsi/771

License

https://dspace.ub.uni-siegen.de/static/license.txt
