Citation Link: https://nbn-resolving.org/urn:nbn:de:hbz:467-7716
Auditory image understanding for the visually impaired based on a modular computer vision sonification model
Source Type
Doctoral Thesis
Author
Institute
Issue Date
2013
Abstract
This thesis presents a system that strives to give visually impaired people direct perceptual access to images via an acoustic signal. The user explores the image actively on a touch screen or touch pad and receives auditory feedback about the image content at the current position. The design of such a system involves two major challenges: what is the most useful and relevant image information, and how can as much information as possible be captured in an audio signal. We address those problems based on a Modular Computer Vision Sonification Model, which we propose as a general framework for acquisition, exploration and sonification of visual information to support visually impaired people. General approaches are presented that combine low-level information, such as color, edges, and roughness, with mid- and high-level information obtained from Machine Learning algorithms. This includes object recognition and the classification of regions into the categories "man-made" versus "natural" based on a novel type of discriminative graphical model. We argue that this multi-level approach gives users direct access to the identity and location of objects and structures in the image, yet it still exploits the potential of recent developments in Computer Vision and Machine Learning. During exploration, the user can utilize detected man-made structures or specific natural regions as reference points to classify other natural regions by their individual location, color and texture. We show that congenitally blind participants employ that strategy successfully to interpret and understand whole scenes.
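The thesis's concrete sonification mappings are not given in this abstract; as a minimal sketch of the low-level idea — the color under the user's finger driving sound parameters — one might map hue to pitch and brightness to loudness. The function name, parameter ranges, and mapping below are illustrative assumptions, not the scheme used in the thesis:

```python
import colorsys

def sonify_pixel(r, g, b):
    """Map an RGB pixel (components 0-255) to oscillator parameters.

    Hue selects pitch and value (brightness) selects loudness. This
    particular mapping is an illustrative assumption, not the mapping
    described in the thesis.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    freq_hz = 220.0 + h * (880.0 - 220.0)  # hue sweeps A3..A5
    amplitude = v                          # brighter pixels sound louder
    return freq_hz, amplitude

# Pure red has hue 0 and full brightness, so it maps to the
# lowest frequency at maximum amplitude.
print(sonify_pixel(255, 0, 0))  # (220.0, 1.0)
```

In a complete system such a per-pixel mapping would be queried continuously at the touch position and fed to a real-time synthesizer, with mid- and high-level cues (object identity, "man-made" vs. "natural") layered on top.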
File(s)
Name
banf.pdf
Size
43.52 MB
Format
Adobe PDF
Checksum
(MD5):e6b4eb30cbade6e442e5aed4c2089f38