Smart Speaker im Dialog. Sprachliche Praktiken mit Voice User Interfaces

Hector, Tim Moritz

doi:10.1515/9783111574332

Citation Link: https://doi.org/10.1515/9783111574332

Alternate Title

Smart Speakers in Dialogue. Linguistic Practices with Voice User Interfaces

Source Type

Doctoral Thesis

Author

Hector, Tim Moritz

Institute

DFG-Sonderforschungsbereich 1187 "Medien der Kooperation"

Subjects

Voice User Interfaces

Voice assistants

Media appropriation

Media linguistics

Human-machine interaction

DDC

400 Language, Linguistics

Source

Berlin, Boston: De Gruyter Brill, 2025. - ISBN 9783111573830

Issue Date

2025

Abstract

The monograph examines linguistic practices in the use of stationary voice-controlled digital assistants (smart speakers) such as Amazon’s Alexa, Google Home, or Apple’s Siri, all of which operate through so-called Voice User Interfaces (VUIs), i.e., voice-based interfaces between humans and machines.

Empirically, the study draws on video and audio recordings from real households, documenting both the initial setup and the everyday use of smart speakers. Methodologically, the work is grounded in conversation analysis and enriched by multimodal video-based interaction analysis and ethnographic perspectives. The analysis covers both dyadic dialogues between humans and VUIs and more complex multi-party interactions. A praxeological understanding of language and media is central: language is conceptualized as an integral part of social practices, while interfaces are understood as situationally constituted between users and digital infrastructures. Drawing on domestication theory, the study also explores how users interact with these devices linguistically, how they integrate them into everyday routines and adapt to them, and how, in turn, the devices shape user practices.

The study’s key findings show that users orient themselves to established conversational routines when interacting with smart speakers. However, specific variations emerge: rigid sequential structures often have to be followed for commands to be executed successfully, which affects how users address the device, manage turn-taking, and perform repairs. Users adapt to these structures; linguistic practices thus become interface practices. Nevertheless, especially in multi-party interactions, users frequently treat VUIs at the surface level as if they were conversational participants. This attribution proves unstable, however, and can shift from moment to moment. At times, linguistic cues ostensibly directed at the smart speaker are in fact aimed at human participants. The users studied demonstrate linguistic competence in distinguishing between addressing a human and addressing a machine.

DOI

10.1515/9783111574332

URN

nbn:de:hbz:467-29203

URI

https://dspace.ub.uni-siegen.de/handle/ubsi/2920

License

http://creativecommons.org/licenses/by/4.0/
