Citation Link: https://doi.org/10.1515/9783111574332
Smart Speaker im Dialog. Sprachliche Praktiken mit Voice User Interfaces
Alternate Title
Smart Speakers in Dialogue. Linguistic Practices with Voice User Interfaces
Source Type
Doctoral Thesis
Author
Subjects
Voice User Interfaces
Voice assistants
Media appropriation
Media linguistics
Human-machine interaction
DDC
400 Language, Linguistics
Source
Berlin, Boston: De Gruyter Brill, 2025. - ISBN 9783111573830
Issue Date
2025
Abstract
The monograph investigates linguistic practices in the use of stationary voice-controlled digital assistants (smart speakers) such as Amazon’s Alexa, Google Home, or Apple’s Siri – all of which operate through so-called Voice User Interfaces (VUIs), i.e., voice-based interfaces between humans and machines.
Empirically, the study draws on video and audio recordings from real households, documenting both the initial setup and everyday use of smart speakers. Methodologically, the work is grounded in conversation analysis and enriched by multimodal video-based interaction analysis and ethnographic perspectives. The analysis focuses on both dyadic dialogues between humans and VUIs and more complex multi-party interactions. A praxeological understanding of language and media is central: language is conceptualized as an integral part of social practices, while interfaces are understood as situationally constituted between users and digital infrastructures. Drawing on domestication theory, the study also explores how users interact with these devices linguistically, how they integrate them into everyday routines, adapt to them—and how, in turn, the devices shape user practices.
The study’s key findings show that users orient themselves to established conversational routines when interacting with smart speakers. However, specific variations emerge: rigid sequential structures often need to be followed for successful execution of commands, which impacts how users address the device, manage turn-taking, and perform repairs. Users adapt to these structures—linguistic practices thus become interface practices. Nevertheless, especially in multi-party interactions, users frequently treat VUIs at the surface level as if they were conversational participants. Yet this attribution proves unstable and can shift from moment to moment. At times, linguistic cues directed at the smart speaker may instead target human participants. The users studied demonstrate linguistic competence in distinguishing between addressing a human and addressing a machine.
File(s)
Name
Dissertation_Hector_Tim.pdf
Size
14.56 MB
Format
Adobe PDF
Checksum
(MD5): 06c23d35ac4eda2d561705b5e667deed
Owning collection