Auditory and Visual Characteristics of Individual Talkers in Multimodal Speech Perception

Auditory and Visual Characteristics of Individual Talkers in Multimodal Speech Perception

Author: Corinne D. Anderson
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Abstract: When people think about understanding speech, they primarily think of perceiving it auditorily (via hearing); in fact, there are two key components to speech perception: auditory and visual. Speech perception is a multimodal process, i.e., one that combines more than one sense and involves the integration of auditory information and visual cues. Visual cues can supplement missing auditory information: when the auditory signal is compromised, such as in a noisy environment, seeing a talker's face can help a listener understand speech. Interestingly, auditory and visual integration occurs all of the time, even when both signals are perfectly intelligible. The role that visual cues play in speech perception is evidenced in the McGurk effect, which demonstrates how auditory and visual cues are integrated (McGurk and MacDonald, 1976). Previous studies of audiovisual speech perception suggest several factors affecting auditory and visual integration. One factor is the characteristics of the auditory and visual signals, i.e., how much information each signal must carry for listeners to integrate auditory and visual cues optimally. A second factor is the auditory and visual characteristics of individual talkers, e.g., visible cues such as mouth opening or acoustic cues such as speech clarity, that might facilitate integration. A third factor is the characteristics of the individual listener, such as central auditory or visual abilities, that might facilitate greater or lesser degrees of integration (Grant and Seitz, 1998). The present study focused on the second factor, examining auditory and visual talker characteristics and their effect on listeners' auditory and visual integration. Preliminary results show considerable variability across talkers in the auditory-only condition, suggesting that different talkers have different degrees of auditory intelligibility. Interestingly, there were also substantial differences in the amount of audiovisual integration elicited by different talkers, and these differences were not highly correlated with auditory intelligibility, suggesting that the talkers with optimal auditory intelligibility are not the same talkers that facilitate optimal audiovisual integration.

Visual and Auditory Factors Facilitating Multimodal Speech Perception

Author: Pamela Ver Hulst
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Abstract: Speech perception is often described as a unimodal process, when in reality it involves the integration of multiple sensory modalities, specifically vision and hearing. Individuals use visual information to fill in missing pieces of auditory information when hearing is compromised, such as by a hearing loss. However, individuals use visual cues even when auditory cues are perfect, and cannot ignore the integration that occurs between auditory and visual inputs when listening to speech. It is well known that individuals differ in their ability to integrate auditory and visual speech information, and likewise that some individuals produce clearer speech signals than others, either auditorily or visually. Clark (2005) found that, in a study of the McGurk effect, some talkers produced much stronger 'integration effects' than others. One possible underlying mechanism of auditory + visual integration is the substantial redundancy found in the auditory speech signal. But how much redundancy is necessary for effective integration? And what auditory and visual characteristics make a good integration talker? The present study examined these questions by comparing the auditory intelligibility, visual intelligibility, and degree of integration for speech sounds that were highly reduced in auditory redundancy, produced by 7 different talkers. Participant performance was examined under four conditions: 1) degraded auditory only, 2) visual only, 3) degraded auditory + visual, and 4) non-degraded auditory + visual. Results indicate across-talker differences in auditory and auditory + visual intelligibility. Degrading the auditory stimulus did not affect the overall amount of McGurk-type integration, but did influence the type of McGurk integration observed.

Analysis of Talker Characteristics in Audio-visual Speech Integration

Author: Kelly Dietrich
Publisher:
ISBN:
Category :
Languages : en
Pages : 68

Book Description
Abstract: Speech perception is commonly thought of as an auditory process, but in actuality it is a multimodal process that integrates both auditory and visual information. In certain situations where auditory information has been compromised, such as due to a hearing impairment or a noisy environment, visual cues help listeners to fill in missing pieces of auditory information during communication. Interestingly, even when both auditory and visual cues are entirely comprehensible alone, both are taken into account during speech perception. McGurk and MacDonald (1976) demonstrated that listeners not only benefit from the addition of visual cues during speech perception in situations where there is a lack of auditory information, but also that speech perception naturally employs audio-visual integration when both cues are available. Although a growing body of research has demonstrated that listeners integrate auditory and visual information during speech perception, there is a significant degree of variability seen in the audio-visual integration and benefit of listeners. Grant and Seitz (1998) demonstrated that the variability in audio-visual speech integration is, in part, a result of individual listener differences in multimodal integration ability. We suggest that individual characteristics of both the auditory signal and talker might also influence the audio-visual speech integration process (Andrews, 2007; Hungerford, 2007; Huffman, 2007). Research from our lab has demonstrated a significant amount of variability in the performance of listeners on tasks of degraded auditory-only and audio-visual speech perception. Furthermore, these studies have revealed a significant amount of variability across different talkers in the degree of integration they elicit. The amount of information in the auditory signal clearly has an effect on audio-visual integration. 
However, to fully understand how different talkers and the varying information in the auditory signal impact audio-visual performance, the speech waveform must be analyzed so that acoustic characteristics can be compared directly with subject performance. The present study conducted a spectrographic analysis of the speech syllables of different talkers used in a previous perception study to evaluate individual acoustic characteristics. Behavioral confusion matrices allowed us to examine the confusions demonstrated by listeners. Some of the behavioral confusions were readily explained by examining syllable formant tracks; others were explained by the possibility that noise introduced into the waveform when the stimuli were degraded obscured subtle differences in the voice onset time of some confused syllables. Still other confusions were not easily explained by the analysis completed in the present study. The results of the present study provide a foundation for understanding the aspects of the acoustic waveform and talker qualities that are desirable for optimal audio-visual speech integration, and they might also have implications for the design of future aural rehabilitation programs.

Auditory and Visual Information Facilitating Speech Integration

Author: Brandie Andrews
Publisher:
ISBN:
Category :
Languages : en
Pages :

Book Description
Abstract: Speech perception is often thought to be a unimodal process (using one sense) when, in fact, it is a multimodal process that uses both auditory and visual inputs. In certain situations where the auditory signal has become compromised, the addition of visual cues can greatly improve a listener's ability to perceive speech (e.g., in a noisy environment or because of a hearing loss). Interestingly, there is evidence that visual cues are used even when the auditory signal is completely intelligible, as demonstrated in the McGurk Effect, in which simultaneous presentation of an auditory syllable "ba" with a visual syllable "ga" results in the perception of the sound "da," a fusion of the two inputs. Audiovisual speech perception ability varies widely across listeners; individuals integrate different amounts of auditory and visual information to understand speech. It is suggested that characteristics of the listener, characteristics of the auditory and visual inputs, and characteristics of the talker may all play a role in the variability of audiovisual integration. The present study explored the possibility that differences in talker characteristics (unique acoustic and visual characteristics of articulation) might be responsible for some of the variability in a listener's ability to perceive audiovisual speech. Ten listeners were presented with degraded auditory, visual, and audiovisual speech syllable stimuli produced by fourteen talkers. Results indicated substantial differences in intelligibility across talkers under the auditory-only condition, but little variability in visual-only intelligibility. In addition, talkers produced widely varying amounts of audiovisual integration, but interestingly, the talkers producing the most audiovisual integration were not those with the highest auditory-only intelligibility.

Audiovisual Speech Processing

Author: Gérard Bailly
Publisher: Cambridge University Press
ISBN: 1107006821
Category : Computers
Languages : en
Pages : 507

Book Description
This book presents a complete overview of all aspects of audiovisual speech including perception, production, brain processing and technology.

Integrating Face and Voice in Person Perception

Author: Pascal Belin
Publisher: Springer Science & Business Media
ISBN: 1461435854
Category : Medical
Languages : en
Pages : 384

Book Description
This book follows a successful symposium organized in June 2009 at the Human Brain Mapping conference. The topic is at the crossroads of two domains of increasing importance and appeal in the neuroimaging/neuroscience community: multimodal integration and social neuroscience. Most of our social interactions involve combining information from both the face and the voice of other persons: speech information, but also crucial nonverbal information about the person's identity and affective state. The cerebral bases of the multimodal integration of speech have been intensively investigated; by contrast, only a few studies have focused on nonverbal aspects of face-voice integration. This work highlights recent advances in investigations of the behavioral and cerebral bases of face-voice multimodal integration in the context of person perception, focusing on the integration of affective and identity information. Several research domains are brought together: behavioral and neuroimaging work in normal adult humans is presented alongside evidence from other domains to provide complementary perspectives, with studies in human children for a developmental perspective, studies in non-human primates for an evolutionary perspective, and studies in human clinical populations for a clinical perspective.

The Handbook of Speech Perception

Author: Jennifer S. Pardo
Publisher: John Wiley & Sons
ISBN: 111918407X
Category : Language Arts & Disciplines
Languages : en
Pages : 784

Book Description
A wide-ranging and authoritative volume exploring contemporary perceptual research on speech, updated with new original essays by leading researchers. Speech perception is a dynamic area of study that encompasses a wide variety of disciplines, including cognitive neuroscience, phonetics, linguistics, physiology and biophysics, auditory and speech science, and experimental psychology. The Handbook of Speech Perception, Second Edition, is a comprehensive and up-to-date survey of technical and theoretical developments in perceptual research on human speech. Offering a variety of perspectives on the perception of spoken language, this volume provides original essays by leading researchers on the major issues and most recent findings in the field. Each chapter provides an informed and critical survey, including a summary of current research and debate, clear examples and research findings, and discussion of anticipated advances and potential research directions. The timely second edition of this valuable resource:
- Discusses a uniquely broad range of both foundational and emerging issues in the field
- Surveys the major areas of the field of human speech perception
- Features newly commissioned essays on the relation between speech perception and reading, features in speech perception and lexical access, perceptual identification of individual talkers, and perceptual learning of accented speech
- Includes essential revisions of many chapters original to the first edition
- Offers critical introductions to recent research literature and leading field developments
- Encourages the development of multidisciplinary research on speech perception
- Provides readers with a clear understanding of the aims, methods, challenges, and prospects for advances in the field
The Handbook of Speech Perception, Second Edition, is ideal for both specialists and non-specialists throughout the research community looking for a comprehensive view of the latest technical and theoretical accomplishments in the field.

Perceiving Talking Faces

Author: Dominic W. Massaro
Publisher: MIT Press
ISBN: 9780262133371
Category : Language Arts & Disciplines
Languages : en
Pages : 524

Book Description
This book discusses the author's experiments on the use of multiple cues in speech perception and other areas and unifies the results through a logical model of perception.

Multimodality in Language and Speech Systems

Author: Björn Granström
Publisher: Springer Science & Business Media
ISBN: 9401723672
Category : Computers
Languages : en
Pages : 264

Book Description
This book is based on contributions to the Seventh European Summer School on Language and Speech Communication that was held at KTH in Stockholm, Sweden, in July of 1999 under the auspices of the European Language and Speech Network (ELSNET). The topic of the summer school was "Multimodality in Language and Speech Systems" (MiLaSS). The issue of multimodality in interpersonal, face-to-face communication has been an important research topic for a number of years. With the increasing sophistication of computer-based interactive systems using language and speech, the topic of multimodal interaction has received renewed interest both in terms of human-human interaction and human-machine interaction. Nine lecturers contributed to the summer school with courses on specialized topics ranging from the technology and science of creating talking faces to human-human communication, which is mediated by computer for the handicapped. Eight of the nine lecturers are represented in this book. The summer school attracted more than 60 participants from Europe, Asia and North America, representing not only graduate students but also senior researchers from both academia and industry.

Trends in Experimental Psychology Research

Author: Diane T. Rosen
Publisher: Nova Publishers
ISBN: 9781594544644
Category : Psychology
Languages : en
Pages : 304

Book Description
This new book includes within its scope original research on basic processes of cognition, learning, memory, imagery, concept formation, problem-solving, decision-making, thinking, reading, and language processing.