
Speech and Signal Processing Lab

Research Activities and Project Links

For publications, see the publications page.

Also see Dr. Johnson's research and publications pages.

Short Summaries of Recent Projects

Acoustic-Articulator Modeling for Pronunciation Analysis

(with Dr. Jeffrey Berry in the MU Speech and Swallowing Lab, Speech Pathology and Audiology)
Electromagnetic Articulography (EMA) for Computer Aided Language Learning
EMA-MAE: EMA database of Mandarin-Accented English

To support effective learning and provide specific, useful pronunciation feedback, Computer Aided Language Learning (CALL) systems for pronunciation correction must capture pronunciation errors and accurately identify and describe the underlying errors in articulation. Doing so requires estimating articulator trajectory patterns from users’ acoustic data. Because acoustic-articulator inversion is difficult and articulator patterns differ in complex ways between speakers, this capacity is not yet well developed. Current systems are limited in the specificity of their corrective feedback: they often provide only a “good versus bad” pronunciation match to the target and, at best, the general category of pronunciation error. This project, which recently received initial funding from the NSF through the EAGER program, aims to address these key limitations by collecting a matched corpus of acoustic and five-degree-of-freedom electromagnetic articulograph (EMA) data from both native American English (L1) speakers and native Mandarin Chinese speakers who speak English as a second language (L2). The corpus has the potential to support a variety of research efforts, including pronunciation variation modeling, acoustic-articulator inversion, L2-L1 speaker comparisons, pronunciation error detection, and corrective feedback for accent modification.

Intelligibility evaluation and enhancement

(with Dr. Jeffrey Berry in the MU Speech and Swallowing Lab, Speech Pathology and Audiology, and the Knowledge in Information and Discovery (KID) Lab)

In the past few decades, researchers have made substantial progress in developing methods for evaluating and enhancing perceived speech quality in noisy environments. There has not been similar progress in the area of speech intelligibility. It has recently been shown that while a great many speech enhancement approaches give statistically significant improvements in perceived signal quality, none lead to statistically significant improvements in intelligibility in more than one noise environment. Current enhancement methods do not improve signal intelligibility in a substantial way. It can be argued that the use of quality rather than intelligibility as the primary evaluation metric has led to misguided research directions, with incremental improvements in quality coming at the expense of intelligibility. We are currently working to address this issue by developing more accurate metrics for objectively estimating speech intelligibility and, in association with this, enhancement methods that draw on our understanding of perception and intelligibility to improve signal intelligibility more effectively.

The Dr. Dolittle Project

(with Disney's Animal Kingdom, U. Connecticut Department of Animal Sciences and National Underwater Research Laboratory, and FAUNA Research Institute)

The fundamental goal of this research project is to develop a broadly usable framework for pattern analysis and classification of animal vocalizations by integrating successful models and ideas from the field of speech processing and recognition into bioacoustics. Tasks include automatic vocalization classification and labeling, individual identification, call type classification, behavioral vocalization correlations, language acquisition, and seismic infrasonic communication. Species targeted for study include domestic and agricultural animals, marine mammals, and several endangered species, in collaboration with researchers at a number of other institutions.

Acoustic censusing using automatic vocalization classification and identity recognition

(with Dr. Pete Scheifele, University of Cincinnati Fetch~Lab, University of Cincinnati Medical Center)

One of the outcomes of the Dolittle project was the realization that accurate individual identification is possible across a wide range of animal species, and that this could lead to significant improvements in methods for tasks such as acoustic censusing, which is important for the many vocal species that are difficult to census visually. This has led to several continuing projects and current proposal efforts to develop acoustic censusing methods based on speech processing and speaker identification technology.

Speech Recognition using Dynamical Systems

(with Knowledge in Information and Discovery (KID) Lab)

This research project focuses on applying state-of-the-art techniques for time-series modeling to the problem of characterizing speech signals. These time-series techniques combine state-space embedding methods and learning algorithms to create highly accurate non-linear models of a system's state. The time-delay embedding technique, taken from dynamical systems theory, is used to reconstruct the state spaces of the speech waveforms, which are characterized statistically and used to differentiate individual phonemes for isolated and continuous speech recognition.
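As a rough illustration of the time-delay embedding step described above, the following Python sketch reconstructs a state space from a one-dimensional waveform and summarizes it with simple statistics. It is not the lab's implementation: the function name time_delay_embed, the embedding dimension (3), the lag (6 samples), the 16 kHz sampling rate, and the synthetic sine "frame" are all assumptions chosen for illustration, and the mean and covariance stand in for whatever statistical characterization the project actually uses.

```python
import numpy as np


def time_delay_embed(signal, dim, lag):
    """Return the time-delay embedding of a 1-D signal.

    Row n of the result is [x[n], x[n + lag], ..., x[n + (dim - 1) * lag]],
    i.e. one point in the reconstructed state space.
    """
    x = np.asarray(signal, dtype=float)
    n_points = x.size - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal too short for this dim and lag")
    # Each column is the signal shifted by a multiple of the lag.
    columns = [x[i * lag : i * lag + n_points] for i in range(dim)]
    return np.stack(columns, axis=1)


# Hypothetical input: a 200 Hz sine sampled at 16 kHz, standing in for a
# short voiced speech frame (assumption, not data from the project).
t = np.arange(400) / 16000.0
frame = np.sin(2 * np.pi * 200 * t)

# Reconstruct the state space with assumed dim=3 and lag=6 samples.
points = time_delay_embed(frame, dim=3, lag=6)

# One simple statistical characterization of the embedded point cloud:
# its mean vector and covariance matrix.
mean = points.mean(axis=0)
cov = np.cov(points, rowvar=False)
```

In a recognition setting, statistics like these (or a richer density model fitted to the embedded points) would be computed per phoneme and compared against those of an unknown segment; the choice of dimension and lag is signal dependent and would need to be tuned.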


COLLEGE OF ENGINEERING

Contact us

College of Engineering
1515 W. Wisconsin Ave.
Milwaukee, WI 53233

Dean's office: (414) 288-6591
Prospective students: (414) 288-7302
Academic Advising Center: (414) 288-6000