PS: Introduction to Psycholinguistics
Winter Term 2005/06
Instructor: Daniel Wiechmann
Office hours: Mon 2-3 pm
Email: [email protected]
Phone: 03641-944534
Web: www.daniel-wiechmann.net
Session 4: Understanding speech Problems with recognition of speech
Segmentation problem (how to separate sounds in speech)
Possible remedies:
Possible-word constraint
Metrical segmentation strategy
Stress-based segmentation
Syllable-based segmentation
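As an illustration of one of these remedies, here is a minimal sketch of the metrical segmentation strategy as it is usually described: the listener posits a word boundary at the onset of every strong (stressed) syllable. The function name, the input format (syllable text plus a strong/weak flag) and the example phrase are assumptions made for this sketch, not part of the lecture material.

```python
def metrical_segmentation(syllables):
    """Group syllables into candidate words.

    syllables: list of (text, is_strong) pairs (hypothetical input format).
    A new candidate word starts at every strong syllable; weak syllables
    are attached to the candidate word that is currently open.
    """
    words = []
    for text, is_strong in syllables:
        if is_strong or not words:
            words.append([text])      # boundary before a strong syllable
        else:
            words[-1].append(text)    # weak syllable joins the open word
    return ["".join(w) for w in words]


# Hypothetical input: "doctor answers" with stress on "doc" and "an"
phrase = [("doc", True), ("tor", False), ("an", True), ("swers", False)]
segments = metrical_segmentation(phrase)
print(segments)
```

The heuristic works here because both words begin with a strong syllable; words that begin with a weak syllable would be split in the wrong place, which is why it is only a strategy and not a full solution.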
Session 4: Understanding speech Categorical perception
Experiment: Liberman et al. (1957)
A speech synthesizer created a continuum of artificial syllables that differed in the place of articulation of one phoneme
Subjects placed the syllables into three categories (/b/, /d/, /g/)
Session 4: Understanding speech Categorical perception
Voice onset time (VOT)
Voiced and unvoiced consonants (e.g. /b/, /d/ vs. /p/, /t/) differ with respect to VOT (difference ~ 60 ms)
Experimenters varied VOT on a scale (e.g. 30 ms)
Subjects make 'either-or' distinctions
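The 'either-or' pattern can be sketched as a step function over the VOT continuum: stimuli are labelled by which side of a category boundary they fall on, not by how far they are from it. The boundary value of 30 ms, the 10 ms step size and the function name are illustrative assumptions, not values reported in the experiments.

```python
def identify(vot_ms, boundary_ms=30.0):
    """Label a stimulus from its VOT alone (assumed boundary: 30 ms)."""
    return "/b/" if vot_ms < boundary_ms else "/p/"


# Hypothetical continuum from 0 to 60 ms VOT in 10 ms steps
continuum = [0, 10, 20, 30, 40, 50, 60]
labels = [identify(v) for v in continuum]

for vot, label in zip(continuum, labels):
    print(f"{vot:2d} ms -> {label}")
```

Stimuli at 0 ms and 20 ms differ by as much as stimuli at 20 ms and 40 ms, but only the second pair receives different labels.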
Session 4: Understanding speech Categorical perception
Selective adaptation
Repeated presentation of /ba/ makes people less sensitive to the voicing feature (the feature detector fatigues)
The cut-off point for the /b/-/p/ distinction shifts toward the /p/ end of the continuum
Session 4: Understanding speech Prelexical (phonetic) vs postlexical (phonemic) code
Prelexical code is computed directly from perceptual analysis (bottom-up)
Postlexical code is computed from higher-level units such as words (top-down)
Foss and Blank (1980): phoneme-monitoring task
But cf. Foss and Gernsbacher (1983) and Marslen-Wilson and Warren (1994)
Session 4: Understanding speech In summary:
There is a controversy about whether or not we identify phonemes before we recognize higher-level units (e.g. syllables or words)
Session 4: Understanding speech The role of context in identifying sounds: the phonemic restoration effect (cf. Warren and Warren 1970)
Session 4: Understanding speech
It was found that the *eel was on the orange
It was found that the *eel was on the axle
It was found that the *eel was on the shoe
It was found that the *eel was on the table
Session 4: Understanding speech
It was found that the peel was on the orange
It was found that the wheel was on the axle
It was found that the heel was on the shoe
It was found that the meal was on the table
Understanding speech Phonemic restoration effect: 2 explanations
1. Context interacts directly with bottom-up processes (sensitivity effect)
2. Context may simply provide an additional source of information (response bias effect)
Understanding speech: Samuel (1981, 1990) Method:
Subjects listened to sentences, and meaningless noise was presented during each sentence
On some trials, the noise was superimposed on one of the phonemes of a word
On other trials, the phoneme was deleted
Finally, the phoneme was sometimes predictable from context
Task
decide whether or not crucial phoneme had been presented
Understanding speech: Samuel (1981, 1990)
Phonemic restoration effect: 2 explanations
Hypotheses
1. If context improves sensitivity, then the ability to discriminate between phoneme plus noise and noise alone should be improved by predictable context
2. If context affects response bias, then participants should simply be more likely to decide that the phoneme was presented when the word was presented in a predictable context
Understanding speech: Samuel (1981, 1990)
Results:
Context affected response bias but not sensitivity
Contextual information does not have a direct effect on bottom-up processing
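The difference between the two hypotheses can be made concrete with the standard signal-detection measures d' (sensitivity) and c (response criterion). The sketch below assumes that a "hit" is a "phoneme present" response on a phoneme-plus-noise trial and a "false alarm" is the same response on a noise-alone trial. The response rates are invented for illustration and are not Samuel's data.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) from hit and false-alarm rates."""
    z = NormalDist().inv_cdf                 # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -0.5 * (z(hit_rate) + z(false_alarm_rate))
    return d_prime, criterion


# Hypothetical rates: in predictable context, "present" responses go up
# on BOTH trial types, which is the signature of a bias shift.
d_neutral, c_neutral = sdt_measures(hit_rate=0.69, false_alarm_rate=0.31)
d_pred, c_pred = sdt_measures(hit_rate=0.84, false_alarm_rate=0.50)

print(f"neutral context:     d' = {d_neutral:.2f}, c = {c_neutral:.2f}")
print(f"predictable context: d' = {d_pred:.2f}, c = {c_pred:.2f}")
```

With these invented numbers d' is nearly identical in both conditions while c becomes more negative (more liberal) in predictable context, which is the pattern hypothesis 2 predicts. Under hypothesis 1, d' itself would have to rise.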
Understanding speech: Models of speech recognition
Most influential models
Motor theory (Liberman et al. 1967)
Listeners mimic the articulatory movements of the speaker
Cohort theory (Marslen-Wilson and Tyler 1980)
TRACE model (McClelland and Elman 1986)
Understanding speech: Models of speech recognition: neurons
Understanding speech: Models of speech recognition: neuron (schematic)
Synapse: The junction across which a nerve impulse passes from an axon terminal to a neuron
Understanding speech: Models of speech recognition: neuronal networks
The brain is composed of an estimated 10-100 billion nerve cells, or neurons, that communicate with one another through specialized contacts called synapses.
Typically, a single neuron receives 2000-5000 synapses from other neurons; these synapses are located almost exclusively on the neuron's dendrites, long projections that radiate out from the neuron's cell body. In turn, the neuron's axon, a long thin process that grows out from the cell body of a neuron, makes synaptic connections with 1000 other neurons. In this way, neuronal signals pass from neuron to neuron to form extensive and elaborate neural circuits.
Understanding speech: Models of speech recognition: number of neurons human brain
Understanding speech: Models of speech recognition: introducing connectionist models
Two central assumptions of artificial neural nets (ANNs):
1) processing occurs through the action of many simple, interconnected processing units (neurons)
2) activation spreads around the network in a way determined by the strength of the links, i.e. the connections between units
Understanding speech: Models of speech recognition: introducing connectionist models
Some models learn, e.g. by back-propagation
Some don't:
The interactive activation model (IAC; McClelland and Rumelhart 1981) does not learn
The TRACE model (McClelland and Elman 1986) is an IAC model
Understanding speech: Models of speech recognition: from neural networks to connectionist models
Connections can be inhibitory or excitatory (facilitatory)
Connections (or links) have different weights
Threshold: the total amount of activation needed to make the node fire
Understanding speech: Models of speech recognition: from neural networks to connectionist models
Example (connection values range from -1 to +1):
+0.6 (excitatory)
-0.5 (inhibitory)
+0.7 (excitatory)
Threshold: 1.0
Ergo: no firing
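The first example can be checked with a minimal threshold unit. The sketch assumes that the net input is simply the sum of the signed connection values shown on the slide and that the node fires when this sum reaches the threshold. The function name and this summation rule are assumptions made for illustration.

```python
def fires(inputs, threshold=1.0):
    """Return (fired, net_input) for a simple threshold unit.

    inputs: signed activation arriving over each connection
            (positive = excitatory, negative = inhibitory).
    """
    net_input = sum(inputs)
    return net_input >= threshold, net_input


fired, net_input = fires([0.6, -0.5, 0.7], threshold=1.0)
print(f"net input = {net_input:.1f}, fired = {fired}")
```

The excitatory input alone (0.6 + 0.7 = 1.3) would exceed the threshold; the inhibitory connection pulls the total down to 0.8, so the node stays silent.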
Understanding speech: Models of speech recognition: from neural networks to connectionist models
Figure (schematic network): connection values range from -1 to +1; values shown: -0.5, +0.9 (excitatory), -0.2 (inhibitory), -0.9, +0.4 (excitatory), +0.5
Threshold: 1.0
Ergo: firing
Understanding speech: Models of speech recognition: from neural networks to connectionist models
Interactive activation network (McClelland and Rumelhart 1981)
Understanding speech: Models of speech recognition: TRACE
TRACE model (McClelland and Elman 1986)
There are individual processing units, or nodes, at three different levels:
FEATURES (place & manner of production, voicing)
PHONEMES
WORDS
Understanding speech: Models of speech recognition: TRACE
TRACE model (McClelland and Elman 1986)
Feature nodes are connected to phoneme nodes
Phoneme nodes are connected to word nodes
Connections between levels operate in both directions, and are only facilitatory (i.e. no inhibition)
Understanding speech: Models of speech recognition: TRACE
TRACE model (McClelland and Elman 1986)
There are connections among units or nodes at the same level
These connections are inhibitory
Understanding speech: Models of speech recognition: TRACE
TRACE model (McClelland and Elman 1986)
Nodes influence each other in proportion to their activation levels and the strength of their interconnections
As excitation and inhibition spread among nodes, a pattern of activation, or TRACE, develops
Understanding speech: Models of speech recognition: TRACE
TRACE model (McClelland and Elman 1986)
The word that is recognized is determined by the activation level of the possible candidate words.
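The interaction of these principles can be sketched in a toy network. This is a strongly simplified illustration and not the TRACE implementation: the feature level is replaced by fixed bottom-up evidence for each phoneme, there are no time slices, each letter of a word stands for one phoneme, and all parameter values and names are arbitrary assumptions.

```python
EXCITE, INHIBIT, DECAY = 0.1, 0.2, 0.1   # arbitrary illustrative parameters


def clamp(x):
    """Keep activation between 0 and 1."""
    return max(0.0, min(1.0, x))


def step(phon, word, evidence):
    """One synchronous update of phoneme and word nodes."""
    new_word = {}
    for w, act in word.items():
        # bottom-up excitation from the word's phonemes (between levels)
        support = sum(phon[(i, p)] for i, p in enumerate(w))
        # lateral inhibition from competing words (within level)
        rivals = sum(a for other, a in word.items() if other != w)
        new_word[w] = clamp(act + EXCITE * support - INHIBIT * rivals - DECAY * act)
    new_phon = {}
    for (i, p), act in phon.items():
        # top-down excitation from words containing this phoneme here
        top_down = sum(a for w, a in word.items() if w[i] == p)
        # lateral inhibition from other phonemes in the same position
        rivals = sum(a for (j, q), a in phon.items() if j == i and q != p)
        new_phon[(i, p)] = clamp(
            act + evidence[(i, p)] + EXCITE * top_down - INHIBIT * rivals - DECAY * act
        )
    return new_phon, new_word


# Hypothetical input: the first sound is ambiguous, but more /b/-like than /p/-like
evidence = {(0, "b"): 0.2, (0, "p"): 0.1, (1, "a"): 0.2, (2, "t"): 0.2}
phon = {node: 0.0 for node in evidence}
word = {"bat": 0.0, "pat": 0.0}

for _ in range(20):
    phon, word = step(phon, word, evidence)

winner = max(word, key=word.get)   # most active candidate word is recognized
print(winner, word)
```

Both candidate words gain activation at first because they share two phonemes. The small initial advantage of /b/ is then amplified by within-level inhibition and top-down feedback until one word node clearly dominates.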