Keeping words out of your ears

This is the first post of what I hope will be a regular series of posts on topics in phonetics, phonology, linguistics, and any of the many things that I can connect to these topics. At times, I will indulge in polemic, but for the most part my purpose is to write informally about what I’m thinking about these topics. Comments and competing polemics are welcome!

Lately, I’ve been trying to work out how best to follow up experiments in which we’ve pitted the listeners’ application of their linguistic knowledge against an auditory process that may be linguistically naive.

The auditory process produces perceptual contrast between the target sound and a neighboring sound, its context. (See Lotto & Kluender, 1998, in Perception & Psychophysics for the first demonstration of such effects, and Lotto & Holt, 2006, also in P&P, for a general discussion of contrast effects. Contrast is the auditory alternative to compensation for coarticulation; see Fowler, 2006, for discussion and arguments against contrast as a perceptual effect. I’ll come back to the contrast versus compensation debate in future posts. For the time being, it’s enough that the context makes the target sound seem more different from the context than it otherwise would. I’ll describe that effect as “contrast,” but it could also be described as “compensation for coarticulation.”)

For example, we have shown that listeners are more likely to respond “p” to a stop from a [p-t] continuum following [i] than following [u]. They do so because [p] ordinarily concentrates energy at much lower frequencies in the spectrum than [t] does, while [i] concentrates it at much higher frequencies than [u] does. Thus, a stop whose energy concentration is halfway between [t]’s high value and [p]’s low one will sound lower, i.e., more like [p], next to a sound like [i] that concentrates energy at high frequencies (and higher, i.e., more like [t], next to a sound like [u] that concentrates energy at low frequencies).
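
To make the direction of that effect concrete, here is a toy sketch of a boundary-shift account. It is not our model or our analysis; the function, the boundary, the slope, and the size of the shift are all invented purely for illustration.

    import numpy as np

    def p_response_prob(centroid_khz, context, boundary_khz=3.0, slope=2.0, shift_khz=0.3):
        """Toy logistic model of "p" responses along a [p]-[t] continuum.

        centroid_khz -- spectral center of gravity of the stop burst
                        (low = [p]-like, high = [t]-like)
        context      -- "i" (high-frequency vowel) or "u" (low-frequency vowel)

        After [i] the stop sounds lower, so the category boundary moves up and
        more of the continuum is heard as "p"; after [u], the reverse.
        Every number here is invented for illustration.
        """
        shift = shift_khz if context == "i" else -shift_khz
        return 1.0 / (1.0 + np.exp(slope * (centroid_khz - (boundary_khz + shift))))

    continuum_khz = np.linspace(2.0, 4.0, 7)  # a 7-step [p]-[t] continuum
    for vowel in ("i", "u"):
        print(vowel, np.round(p_response_prob(continuum_khz, vowel), 2))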

The linguistic knowledge our experiments tested is knowledge of what’s a word. That knowledge can either cooperate with this auditory effect, as for example when the preceding context is “kee_” [ki_], where “p” but not “t” makes a word, keep, or it can conflict, as when the preceding context is “mee_” [mi_] instead, where “t” and not “p” makes a word, meet.

We describe both effects as “biases” and distinguish them as “contrast” versus “lexical” biases.  In these stimuli, the preceding [i] or [u] is the source of the contrast bias, while the consonant preceding that vowel is the source of the lexical bias.  (The lexical bias is also known in the literature as the “Ganong” effect, after William Ganong, who first described it in a 1980 paper in the Journal of Experimental Psychology: Human Perception and Performance.)

All of our experiments so far have used materials like these, where the context that creates the contrast and the one that creates the lexical bias both occur in the same syllable. (The relative order of the target sound, the context sound, and the sound that determines the lexical bias has been manipulated across experiments. If anyone wants to know, I can provide a full list of the stimuli.) Those experiments have shown that the two biases are effectively independent of one another.
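
For readers who want to see what “effectively independent” could look like in an analysis, here is a rough sketch, on invented data, of fitting the two biases and their interaction in a logistic regression. It illustrates the logic only; it is not our actual design or analysis, and it assumes numpy, pandas, and statsmodels.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n = 800

    # Invented trials: a contrast context ([i] vs. [u]), a lexical context
    # (whether "p" or "t" makes a word), and a binary "p" response.
    contrast = rng.choice(["i", "u"], n)
    lexical = rng.choice(["p_word", "t_word"], n)

    # Responses are generated from purely additive (non-interacting) biases.
    log_odds = -0.7 + 0.8 * (contrast == "i") + 0.6 * (lexical == "p_word")
    resp_p = (rng.random(n) < 1.0 / (1.0 + np.exp(-log_odds))).astype(int)

    df = pd.DataFrame({"resp_p": resp_p, "contrast": contrast, "lexical": lexical})

    # "Effectively independent" would show up as a negligible interaction term.
    fit = smf.logit("resp_p ~ contrast * lexical", data=df).fit()
    print(fit.summary())

Because the simulated responses come from purely additive biases, the fitted interaction term hovers near zero; that is the kind of pattern “effectively independent” refers to.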

Even so, we want to separate them in the stimuli by delaying the moment when the lexical bias determines what word it is, that is, by delaying the lexical uniqueness point. For example, the uniqueness point in the word rebate is the vowel following the [b] (compare rebound), and the uniqueness point in redress is likewise the sound following the [d] (compare reduce). So the listener would not know these words are rebate or redress until at least one sound later.
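
To make the notion of a uniqueness point concrete, here is a small sketch that computes it over a toy lexicon. The four-word lexicon and its rough transcriptions are invented for illustration; a real search would run over a full pronouncing dictionary such as CMUdict.

    def uniqueness_point(word, lexicon):
        """Return the 1-based index of the first segment at which the word's
        transcription diverges from every other word in the lexicon, or None
        if it never does (e.g., when the word is a prefix of another word)."""
        segments = lexicon[word]
        for i in range(1, len(segments) + 1):
            prefix = segments[:i]
            if not any(other[:i] == prefix for w, other in lexicon.items() if w != word):
                return i
        return None

    # A toy four-word lexicon with rough, IPA-ish transcriptions; a real
    # search would use a full pronouncing dictionary.
    lexicon = {
        "rebate":  ["r", "i", "b", "e", "t"],
        "rebound": ["r", "i", "b", "a", "u", "n", "d"],
        "redress": ["r", "i", "d", "r", "e", "s"],
        "reduce":  ["r", "i", "d", "u", "s"],
    }

    for w in ("rebate", "redress"):
        print(w, uniqueness_point(w, lexicon))  # both become unique one segment after [b]/[d]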

The [b] in rebate would contrast perceptually with [i] in the first syllable, while the [d] in redress would not. Would this contrast effect make the listener more likely to hypothesize that the next sound is [b] rather than [d]? If so, how could we test it? Right now, we’re considering a phoneme monitoring experiment, where we measure how quickly the listener responds that a “b” or “d” occurs in these words. If contrast increases the expectation of a [b], then listeners should be faster to respond “yes” to rebate and slower to respond “no” to redress when the sound they’re monitoring is [b]. The opposite effect would be expected if the preceding sound were [u] rather than [i] because then the [d] and not the [b] would contrast.

An alternative is an eyetracking experiment, where we show the two words on the screen, play one of them, and measure the probability and latency of first fixations to the two words as a function of whether the context and target contrast.

A whole host of questions come up (which is largely the reason for this post):

  1. Will this work even though the target sounds are unambiguous? One reason to be hopeful that it would is that we have eyetracking data showing contrast effects with unambiguous sounds — I’ll be posting on these at another time.
  2. Is phoneme monitoring the right task?
  3. Getting more to the heart of the problem, is the uniqueness point late enough that we’d effectively separate the lexical bias from the contrast bias?
  4. It won’t surprise you to learn that the lexicon of English is not perfectly designed for the purposes of this experiment. Among other problems, it’s hard to find: (a) equal numbers of words with all the combinations of vowel and consonant place we want (vowels: front versus back, consonants: coronal versus labial), (b) as noted, words with uniqueness points that are late enough, (c) words that contrast minimally up through the target sound and its context, (d) lists that are reasonably well balanced for lexical statistics, (e) words that our likely participants, UMass-Amherst undergraduates, are likely to know, etc., etc. The question here is: how much should any of this matter? Can’t we control these properties as best we can, while making sure we get enough items, and then include possible confounding factors in the model of the results? (A rough sketch of what such item screening might look like follows this list.)
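
To give that last question some shape, here is a purely illustrative sketch of screening candidate items against a few of these constraints. The feature sets, field names, and thresholds are placeholders of my own, not values from our actual stimulus lists.

    # Illustrative screening of candidate items; every feature set, field
    # name, and threshold here is a placeholder, not a value from the study.
    FRONT_VOWELS = {"i", "e"}
    BACK_VOWELS = {"u", "o"}
    LABIALS = {"p", "b"}
    CORONALS = {"t", "d"}

    def screen(candidates, min_uniqueness=4, max_log_freq_diff=0.5):
        """Keep candidates whose vowel and target consonant fall in the design
        cells, whose uniqueness point is late enough, and whose log frequency
        roughly matches that of their competitor."""
        kept = []
        for c in candidates:
            in_design = (c["vowel"] in FRONT_VOWELS | BACK_VOWELS
                         and c["target"] in LABIALS | CORONALS)
            late_enough = c["uniqueness_point"] >= min_uniqueness
            freq_matched = abs(c["log_freq"] - c["competitor_log_freq"]) <= max_log_freq_diff
            if in_design and late_enough and freq_matched:
                kept.append(c)
        return kept

    candidates = [
        {"word": "rebate", "vowel": "i", "target": "b",
         "uniqueness_point": 4, "log_freq": 1.2, "competitor_log_freq": 1.4},
        {"word": "redress", "vowel": "i", "target": "d",
         "uniqueness_point": 4, "log_freq": 0.9, "competitor_log_freq": 1.1},
    ]
    print([c["word"] for c in screen(candidates)])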