Lexical propensities in phonology: corpus and experimental evidence, grammar, and learning
Jesse Zymet
direct link: http://ling.auf.net/lingbuzz/004346
December 2018
Traditional theories of phonological variation propose that morphemes be encoded with descriptors such as [+/- Rule X] to capture which of them participate in a variable process. More recent theories predict that morphemes can have LEXICAL PROPENSITIES: idiosyncratic, gradient rates at which they participate in a process (e.g., [0.7 Rule X]). This dissertation argues that such propensities exist, and that a binary distinction is not rich enough to characterize participation in variable processes. Corpus investigations into Slovenian palatalization and French liaison reveal that individual morphemes pattern across the entire propensity spectrum, and that encoding individual morphemes with gradient status improves model performance. Furthermore, an experimental investigation into French speakers’ intuitions suggests that they internalize word-specific propensities to undergo liaison.

The dissertation then turns to modeling language learners’ ability to acquire the idiosyncratic behavior of individual attested morphemes while frequency-matching to statistical generalizations across the lexicon. A recent model based in Maximum Entropy Harmonic Grammar (MaxEnt) makes use of general constraints that putatively capture statistical generalizations across the lexicon, as well as lexical constraints governing the behavior of individual words. A series of learning simulations reveals that this approach fails to learn the lexicon-wide generalizations: the lexical constraints are so powerful that the learner comes to acquire the behavior of each attested form using only these constraints, at which point the general constraint is rendered ineffective. A GENERALITY BIAS is therefore attributed to learners, whereby they privilege general constraints over lexical ones. It is argued that MaxEnt in its current formulation fails to represent this bias, and that it should be replaced with the hierarchical MIXED-EFFECTS LOGISTIC REGRESSION MODEL (MIXED-EFFECTS MAXENT), which encodes general constraints as fixed effects and lexical constraints as a random effect and is shown to succeed in learning both a frequency-matching grammar and lexical propensities. The learner thus treats the grammar and the lexicon differently: vocabulary effects are subordinated to broad grammatical effects in the learning process. For further developments, see my webpage: http://linguistics.berkeley.edu/~jzymet/
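As a concrete illustration of the proposal (a minimal sketch, not the dissertation’s implementation), the Python snippet below fits a mixed-effects MaxEnt grammar as a MAP-penalized logistic regression: one general constraint enters as a fixed effect, and per-word lexical constraints enter as Gaussian-penalized random intercepts. All data and names here are invented for the example. Because the general constraint’s violation profile is a word-level property in this toy setup, unpenalized lexical constraints could fit every attested word on their own, which is the failure mode attributed to plain MaxEnt; the Gaussian prior plays the role of the generality bias, shunting shared variance into the general weight and leaving only residual idiosyncrasy in the lexical terms.

```python
# Minimal sketch: mixed-effects MaxEnt as MAP-penalized logistic regression.
# Fixed effect beta = weight of one general constraint; u = per-word lexical
# intercepts (the random effect), shrunk toward zero by a Gaussian prior.
# Toy data invented for illustration only.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_words, n_tokens = 20, 50
word = np.repeat(np.arange(n_words), n_tokens)   # token -> word index

# The general constraint's violations are a word-level property, so
# unpenalized lexical constraints could absorb them entirely -- the
# failure mode the dissertation attributes to plain MaxEnt.
x_word = rng.normal(size=n_words)                # general-constraint violations
true_u = rng.normal(scale=0.5, size=n_words)     # true lexical propensities
eta_true = 1.5 * x_word[word] + true_u[word]
y = rng.random(n_words * n_tokens) < 1.0 / (1.0 + np.exp(-eta_true))

def objective(params, sigma2=1.0):
    """Negative penalized log-likelihood: Bernoulli (logistic/MaxEnt) fit
    plus a Gaussian prior on lexical intercepts (the 'generality bias')."""
    beta, u = params[0], params[1:]
    eta = beta * x_word[word] + u[word]
    log_lik = np.sum(y * eta - np.logaddexp(0.0, eta))
    return -log_lik + np.sum(u ** 2) / (2.0 * sigma2)

fit = minimize(objective, np.zeros(1 + n_words), method="L-BFGS-B")
print("general-constraint weight:", round(fit.x[0], 2))  # near the true 1.5
print("lexical propensities (first 5):", np.round(fit.x[1:6], 2))
```

A full mixed-effects fit would integrate over the random effect rather than take MAP point estimates; the penalized fit above is simply the most self-contained approximation, and it already shows the key behavior: shared variance is learned by the general constraint, while the lexical intercepts retain only word-specific residual propensities.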

Format: [ pdf ]
Reference: lingbuzz/004346
(please use that when you cite this article)
Published in: UCLA Dissertation
keywords: variation, lexical propensities, maximum entropy harmonic grammar, hierarchical models, mixed-effects logistic regression, french, slovenian, morphology, phonology