Rich Lexicons and Restrictive Grammars – Maximum Likelihood Learning in Optimality Theory

Jarosz, Gaja. 2006. Rich Lexicons and Restrictive Grammars – Maximum Likelihood Learning in Optimality Theory. PhD dissertation, Johns Hopkins University. Rutgers Optimality Archive #884.


This dissertation undertakes the full formal problem of phonological learning – the learning of phonological lexicons and restrictive phonological grammars that assign hidden structure given only overt (unstructured) phonological forms with associated morphemes (e.g. <[dz], DOG+PLURAL>). A major challenge is the learning of grammars that are simultaneously restrictive and have generalizing capacity, two contradictory requirements. The proposed solution, Maximum Likelihood Learning of Lexicons and Grammars (MLG), combines a probabilistic formulation of Optimality Theory (Prince and Smolensky 1993/2004) with statistical learning via likelihood maximization.

Chapter 2 introduces the proposed theory of phonological learning, whose central premise is that the correct grammar and lexicon combination makes the overt forms most likely, given richness of the base. The generalizing capacity of grammars is attributed to formal linguistic theory, in particular to implicational markedness universals. The identification of restrictive grammars is the consequence of maximum likelihood learning in conjunction with explicit reliance on richness of the base.
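The central premise can be written as a likelihood objective. The following is only a sketch under assumed notation, none of which is taken from the dissertation: G is a grammar, L a lexicon, each datum pairs an overt form o_i with its morphemes m_i, u ranges over underlying forms, s over full structural descriptions, and overt(s) is the overt part of s.

```latex
% Assumed notation for illustration; not the dissertation's own formulation.
(\hat{G}, \hat{L}) \;=\; \arg\max_{G,\,L} \; \prod_{i=1}^{N} P(o_i \mid m_i;\, G, L),
\qquad
P(o \mid m;\, G, L) \;=\; \sum_{u} P(u \mid m;\, L) \sum_{s \,:\, \mathrm{overt}(s) = o} P(s \mid u;\, G)
```

On this reading, hidden structure is summed out, and a grammar that reserves probability for overt forms never observed assigns a lower likelihood to the data than a more restrictive one.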

Chapter 3 proposes EMGL, a possible algorithm for learning within MLG. EMGL is a variant of the well-known Expectation-Maximization algorithm. This procedure is demonstrated to successfully learn general, restrictive grammars and correct lexicons in a variety of artificial language systems with different kinds of hidden structure: syllable structure, yer vowels, voicing neutralization, and free variation.
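As a rough illustration of how an Expectation-Maximization procedure can favor a restrictive grammar, the sketch below runs plain EM on a toy voicing-neutralization problem. It is not EMGL itself. The two candidate grammars, the uniform rich base, and the names GRAMMARS, RICH_BASE, overt_prob, and em are all hypothetical, and treating the probabilistic grammar as a weighting over a fixed set of candidate grammars is an assumption made for this example.

```python
# Toy EM sketch (hypothetical; not the dissertation's EMGL algorithm).
# Each candidate grammar maps underlying forms to overt forms deterministically.
GRAMMARS = {
    "faithful": {"/t/": "[t]", "/d/": "[d]"},   # voicing contrast preserved
    "devoicing": {"/t/": "[t]", "/d/": "[t]"},  # voicing neutralized
}

# Assumed rich base: every underlying form is equally likely as an input.
RICH_BASE = {"/t/": 0.5, "/d/": 0.5}


def overt_prob(grammar, overt):
    """Probability that `grammar` yields `overt` when fed the rich base."""
    mapping = GRAMMARS[grammar]
    return sum(p for ur, p in RICH_BASE.items() if mapping[ur] == overt)


def em(observed_counts, iterations=200):
    """Estimate weights over candidate grammars from overt-form counts."""
    weights = {g: 1.0 / len(GRAMMARS) for g in GRAMMARS}
    total = sum(observed_counts.values())
    for _ in range(iterations):
        expected = {g: 0.0 for g in GRAMMARS}
        # E-step: share each observation among the grammars that can produce it.
        for overt, count in observed_counts.items():
            joint = {g: weights[g] * overt_prob(g, overt) for g in GRAMMARS}
            norm = sum(joint.values())
            if norm == 0.0:
                continue  # no candidate grammar generates this overt form
            for g in GRAMMARS:
                expected[g] += count * joint[g] / norm
        # M-step: renormalize the expected counts into new weights.
        weights = {g: expected[g] / total for g in GRAMMARS}
    return weights


if __name__ == "__main__":
    print(em({"[t]": 10}))            # only voiceless forms observed
    print(em({"[t]": 5, "[d]": 5}))   # voicing contrast observed
```

When only [t] is observed, the weight moves toward the devoicing grammar, because the faithful grammar spends half of its probability on the unobserved [d]. When both [t] and [d] are observed, the devoicing grammar cannot account for [d], so the weight moves toward the faithful grammar.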

While the ability of MLG to identify restrictive grammars is due to reliance on likelihood maximization, its ability to identify grammars with generalizing capacity depends on the formal linguistic system it incorporates. Chapter 4 focuses on typological variation in the domain of syllable structure, extending the formal linguistic system to account for apparent implicational markedness inconsistencies in this domain. The novel theory of syllable structure and sonority restrictions, Headed Feature Domain Syllable Theory, is applied to Polish, accounting for a variety of complex sonority restrictions and building a foundation for the modeling of the acquisition of syllable structure.

The predictions of MLG for the process of acquisition are discussed in Chapter 5. The proposed learning theory builds a foundation for the computational modeling of child phonological acquisition, accounting for the end states of two stages of acquisition, phonotactic learning and morphophonemic learning, as well as the gradual transition between these stages. Markedness is predicted to be the primary constraint on possible acquisition paths, with frequency playing a secondary role. The thesis presents stage-by-stage predictions for the acquisition of Polish onsets of varying sonority and complexity, based on the observed frequencies of those onsets.
