Jill Thorston of Northeastern University will be giving a talk titled “The development of intonation and information structure in early speech perception and production” this Tuesday, 24 March at 5:30 pm in Herter Hall room 217. The abstract follows.
“The development of intonation and information structure in early speech perception and production”
Infants are born with sensitivities to their native language’s melody and rhythm. This attunement to prosody affects language development over the first years of life, and impacts early attentional processing, word learning, and speech production. The motivation for the first line of research is to investigate how American English-acquiring toddlers are guided by the mapping between intonation and information structure during on-line reference resolution and novel word learning. Specifically, I ask how specific pitch movements (deaccented/H*/L+H*) systematically predict patterns of attention and subsequent novel word learning abilities depending on the referring or learning condition (new/given/contrastive). Results show that the presence of either newness or a pitch accent facilitates attention, and that toddlers learn better from more prominent learning conditions. A second line of research examines the phonological and phonetic realizations of information categories as produced by toddler and adult speakers of English. During a spontaneous speech task designed as an interactive game, a set of target nouns are labeled and analyzed as new, given, or contrastive. Results reveal that toddlers reflect adult phonological patterns for new and contrastive information, as well as demonstrate a sophisticated usage of the acoustic correlates of intonation. Together, this set of studies demonstrates how higher-level components combine to direct attention to a referent in discourse and how this process helps explain mechanisms that are important for novel word learning and early speech production.
Vincent Homer and Rajesh Bhatt of the UMass Linguistics department will be giving a talk titled “Move Something” this Friday, 27 March at 3:30pm in ILC N400 (abstract below). All are welcome.
Typically, PPIs cannot be interpreted in the scope of a clausemate negation (barring shielding and rescuing). This means that when a given PPI is such that its scope is uniquely determined by its surface position, as is the case with e.g. would rather, the effect of putting it under a clausemate negation is plain ungrammaticality. With indefinites, such as some, things are different: they can appear in that same configuration, provided that they are interpreted with wide scope over negation, which, in their case, is an available option.
In fact, indefinites are independently known to be able to take free wide scope: it is thus a priori possible that a mechanism whereby indefinite PPIs escape out of anti-licensing environments is the same that gives them wide scope out of syntactic islands, i.e. they can be interpreted by choice functions. In this talk, we address the question of the nature of the mechanism at play when, for polarity purposes, elements take wider scope than where they appear on the surface. We present arguments from Hindi-Urdu that, when a PPI surfaces in an anti-licensing environment, the wide scope mechanism that salvages it is movement (overt in Hindi-Urdu), not existential closure of a function-variable.
Gaja Jarosz of Yale University will be giving a job talk in the Linguistics department this Friday, 13 March at 3:30pm in ILC N400. Her talk is titled “Sonority Sequencing Effects in Polish: Defying the Stimulus?” (abstract below). All are welcome.
Sonority Sequencing Effects in Polish: Defying the Stimulus?
The Sonority Sequencing Principle (SSP: Steriade 1982; Selkirk 1984; Clements 1988, 1992) states that syllables with a sonority rise in the transition from the onset to the nucleus are preferred cross-linguistically. Experimental evidence indicates that English speakers exhibit gradient sensitivity to the SSP for onset clusters that are not attested in English (Davidson 2006, 2007; Berent et al. 2007, 2009; Daland et al. 2011). Berent et al. (2007, 2009) show that several lexical statistics of English fail to predict these preferences and suggest that the principle may therefore be innate. However, Daland et al. (2011) show that computational models with the ability to form abstract generalizations on the basis of phonological features and phonological context can detect SSP preferences on the basis of English lexical statistics. In this talk, I explore this controversy using computational and developmental approaches in a language (Polish) with very different sonority sequencing patterns from English. Using computational modeling, I show that a) the lexical statistics of Polish contradict the SSP, b) computational models applied to input estimated from Polish child-directed speech predict reverse-SSP preferences, and c) computational models that encode the SSP straightforwardly predict earlier acquisition of clusters with higher sonority rises. Thus, Polish provides a rare example where predictions of input-based models, even phonologically sophisticated ones, diverge dramatically from predictions expected on the basis of universal principles. I test these predictions by examining the acquisition of onset clusters in Polish. The data come from the spontaneous speech of four typically-developing, monolingual, Polish children aged 1;7-2;6 in the Weist-Jarosz Corpus (Weist and Witkowska-Stadnik 1986; Weist et al. 1984; Jarosz 2010; Jarosz et al. submitted). 
In conflict with the input-based predictions, the acquisition analyses indicate that development is significantly and gradiently sensitive to the SSP. I discuss the implications for phonological theory.
Marek Petrik of IBM’s T.J. Watson Research Center will be speaking in the Machine Learning and Friends lunch this Thursday, 12 March at 12:00pm in CS 150. His talk is titled “Better Solutions From Inaccurate Models” (abstract below).
Better Solutions From Inaccurate Models
It is very important in many application domains to compute good solutions from inaccurate models. Models in machine learning are inaccurate because they both simplify reality and are based on imperfect data. Robust optimization has emerged as a very powerful methodology for reducing solution sensitivity to model errors. In the first part of the talk, I will describe how robust optimization can mitigate data limitations in planning a large-scale disaster recovery operation. In the second part of the talk, I will discuss a novel use of robustness to substantially reduce error due to model simplification in reinforcement learning and large-scale regression.
Bishan Yang, a PhD candidate from Cornell University, will be speaking at the Machine Learning and Friends lunch on Tuesday, 10 March at 12:00pm in CS 150. Her talk is titled “Exploiting Relational Knowledge for Extraction of Opinions and Events in Text” (abstract below).
Exploiting Relational Knowledge For Extraction Of Opinions And Events In Text
The richness and diversity of natural language make automatic extraction of opinions and events from texts difficult. An automatic system designed for this task would need to identify complex linguistic expressions, interpret their meanings in context, and integrate information that is often distributed over long distances. While machine learning techniques have been widely applied for information extraction, they often make strong independence assumptions about linguistic structure and make decisions myopically based on local and partial information in the text. In this talk, I argue that accurate information extraction needs machine learning algorithms that can exploit relationships within and across multiple levels (between words, phrases and sentences), facilitating globally-informed decisions.
In the first part of my talk, I will introduce the task of fine-grained opinion extraction — discovering opinions, their sources, targets and sentiment from text. I will present a joint inference approach that can account for the dependencies among different opinion entities and relations, and a context-aware learning approach that is capable of exploiting intra- and inter-sentential discourse relations for improving sentiment prediction. In the second part of my talk, I will present my recent work on event extraction and event coreference resolution — the task of extracting event mentions and integrating them within and across documents by exploiting context. I propose a novel Bayesian model that allows generative modeling of event mentions, while simultaneously accounting for event-specific similarity.
Eric Bakovic of UCSD will be giving a job talk titled “Ensuring the proper determination of identity: a model of possible constraints” (abstract below) this Friday, March 6 in the Linguistics department at 3:30 pm in ILC N400. All are welcome to attend.
“Ensuring the proper determination of identity: a model of possible constraints”
Some phonological patterns can be described as sufficient identity avoidance, where ‘sufficiently identical’ means ‘necessarily identical with respect to all but some specific feature(s)’. The first part of the talk addresses this question: why are specific features ignored for the purposes of determining sufficient identity? In previous work (Bakovic 2005, Bakovic & Kilpatrick 2006, Pajak & Bakovic 2010, Brooks et al. 2013ab), we have found that patterns of sufficient identity avoidance where a specific feature F is ignored also involve F-assimilation in the same contexts. Direct reference to sufficient identity is thus unnecessary: sufficient identity is indirectly avoided because F-assimilation would otherwise be expected, resulting in total identity. Avoiding sufficient identity without assimilation is the better option, as predicted by the minimal violation property of Optimality Theory. This analysis predicts rather than stipulates the features that will be ignored for the purposes of determining sufficient identity. (Several corollary consequences of the analysis will also be discussed in the talk.) The explanatory value of the analysis, however, is predicated on the absolute non-existence of constraints directly penalizing all-but-F identity, which could be active independently of F-assimilation. The second part of the talk addresses this question: how can such constraints be ruled out formally? I propose a deterministic model of constraint construction and evaluation that results in just the types of constraints necessary for the analysis above. More broadly, the proposed model is intended as a contribution to our formal understanding of what a ‘possible constraint’ is.
Karthik Raman, a PhD student at Cornell University working with Prof. Thorsten Joachims, will be speaking at the Machine Learning and Friends lunch this Thursday, March 5 at 12:30 pm in CS 150. His talk is titled “Man + Machine: Machine Learning with Humans-in-the-loop” (abstract below).
“Man + Machine: Machine Learning with Humans-in-the-loop”
Intelligent systems, ranging from internet search engines and online retailers to personal robots and MOOCs, live in a symbiotic relationship with their users – or at least they should. On the one hand, users greatly benefit from the services provided by these systems. On the other hand, these systems can greatly benefit from the world knowledge that users communicate through their interactions with the system. These interactions — queries, clicks, votes, purchases, answers, demonstrations, etc. — provide enormous potential for economically and autonomously optimizing these systems and for gaining unprecedented amounts of world knowledge required to solve some of the hardest AI problems.
In this talk I discuss the challenges of learning from data that results from human behavior. I will present new machine learning models and algorithms that explicitly account for the human decision making process and factors underlying it such as human expertise, skills and needs. The talk will also explore how we can look to optimize human interactions to build robust learning systems with provable performance guarantees. I will also present examples, from the domains of search, recommendation and educational analytics, where we have successfully deployed systems for cost-effectively learning with humans in the loop.
Kie Zuraw of UCLA will be giving a job talk titled “Polarized Variation” (abstract below) in the Linguistics department on Friday, 20 February at 3:30 pm in ILC N400. All are welcome to attend.
The normal distribution (the bell curve) is common in all kinds of data, and is often expected when the quantity being measured results from multiple independent factors. The distribution of phonologically varying words, however, is sharply non-normal in the cases examined in this talk (from English, French, Hungarian, Tagalog, and Samoan). Instead of most words’ showing some medial rate of variation (say, 50% of a word’s tokens are regular and 50% irregular), with smaller numbers of words having extreme behavior, words cluster at the extremes of behavior; that is, a histogram of exceptionality rates is shaped like a U (or sometimes J) rather than a bell. The U shape cannot be accounted for by positing a binary distinction with some amount of noise over tokens, because some items (though the minority) clearly are variable, even speaker-internally. In some cases (e.g., French “aspirated” words) there is a diachronic explanation: sound change caused some words to become exceptional, so that the starting point for today’s situation was already U-shaped. But in other cases, such an explanation is not available, and items seem to be attracted towards extreme behavior.
Two mechanisms for deriving U-shaped distributions will be presented, with some speculation as to why some distributions of variation are U-shaped and others bell-shaped.
Gillian Gallagher of NYU will be giving a job talk titled “Natural Classes in Phonotactic Learning” (abstract below) in the Linguistics department on Friday, 20 February at 3:30 pm in ILC N400. All are welcome to attend.
Natural classes in phonotactic learning
The core representational unit in phonology is the feature, used to
define contrasts between sound categories (/i/ and /e/ are
distinguished by [±high]) and to pick out classes of sounds that
pattern together in the phonology ([+high] vowels may be restricted
from final position in some languages). Traditionally, phonological
features are thought to bear a direct relation to phonetic properties
(Jakobson, Fant & Halle 1952; Chomsky & Halle 1968). Under more recent
proposals, though, features are labels for phonologically active
classes that may bear a loose or no relation to the phonetics of the
sounds in question (Mielke 2008). In this talk, I present evidence
that phonetics plays a direct role in the natural classes used in the
The cooccurrence phonotactics of Quechua provide evidence for natural
classes grouping aspirated stops with the glottal fricative [h], and
grouping ejective stops with the glottal stop [ʔ]. In addition to
being phonologically active, both of these classes are phonetically
definable based on articulatory properties of the glottis: [spread
glottis] picks out aspirates and [h], [constricted glottis] picks out
ejectives and [ʔ]. Despite the phonological and phonetic support, two
nonce word tasks fail to find evidence for these natural classes in
speakers’ grammars. Instead, aspirate and ejective stops seem to be
targeted by the phonotactics to the exclusion of their glottal
counterparts. It is proposed that the preference for these smaller
classes of laryngeally marked stops is phonetically based, deriving
from the salience of the acoustic properties unique to stops.
Ari Kobren of the UMass CS department will be speaking in the Machine Learning and Friends lunch in CS 150 on Thursday, 19 February at 12:30 pm. Everyone is welcome.