The next cognitive brown bag speaker (3/27, 12:00, Tobin 521B) is Patrick Sadil (https://www.umass.edu/pbs/people/patrick-sadil). Title and abstract are below. All are welcome.
A (largely) hierarchical Bayesian model for inferring neural tuning functions from voxel tuning functions
Inferring neural properties from the hemodynamic signal provided by fMRI remains challenging. It is tempting to simply assume that the dynamics of individual neurons or neural subpopulations are reflected in the hemodynamic signal, and, in apparent support of this assumption, important features of neural activity — such as the ‘tuning’ to different stimulus features (e.g., the pattern of activity in response to different orientations, colors, or motion) — are observable in fMRI. However, fMRI measures the aggregated activity of a heterogeneous collection of neural subpopulations, and this aggregated activity may mislead inferences about the behavior of each individual subpopulation. In particular, extant analysis methods can lead to erroneous conclusions about how neural tuning functions are altered by interactions between stimulus features (e.g., changes in the contrast of a stimulus), or between the tuning curves and different cognitive states (e.g., with or without attention). I will present a statistical modeling approach that attempts to remove these limitations. The approach is validated by using it to infer alterations to neural tuning curves from fMRI data, in a circumstance where the ground truth of the alteration has been provided by electrophysiology.
who: Ishan Misra (Facebook AI Research, NY)
when: 03/28 (Thursday) 11:45a – 1:15p
where: Computer Science Building Rm 150
food: Athena’s Pizza
“Scaling Self-supervised Visual Representation Learning”
Abstract: Self-supervised learning aims to learn representations from the data itself without explicit manual supervision. Existing efforts ignore a crucial aspect of self-supervised learning – the ability to scale to large amounts of data because self-supervision requires no manual labels. In this work, we revisit this principle and scale two popular self-supervised approaches to 100 million images. Scaling these methods also provides many interesting insights into the limitations of current self-supervised techniques and evaluations. We conclude that current self-supervised methods are not complex enough to take full advantage of large-scale data and do not seem to learn effective high-level semantic representations. Finally, we show how scaling current self-supervised methods provides state-of-the-art results that sometimes match or surpass supervised representations on tasks such as object detection, surface normal estimation, and visual navigation.
Bio: Ishan is a Research Scientist at Facebook AI Research. He graduated from Carnegie Mellon University, where his PhD thesis was titled “Visual Learning with Minimal Human Supervision” and received the runner-up SCS Distinguished Dissertation Award. This work was about learning recognition models with minimal supervision by exploring structure and biases in the labels (multi-task learning), classifiers (meta-learning), and data (self-supervision). His current research interests are in self-supervised approaches, understanding vision-and-language models, and compositional models for small-sample learning.
Website – http://imisra.github.io/
The cognitive brown bag speaker on Wednesday, March 20 will be Mohit Iyyer of UMass Computer Science (https://people.cs.umass.edu/~miyyer/). Title and abstract are below. As always, the talk is in Tobin 521B at 12:00. All are welcome.
Title: Towards Understanding Narratives with Artificial Intelligence
One of the fundamental goals of artificial intelligence is to build computers that understand language at a human level. Recent progress towards this goal has been fueled by deep learning, which represents words, sentences, and even documents with learned vectors of real-valued numbers. However, creative language—the sort found in novels, film, and comics—poses an immense challenge for such models because it contains a wide range of linguistic phenomena, from phrasal and sentential syntactic complexity to high-level discourse structures such as narrative and character arcs. In this talk, I discuss our recent work on applying deep learning to creative language understanding, as well as the challenges that must be solved before further progress can be made. I begin with a general overview of deep learning before presenting model architectures for two tasks involving creative language understanding: 1) modeling dynamic relationships between fictional characters in novels, and 2) predicting dialogue and artwork from comic book panels. For both tasks, our models only achieve a surface-level understanding, limited by a lack of world knowledge, an inability to perform commonsense reasoning, and a reliance on huge amounts of data. I conclude by proposing ideas on how to push these models to produce deeper insights from creative language that might be of use to humanities researchers.
Rebecca Morley (OSU) will present a colloquium in Linguistics on Friday, March 8th at 3:30 in ILC N400. All are welcome!
Title: Phonological contrast as an evolving function of local predictability
In this talk I conceptualize phoneme identification as the result of a phonological parse that maps acoustic input to a series of discrete abstract structures. As has been proposed for syntactic processing, the phonological parse is built up incrementally as the speech signal is received, and the highest-probability parse available is selected at each point. As the parse proceeds, listener expectations develop regarding future input. If those expectations fail to be met, the phonological parser can be “garden-pathed” just as a syntactic parse would be. The primary difference between the two domains is that the input to the syntactic parser is typically assumed to consist of already segmented sequences of words, and the induction problem is one of determining the hierarchical groupings among those words. The input to the phonological parser, on the other hand, will be assumed to consist of a stream of continuously valued acoustic cues, and the induction problem to be literal segmentation: attributing perceived cues to sequentially ordered discrete segments.
This proposal is illustrated through a re-analysis of the well-known, and well-researched, phenomenon of vowel lengthening in American English. I will argue that no actual lengthening of vowels before voiced obstruents occurs (nor shortening before voiceless obstruents), but that the effect is an epiphenomenon of speaking rate and prosodic lengthening. I take the results of production experiments to argue for an underlying specification of /short/ for English “voiced” obstruents. And I show that the categorical perception results (in which vowel duration is found to be a sufficient cue to “voicing” on word-final obstruents) can be derived from general properties of the proposed phonological parser. The implications for theories of contrast, diagnostics of contrastive features, and theories of sound change will be discussed.
The next cognitive brown bag is Weds. 3/5 at 12:00 in Tobin 521B. The speaker is Jon Burnsky (UMass PBS); title and abstract are below.
What does it mean to predict a word and what can predictions tell us?
I will present data from three experiments investigating prediction in language comprehension. First, I will discuss an eyetracking experiment providing suggestive (though inconclusive) evidence that predicted words that are not encountered are activated similarly to words that are actually encountered. Then, I will discuss two experiments using the cloze task in which comprehenders’ predictions are used as a tool to probe their syntactic or thematic representations of complex sentences. The results suggest that non-veridical representations are computed online when doing so yields a more plausible interpretation of the sentence, perhaps by way of Bayesian inference on the part of the comprehender.
Steven Foley (UCSC) will present “Why are ergatives hard to process? Reading-time evidence from Georgian” in ILC N400 at 3:30. All are welcome!
ABSTRACT: How easily a filler–gap dependency is processed can depend on the syntactic position of its gap: in many languages, for example, subject-gap relative clauses are generally easier to process than object-gap relatives (Kwon et al. 2013). One possible explanation for this is that certain syntactic positions might be intrinsically more accessible for extraction than others (Keenan & Comrie 1977). Alternatively, processing difficulty might correlate with the relative informativity of morphosyntactic cues (e.g., case) ambient to the gap (Polinsky et al. 2012; cf. Hale 2006). Ergative languages are ideal for disentangling these two theories, since they decouple case morphology (ergative ~ absolutive) and syntactic role (subject ~ object). This talk presents reading-time data from Georgian, a split-ergative language, which suggest that case may indeed be a crucial factor affecting real-time comprehension. Across four self-paced reading experiments, ergative DPs in different configurations are read consistently slower than absolutive ones — bearing out the predictions of the informativity hypothesis. However, the case is not closed: accusative morphology, at least in Japanese and Korean, does not seem to be associated with a processing cost, even though it is just as informative as ergative is. To reconcile this ergative–accusative processing asymmetry, I turn to the debate in formal syntax between different modalities of case assignment, and argue that a theory in which case is assigned by functional heads (Chomsky 2000, 2001) gives us better traction for understanding both Georgian-internal and crosslinguistic processing data than does a configurational theory of case (Marantz 1991).