Coral Hughto has been working since September as a Data Engineer at Assurance, a company in Seattle that provides a platform for buying insurance. Congratulations, Coral!
Linzen colloquium Friday April 17 at 3:30
Tal Linzen, Johns Hopkins University, will present “What inductive biases enable human-like syntactic generalization?” in the Linguistics colloquium series at 3:30 Friday April 17. An abstract follows. All are welcome! The Zoom link has already been sent out on department mailing lists. If you did not receive it and would like to attend, please email Brian Dillon for the link.
Anderson to Wellesley
Congratulations are in order for Carolyn Anderson, who has accepted a tenure-track assistant professor position at Wellesley College in the Department of Computer Science. She’ll begin in the Fall of 2021. Best of luck at Wellesley, Carolyn! We’re proud of you and we look forward to seeing what you get up to out there.
Jarosz in Warsaw
On February 6th, Gaja Jarosz gave an invited talk at the Old World Conference in Phonology in her hometown of Warsaw, Poland. The talk, titled “Embracing Ambiguity: Quantitative Modeling in Phonology”, was live-streamed on YouTube and also made available on StreamGram.
Prickett in Phonology
Brandon Prickett has just published “Learning biases in opaque interactions” in the latest issue of Phonology. Congratulations, Brandon!
This study uses an artificial language learning experiment and computational modelling to test Kiparsky’s claims about Maximal Utilisation and Transparency biases in phonological acquisition. A Maximal Utilisation bias would prefer phonological patterns in which all rules are maximally utilised, and a Transparency bias would prefer patterns that are not opaque. Results from the experiment suggest that these biases affect the learnability of specific parts of a language, with Maximal Utilisation affecting the acquisition of individual rules, and Transparency affecting the acquisition of rule orderings. Two models were used to simulate the experiment: an expectation-driven Harmonic Serialism learner and a sequence-to-sequence neural network. The results from these simulations show that both models’ learning is affected by these biases, suggesting that the biases emerge from the learning process rather than any explicit structure built into the model.
UMass at RecPhon 2019
Many UMass folks past and present were at RecPhon 2019: Recursivity in phonology below and above the word, 21-22 November 2019, Universitat Autònoma de Barcelona, Bellaterra. A number of former UMass visitors were co-organizers: Eulàlia Bonet, Joan Mascaró, Francesc Torres-Tamarit.
Invited speakers and UMass alumni Junko Ito and Armin Mester presented Recursivity in phonology below the word, while invited speaker and UMass alumna Emily Elfner presented Match Theory and Recursion below and above the word: Evidence from Tlingit. Faculty member Kristine Yu presented Computational perspectives on phonological constituency and recursion and graduate student Leland Kusmer presented Minimal prosodic recursion in Khoekhoegowab. Former visitor Gorka Elordieta presented joint work with emeritus faculty member Lisa Selkirk: Phrasing unaccented words in a recursive prosodic structure in Basque.
SENSUS at UMass, April 18-19, 2020
UMass is hosting “Sensus: Constructing meaning in Romance” on April 18-19, 2020. This is a conference on the formal semantics and pragmatics of Romance languages.
Areas: theoretical semantics and pragmatics and their interfaces with other domains, experimental methodologies, fieldwork, the study of variation and computational approaches
Venue: Integrative Learning Center at UMass Amherst (the ILC is a fully accessible building)
(UC, Santa Cruz)
Organizers: Ana Arregui, María Biezma, Vincent Homer and Deniz Özyıldız
Event sponsored by the Department of Linguistics and the Department of Languages, Literatures and Cultures of UMass Amherst
Contact us at email@example.com
Details can be found here: http://blogs.umass.edu/sensus/
David Smith talk, Monday Nov 18
David Smith (https://www.khoury.northeastern.edu/people/david-smith/) will present “Textual Criticism as Language Modeling: Viral Texts, Networked Authors, and Computational Models of Information Propagation” at 4 pm Monday Nov. 18th in ILC N400. An abstract is below.
This presentation is to a joint meeting of the Initiative for Data Science in the Humanities and the Data Science tea. If you have any questions, contact Joe Pater at firstname.lastname@example.org. David will be available for half-hour meetings from 1 – 3:30 in the Linguistics department – sign up here.
The era of mass digitization seems to provide a mountain of source material for scholarship, but its foundations are constantly shifting. Selective archiving and digitization obscures data provenance, metadata fails to capture the presence of texts of mutable genres and uncertain authorship embedded within the archive, and automatic optical character recognition (OCR) transcripts contain word error rates above 30% for even eighteenth-century English. The condition of the mass-digitized text is thus closer to the manuscript sources of an edition than to a scholarly publication. On the computational side, models that treat collections as sets of independent documents fail to capture the processes by which new texts are generated from existing ones.
In this talk, I will discuss several aspects of our work on “speculative bibliography” with computational methods. Starting from a simple model of the composition of historical newspaper pages, with applications to text denoising, I describe models of how texts transform their sources, applied to modern science journalism, medieval Arabic historians, and the generically hybrid forms in nineteenth-century newspapers. I conclude by discussing methods for inferring network structure and mapping information propagation among texts and publications.
This is joint work with Ryan Cordell, Rui Dong, Ansel MacLaughlin, Abby Mullen, Ryan Muther, and Shaobin Xu.
Graf colloquium Friday Nov 8 at 3:30
Thomas Graf, Stony Brook University, will present “Subregular linguistics for linguists” in the Linguistics colloquium series at 3:30 Friday Nov 8. An abstract follows. All are welcome!
Drawing from computational work that is known as the subregular program, I will argue against two received views in linguistics: “phonology and syntax are very different” and “subcategorization is a solved problem”.
- Cognitive parallelism
Subregular notions of complexity can be applied to strings as well as trees. Doing so reveals that phonology and syntax are remarkably similar (and those parallels even extend into morphology and semantics). For instance, islands and blocking effects are instances of the same computational mechanism.
Subcategorization (or c-selection) is rarely studied by linguists, but it is actually a source of tremendous overgeneration. Once again subregular notions of complexity can be used to address this problem. This isn’t just a mathematical exercise, but makes concrete empirical predictions about the nature of category systems, subcategorization, the status of empty heads, the DP-analysis, DM-style roots, and once again highlights parallels to phonology.
The general upshot is that subregular concepts, despite their computational origin, are intuitive and linguistically fertile: they address conceptual issues, bridge gaps between linguistic subfields, and make concrete empirical predictions. Subregular linguistics is just linguistics with some computational flavor sprinkled on top.
Disclaimer: This talk is 100% formula-free.
Phonology/Phonetics/Psycholinguistics Guru: Matt Goldrick
This week (October 21-25) we will have a special visitor in the department, a Phonology/Phonetics/Psycholinguistics Guru, Matt Goldrick! Matt will be visiting the department all week. He will be giving two tutorials and a general talk (see below for schedule). Everyone in the department and beyond is welcome to attend all of these events. The schedule is rather complicated so please read it carefully – all events are scheduled to take place in N400 on Monday, Tuesday, and Wednesday of next week. Both tutorials are about Gradient Symbolic Representations and involve some hands-on software applications – one is focused on Phonology and the other on Processing. The talk is intended to be a general talk for the whole department. Matt is also available for individual meetings while he is here – please contact him directly about that.
Talk – “The acoustic effects of blended representations: co-production”
Gradient Harmonic Grammar (gradient underlying representations and learning models for them)
Instructions: Bring a laptop that can access the internet; you’ll be using Google Sheets to aid in calculations of harmony for candidate sets.
Gradient Symbolic Processing (connectionist implementations of GSR and software for generation, learning, and parsing of CFGs)