From Joe Pater
I came across an interesting blog post the other day discussing the practice of posting conference papers to arXiv in NLP and machine learning before they have been reviewed. It includes some data from a poll on how people use arXiv in each discipline: machine learning people tend to post earlier in the publication cycle, perhaps due to an influential call for a new publishing model by Yann LeCun of deep learning fame, and perhaps due to a greater fear of being scooped.
This got me thinking again about archives in our discipline. I came of academic age at the time that ROA was launched, and it was fantastic as a grad student to have access to the latest research in the framework I was using, and to be able to share my own work so easily. As I’ve told Alan Prince already, we’re hugely in his debt for having established that archive, and we also owe a huge thanks to Eric Baković and others for all their work on it, as we do to Michal Starke and others at LingBuzz.
It’s clear, though, that in contrast with the situation in computer science, use of archives is on the decline in phonology. I post to them only sporadically myself, generally only making time to keep my own web page updated. In contrast to when ROA was founded, the preservation function of an archive is less necessary; most of us have archives serving this purpose at our own institutions (see e.g. John McCarthy’s ScholarWorks archive), and who knows, maybe a document hosted on Google Drive will last longer than one on a university site. I find the Google Drive alternative particularly convenient because it’s so easy to update a paper. And this brings up the main issue in my mind for posting to archives early in the publication cycle: if you have your paper in multiple places, you need to update multiple copies, each with considerably more hassle than a Google Drive document.
Preservation is only one function of these archives, and it’s far less important than another: dissemination. For dissemination, one’s own webpage, or institutional archive, is not a viable alternative. The main impetus for Phonolist was to facilitate dissemination for papers that weren’t being posted to the archives, and it seemed that the added functionality of optional blog discussion of papers would make it attractive for that purpose. I’ve been somewhat surprised to see that people haven’t been using it much for that (most of the papers we advertise are reposts from LingBuzz and ROA).
Phonolist currently lacks any indexing functionality (besides searches), and this is one way that it could be improved to better serve the cause of dissemination. Indexing will likely be an upcoming addition, along with a community BibTeX file.
The question I’d like to bring up for discussion is whether people perceive the need for a general phonology archive, and if so, what it should look like. ROA is limited to OT and its affiliates. LingBuzz has technical issues that have made it frustrating to use, and I’ve heard that it’s unlikely to be improved. My limited experience with Academia.edu and ResearchGate has been negative. I thought an easy fix might be to start using http://cogprints.org, but in response to my inquiry about it, Stevan Harnad said “CogPrints has no long-term support and I would say it’s obsolete (though I’m still keeping it up).” More generally, I’d be interested to hear people’s thoughts about how they use the existing archives, and why they don’t use them.