Did Frank Rosenblatt invent deep learning in 1962?

Deep learning (Le Cun et al. 2015: Nature) involves training neural networks with hidden layers, sometimes many levels deep. Frank Rosenblatt (1928-1971) is widely acknowledged as a pioneer in the training of neural networks, especially for his development of the perceptron update rule, a provably convergent procedure for training single layer feedforward networks. He is less widely acknowledged for his pioneering work with other network architectures, including multi-layer perceptrons, and models with connections “backwards” through the layers, as in recurrent neural nets. A colleague of Rosenblatt’s who prefers to remain anonymous points out that his “C-system” may even be a precursor to deep learning with convolutional networks (see esp. Rosenblatt 1967). Research on a range of perceptron architectures was presented in his 1962 book Principles of Neurodynamics, which was widely read by his contemporaries, and also by the next generation of neural network pioneers, who published the groundbreaking research of the 1980s. A useful concise overview of the work that Rosenblatt and his research group did can be found in Nagy (1991) (see also Tappert 2017). Useful accounts of the broader historical context can be found in Nilsson (2010) and Olazaran (1993, 1996).

In interviews, Yann Le Cun has noted the influence of Rosenblatt’s work, so I was surprised to find no citation of Rosenblatt (1962) in the Nature deep learning paper – it cites only Rosenblatt 1957, which has only single-layer nets. I was even more surprised to find perceptrons classified as single-layer architectures in Goodfellow et al.’s (2016) deep learning text (pp. 14-15, 27). Rosenblatt clearly regarded the single-layer model as just one kind of perceptron. The lack of citation for his work with multi-layer perceptrons seems to be quite widespread. Marcus’ (2012) New Yorker piece on deep learning classifies perceptrons as single-layer only, as does Wang and Raj’s (2017) history of deep learning. My reading of the current machine learning literature, and discussion with researchers in that area, suggests that the term “perceptron” is often taken to mean a single layer feedforward net.

I can think of three reasons that Rosenblatt’s work is sometimes not cited, and even miscited. The first is that Minsky and Papert’s (1969/1988) book is an analysis of single-layer perceptrons, and adopts the convention of referring to them as simply as perceptrons. The second is that the perceptron update rule is widely used under that name, and it applies only to single layer networks. The last is that Rosenblatt and his contemporaries were not very successful in their attempts at training multi-layer perceptrons. See Olazaran (1993, 1996) for in-depth discussion of the complicated and usually oversimplified history around the loss of interest in perceptrons in the later 1960s, and the subsequent development of backpropagation for the training of multilayer nets and resurgence of interest in the 1980s.

As for my question about whether Rosenblatt invented deep learning, that would depend on how one defines deep learning, and what one means by invention in this context. Tappert (2017), a student of Rosenblatt’s, makes a compelling case for naming him the father of deep learning based on an examination of the types of perceptron he was exploring, and comparison with modern practice. In the end, I’m less concerned with what we should call Rosenblatt with respect to deep learning, and more concerned with his work on multi-layer perceptrons and other architectures being cited appropriately and accurately. As an outsider to this field, I may well be making mistakes myself, and I would welcome any corrections.

Update August 25 2017: See Schmidhuber (2015) for an exhaustive technical history of Deep Learning. This is very useful, but it doesn’t look to me like he is appropriately citing Rosenblatt: see secs. 5.1 through 5.3. (as well as the refs. above, see Rosenblatt 1964 on the on the cat vision experiments).

Non-web available reference (ask me for a copy)

Olazaran, Mikel. 1993. A Sociological History of the Neural Network Controversy. Advances in Computers Vol. 37. Academic Press, Boston.

Slides

Tappert, Charles. 2017. Who is the father of deep learning? Slides from a presentation May 5th 2017 at PACE University, downloaded June 15th from the conference site.

Rosenblatt, with the image sensor of the Mark I Perceptron (Source: Arvin Calspan Advanced Technology Center; Hecht-Nielsen, R. Neurocomputing (Reading, Mass.: Addison-Wesley, 1990).)

The Mark 1 Perceptron (Source: Arvin Calspan Advanced Technology Center; Hecht-Nielsen, R. Neurocomputing (Reading, Mass.: Addison-Wesley, 1990).)

Posted in Learning
4 comments on “Did Frank Rosenblatt invent deep learning in 1962?
  1. Fred Mailhot says:

    I just skimmed about 5 pages of the e-book (which, happily, is available for downloading at http://www.dtic.mil/docs/citations/AD0256582) and would definitely call at least some of what Rosenblatt did deep learning. There are nets w/ 2 hidden layers and modifiable connections…to not call that “deep learning”, irrespective of the error-correction/training algorithm used, just seems perverse to me.

    Sidebar: it’s really only this year that I’m starting to see DL being good about citing work from the 80s and 90s.

  2. Joe Pater says:

    Right. One way Deep Learning is sometimes defined is as using a multilayer perceptron with more than one hidden layer, so even with that definition, Rosenblatt was using deep nets.

  3. Stefano Rovetta says:

    Rosenblatt invented deep network (actually, elaborated on many ideas including McCulloch and Pitts work), although the depth used in the 50s and 60s (and up to the 90s) would be considered “shallow” by todays’ standards.

    Unfortunately he did not invent deep LEARNING. His learning algorithm (the Perceptron Algorithm) only works for shallow learning, namely, 1-layer deep networks. Rosenblatt’s perceptrons were deep but only the last layer included learning.

    • Joe Pater says:

      I think this is a nice summary of the facts. In retrospect, the title of my post might be misconceived: it should have been “Why doesn’t Frank Rosenblatt get the credit he deserves”, or something like that. I should say, though, that he certainly *did* do learning with multilayer perceptrons – he just didn’t succeed in finding a good algorithm for it.

Leave a Reply

Your email address will not be published. Required fields are marked *

*