Distributional learning and the development of word class categories in English, German and Dutch

How do children learn the grammatical categories of their language? For example, how do children learn that "dog" is a Noun and "chase" is a Verb? Recent research with computer models of language learning has shown that one way of doing this is to group words together on the basis of the words that come before and after them. For example, in English, words that come after "a" and "the" and before "is" and "can" tend to be Nouns, whereas words that come after "is" and "can" and before "a" and "the" tend to be Verbs.
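The grouping idea described above can be illustrated with a minimal sketch. This is not the project's model and not MOSAIC. It is a toy example under stated assumptions: the six-sentence corpus is invented, and the function names (context_profiles, cosine) and the choice of cosine similarity are ours, chosen only for illustration. Each word is described by counts of the words that come immediately before and after it, and two words are treated as likely members of the same category when those counts look alike.

```python
from collections import Counter, defaultdict


def context_profiles(utterances):
    """Map each word to counts of its immediately preceding and following words."""
    profiles = defaultdict(Counter)
    for utterance in utterances:
        # Pad with boundary markers so first and last words also get a context.
        tokens = ["<s>"] + utterance.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            word = tokens[i]
            profiles[word][("prev", tokens[i - 1])] += 1
            profiles[word][("next", tokens[i + 1])] += 1
    return profiles


def cosine(a, b):
    """Cosine similarity between two context-count profiles."""
    dot = sum(count * b[feature] for feature, count in a.items() if feature in b)
    norm_a = sum(count * count for count in a.values()) ** 0.5
    norm_b = sum(count * count for count in b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented toy corpus (an assumption for illustration, not real child-directed speech).
corpus = [
    "the dog can chase the cat",
    "a cat can chase a dog",
    "the dog is big",
    "a cat is small",
    "the cat can see the dog",
    "a dog can see a cat",
]

profiles = context_profiles(corpus)

# Words with similar neighbours score higher than words with different neighbours.
noun_noun = cosine(profiles["dog"], profiles["cat"])
verb_verb = cosine(profiles["chase"], profiles["see"])
noun_verb = cosine(profiles["dog"], profiles["chase"])
print(f"dog~cat: {noun_noun:.2f}  chase~see: {verb_verb:.2f}  dog~chase: {noun_verb:.2f}")
```

On this toy corpus, "dog" and "cat" share most of their neighbours, as do "chase" and "see", while "dog" and "chase" share none. A sketch like this processes the whole corpus at a single point in time, which is exactly the limitation the project sets out to address.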

However, computer models that group words together in this way currently have two problems. The first is that they work better for some languages (such as English) than they do for others (such as German and Dutch). The second is that they tend to be unrealistic as explanations of human language learning because they do not learn gradually, as children do.

In this project we will develop a more child-like model of category learning that works across three different languages (English, German and Dutch). We will do this by taking ideas from recent computer models that group words together at a single point in time, and building them into a model called MOSAIC that learns language more gradually. MOSAIC takes as input child-directed speech (speech addressed to language-learning children) from several different languages, and produces as output child-like utterances that get longer as the model learns more about the input. We can therefore test the model by comparing the utterances it produces with those of children learning different languages at different points in development, and so use it to develop a more realistic explanation of the way children learn grammatical categories.

Project Team: Julian Pine (Lead), Daniel Freudenthal, Fernand Gobet, Elena Lieven and Padraic Monaghan

Start Date: July 2015

Duration: 4 years

(Work Package 12)