More than meets the eye: how children can learn verbs from what they hear, not what they see.

How children learn verbs is one of the trickiest conundrums facing researchers in language acquisition. Nouns are easy: it’s not surprising that the first object names babies learn are for the objects they see and interact with on a day-to-day basis, like shoe, bottle, and blanket. Verbs are more of a challenge, though, because verbs refer to actions, which may only be visible for a short space of time (e.g., throw) – if at all (e.g., like). To make things more confusing, as well as learning the verbs themselves, children have to learn how to use them properly. For example, we know that five-year-olds tend to make mistakes like *mummy filled milk into the bottle or *daddy poured the bath with water. The question is, how do children learn to stop making these mistakes?

Over the years various explanations have been offered. Perhaps the most influential theory, which we’ll call the what they see approach, assumes that children decide how to use verbs based on the visual scene. Specifically, they mention the most-changed thing first (Pinker, 1989). So, they would say “mummy filled the bottle with milk” when the bottle becomes completely full but the movement of the milk isn’t noticeable, and “mummy poured milk into the bottle” when the movement of the milk is more visible, and the bottle doesn’t become completely full.

However, what if the bottle isn’t transparent, or they can’t see the milk? More generally, it’s likely that the visual information children need to decide how to use these verbs isn’t consistently available, especially for rare verbs like “infuse”. Luckily there’s an alternative way of learning this – the what they hear approach (Twomey, Chang & Ambridge, 2014). This explanation only requires children to listen to how verbs are used, and then copy it. These verbs tend to be followed by certain types of noun. For example, fill is usually followed by container-type words like bucket, cup and box, and pour is followed by liquid-type words like water, paint and juice. To avoid making a mistake, children only have to learn that fill is followed by a container-type word, and pour by a liquid-type word – intriguingly, they don’t need to look at anything at all.
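As a rough illustration of the what they hear idea, the toy sketch below counts which type of noun follows each verb and then uses the more frequent type to decide word order. It is not the model from Twomey, Chang & Ambridge (2014); the word lists, the sample of "heard" sentences and the function names are all made up for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: which nouns count as container-type or liquid-type.
NOUN_TYPE = {
    "bucket": "container", "cup": "container", "box": "container", "bottle": "container",
    "water": "liquid", "paint": "liquid", "juice": "liquid", "milk": "liquid",
}

# Made-up sample of (verb, noun directly after the verb) pairs a child might hear.
HEARD = [
    ("fill", "bucket"), ("fill", "cup"), ("fill", "box"), ("fill", "bottle"),
    ("pour", "water"), ("pour", "paint"), ("pour", "juice"), ("pour", "milk"),
]

def learn_preferences(heard):
    """For each verb, return the noun type that most often follows it."""
    counts = defaultdict(Counter)
    for verb, noun in heard:
        counts[verb][NOUN_TYPE[noun]] += 1
    return {verb: c.most_common(1)[0][0] for verb, c in counts.items()}

def describe(verb, container, liquid, preferences):
    """Put the preferred noun type straight after the verb."""
    if preferences[verb] == "container":
        return f"{verb} the {container} with {liquid}"
    return f"{verb} {liquid} into the {container}"

prefs = learn_preferences(HEARD)
print(describe("fill", "bottle", "milk", prefs))
print(describe("pour", "bottle", "milk", prefs))
```

Nothing in this sketch looks at a visual scene: the word order comes entirely from counts over what was heard.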

Researchers from LuCiD tested these two theories by showing 5-year-olds, 9-year-olds and adults animations of a robot on a spaceship performing made-up actions with two objects, for example filling a cone with blobs of oil using a shooting motion. In each action, one object was changed more than the other (e.g., the cone became completely full). The experimenter also described the action using a made-up verb (e.g., “pilked”) followed by either a container-type or a liquid-type noun (e.g., the robot pilked the oil). After seeing several of these “learning” animations, participants saw a new “recap” animation with the same action but new, equally-changed objects, and were asked to describe it. If the what they see theory is correct, then participants should describe the scene by mentioning first the type of object that was most changed in the learning animations, for example the robot pilked the cone with oil. If the what they hear theory is correct, though, participants should describe the scene by mentioning first the container-type or liquid-type object which came first in the descriptions they heard, for example the robot pilked the oil into the cone.

Perhaps not surprisingly, 5-year-olds didn’t seem to be using either strategy. This might be because they needed more practice with the learning animations – after all, at this age they still make mistakes with this kind of verb. However, 9-year-olds and adults followed the what they hear strategy, placing the object that they’d heard first on the learning trials earlier in their descriptions of the scene. This suggests that older children can learn how to use these more difficult verbs by listening to how they are used by adults.

Excitingly, this is one of the first times children have been shown to be able to learn how verbs work without paying detailed attention to the visual scene. While many studies have shown that the quality and quantity of language matter for babies’ very first words, this work shows it is just as important to provide older children with plenty of opportunities to hear and use language.

 A robot pilking oil into a cone – or is it pilking the cone with oil?


Further reading: 


Pinker, S. (1989). Learnability and cognition: The acquisition of argument structure. Cambridge, Mass.: Harvard University Press.

Twomey, K. E., Chang, F., & Ambridge, B. (2014). Do as I say, not as I do: A lexical distributional account of English locative verb class acquisition. Cognitive Psychology, 73, 41–71.



We are very grateful to all the children and adults who took part – we couldn’t do it without you! This work was funded by Leverhulme Research Project Grant RPG-158 and ESRC Centre Grant ES/L008955/1.

