Cross-linguistic acquisition of complex verb inflection in a neural network model
Usage-based approaches to language learning suggest that the acquisition of inflectional morphology and the errors made by young learners are a function of the statistical properties of the input (e.g., Bybee & Moder, 1983). It has been shown that purely exposure-based computational models such as neural networks can approximate human error patterns not only in English verb and noun inflection (e.g., MacWhinney & Leinbach, 1991; Plunkett & Juola, 1999) but also in the more complex system of Serbian noun morphology (Mirković, Seidenberg, & Joanisse, 2011).
In order to test whether the acquisition of verb inflection can be simulated by a single exposure-based mechanism for two morphologically complex and dissimilar languages, we trained neural network models on the task of producing person/number-inflected verbs in Finnish (FI) and Polish (PL). We compared the simulations with experimental results from elicited-production studies, in which children aged about 50 months were shown animations and had to produce the inflected present-tense forms of 32 verbs.
Three-layer network models were presented with phonological representations of verb stems (e.g., FI: /roik:u/; PL: /r1suj/) together with a code for one target person/number context on the input layer and were trained to produce the complete inflected form on the output layer (e.g., FI: /roikut/; PL: /r1sujES/ for 2nd singular). In each language, 800 present-tense verbs (FI: 1785 forms; PL: 2419 forms) were presented probabilistically during training according to their token frequencies in child-directed speech corpora. While Finnish inflectional suffixes in verb forms are fairly regular, Polish suffixes are highly complex. On the other hand, Finnish features more complex stem alternations than Polish. By limiting the intermediate layer to 200 units, the models were nevertheless forced to generalise rather than rote-learn, relying on morphophonological subregularities to select the appropriately inflected forms from the input stems.
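The setup described above can be sketched as a minimal feedforward network with frequency-weighted item sampling. The layer sizes, the phonological coding scheme, and the token counts below are illustrative assumptions; only the 200-unit hidden layer and the frequency-proportional presentation are taken from the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions; the actual phonological feature coding used in the
# reported models is not specified here, so these sizes are assumptions.
N_STEM = 60      # phonological features of the input stem
N_CONTEXT = 6    # one-hot person/number code (1sg ... 3pl)
N_HIDDEN = 200   # intermediate layer size, as in the reported models
N_OUTPUT = 80    # phonological features of the inflected output form

W1 = rng.normal(0.0, 0.1, (N_HIDDEN, N_STEM + N_CONTEXT))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(0.0, 0.1, (N_OUTPUT, N_HIDDEN))
b2 = np.zeros(N_OUTPUT)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(stem, context):
    """Map a stem representation plus a person/number code to an output form."""
    x = np.concatenate([stem, context])
    h = sigmoid(W1 @ x + b1)          # hidden layer
    return sigmoid(W2 @ h + b2)       # predicted phonological output

# Frequency-weighted presentation: items are drawn in proportion to their
# token frequency, mirroring child-directed speech statistics.
token_freqs = np.array([120.0, 30.0, 5.0])   # hypothetical counts
p = token_freqs / token_freqs.sum()
item = rng.choice(len(token_freqs), p=p)
```

Training such a network with backpropagation on the frequency-weighted stream would then favour high-frequency forms, as reported for the models.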
The models could correctly inflect over 99% of the training tokens after seeing 250,000 (FI) and 500,000 (PL) examples and correctly generalised to 90% (FI) and 96% (PL) of unseen tokens (see Figure 1). Learning in both models was facilitated for highly frequent forms and for verbs with high phonological neighbourhood density (a measure of phonological analogy). Suffix errors often resulted from overgeneralisation (i.e., producing a suffix for the correct person/number context but from a different inflectional class) and occasionally from substitutions of low-frequency forms with higher-frequency forms (e.g., producing 3rd singular instead of 1st singular). Also see Figure 2.
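Phonological neighbourhood density is commonly operationalised as the number of lexical forms one phoneme edit away from a target; the exact measure used in the reported models is not specified here, so the following sketch, with a hypothetical mini-lexicon, illustrates only this common variant.

```python
def edit_distance(a, b):
    """Standard Levenshtein distance between two phoneme strings."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n]

def neighbourhood_density(target, lexicon):
    """Count lexicon entries exactly one phoneme edit away from the target."""
    return sum(1 for w in lexicon if w != target and edit_distance(w, target) == 1)

# Hypothetical mini-lexicon of phoneme strings (not from the actual corpora).
lexicon = ["kato", "sato", "kata", "katos", "puhu"]
density = neighbourhood_density("kato", lexicon)  # counts sato, kata, katos
```

A verb with many such neighbours offers more phonological analogies, which is the sense in which high density facilitated learning in both models.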
The simulation results are broadly consistent with our experimental findings. The simulations in conjunction with the experiments suggest that a common learning mechanism underlies the acquisition of inflectional morphology cross-linguistically, and that this mechanism extracts subregularities in the distributional properties of the input. The model performance shows differences in error patterns between inflectional classes and between languages. We will discuss detailed error patterns at different training stages in the light of the characteristic properties of both languages.