Toolkit

As part of the Language Researchers' Toolkit project, our researchers have created programs that make it easier to analyse CHILDES corpora.

CHILDES Browser

This browser allows you to examine various statistics to identify which corpus to use.  

Childes2csv

CHILDES corpora are provided in various formats, but it can be difficult to convert them into data frames that can be used in R for analysis. Childes2csv is a program that generates CSV files directly from CHILDES XML files. It can generate word-level or utterance-level corpora, grouped at different levels (e.g. by dialect or by language).
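To illustrate the kind of conversion described above, here is a minimal Python sketch that turns a small XML snippet into utterance-level and word-level CSV rows. It is not the Childes2csv implementation. The sample XML is a heavily simplified stand-in modelled on TalkBank-style markup, and the function names and CSV column names (utterance_rows, word_rows, to_csv, "speaker", "num_words" and so on) are assumptions made for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Simplified sample modelled on TalkBank-style XML (an assumption for this
# sketch); real CHILDES XML files carry far more markup than this.
SAMPLE_XML = """<CHAT xmlns="http://www.talkbank.org/ns/talkbank" Corpus="demo" Lang="eng">
  <u who="CHI" uID="u0"><w>more</w><w>cookie</w></u>
  <u who="MOT" uID="u1"><w>you</w><w>want</w><w>more</w></u>
</CHAT>"""

NS = {"tb": "http://www.talkbank.org/ns/talkbank"}


def _words(utterance):
    """Return the text of every <w> child of one <u> element."""
    texts = ("".join(w.itertext()).strip() for w in utterance.findall("tb:w", NS))
    return [t for t in texts if t]


def utterance_rows(xml_text):
    """One row per utterance (hypothetical column names)."""
    root = ET.fromstring(xml_text)
    rows = []
    for u in root.findall("tb:u", NS):
        words = _words(u)
        rows.append({
            "corpus": root.get("Corpus"),
            "language": root.get("Lang"),
            "utterance_id": u.get("uID"),
            "speaker": u.get("who"),
            "utterance": " ".join(words),
            "num_words": len(words),
        })
    return rows


def word_rows(xml_text):
    """One row per word, keeping a link back to its utterance."""
    root = ET.fromstring(xml_text)
    rows = []
    for u in root.findall("tb:u", NS):
        for position, word in enumerate(_words(u), start=1):
            rows.append({
                "corpus": root.get("Corpus"),
                "utterance_id": u.get("uID"),
                "speaker": u.get("who"),
                "position": position,
                "word": word,
            })
    return rows


def to_csv(rows):
    """Serialise a non-empty list of row dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(utterance_rows(SAMPLE_XML)))
    print(to_csv(word_rows(SAMPLE_XML)))
```

A CSV file produced this way can be loaded in R with read.csv, which returns a data frame.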

Filter Combine

Another common task is to filter corpora for words or utterances that match particular rules, then recode some of the columns into numeric format, group the data along other columns, and collect summary statistics.
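The filter, recode, group and summarise steps described above can be sketched in a few lines of Python. This is only an illustration of the workflow, not the Filter Combine tool itself. The toy rows, the column names and the helper functions (filter_rows, recode, group_stats) are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

# Toy utterance-level rows; the column names are hypothetical.
ROWS = [
    {"corpus": "demo", "speaker": "CHI", "utterance": "more cookie"},
    {"corpus": "demo", "speaker": "MOT", "utterance": "you want more"},
    {"corpus": "demo", "speaker": "CHI", "utterance": "doggie"},
    {"corpus": "demo", "speaker": "MOT", "utterance": "do you want more juice"},
]


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def recode(rows, column, mapping, new_column, default=0):
    """Add a numeric column derived from an existing categorical column."""
    return [{**row, new_column: mapping.get(row[column], default)} for row in rows]


def group_stats(rows, by, value):
    """Group rows by one column and report the count and mean of value(row)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(value(row))
    return {key: {"n": len(vals), "mean": mean(vals)} for key, vals in groups.items()}


if __name__ == "__main__":
    # 1. Filter: utterances that contain the word "more".
    kept = filter_rows(ROWS, lambda r: "more" in r["utterance"].split())
    # 2. Recode: speaker becomes a 0/1 indicator for the child.
    coded = recode(kept, "speaker", {"CHI": 1}, "is_child")
    # 3. Group and summarise: utterance length in words per group.
    print(group_stats(coded, "is_child", lambda r: len(r["utterance"].split())))
```

Passing the filter rule and the summarised value as functions keeps the three steps independent, so a different rule or grouping column can be swapped in without touching the others.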

Ngrams

This tool lets you examine 1- to 4-grams in various corpora. It also displays Zipfian curves, which plot log frequency against log rank.
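For readers unfamiliar with these terms, the following Python sketch counts n-grams over a handful of utterances and computes the points of a log-frequency log-rank curve. It is a minimal illustration, not the code behind the Ngrams tool. The function names are hypothetical, and counting n-grams within utterances rather than across utterance boundaries is an assumption of this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-token sequences in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(utterances, n):
    """Count n-grams within each utterance (never across utterance boundaries)."""
    counts = Counter()
    for utterance in utterances:
        counts.update(ngrams(utterance.split(), n))
    return counts


def zipf_points(counts):
    """(log10 rank, log10 frequency) pairs, most frequent item first."""
    ranked = sorted(counts.values(), reverse=True)
    return [(math.log10(rank), math.log10(freq))
            for rank, freq in enumerate(ranked, start=1)]


if __name__ == "__main__":
    utterances = ["more cookie", "you want more", "you want more juice"]
    for n in range(1, 5):
        print(n, ngram_counts(utterances, n).most_common(3))
    print(zipf_points(ngram_counts(utterances, 1)))
```

If frequencies follow Zipf's law, the points returned by zipf_points fall close to a straight line with negative slope when plotted.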

Other toolkit links

There is information about other automatic tools at the toolkit page.