Investigating the Quality of Language in Children’s YouTube Videos: My LuCiD Summer Internship Journey

Have you ever wondered how young children make sense of language and work out its structures? That is the question I explored during a six-week summer internship sponsored by LuCiD, the ESRC International Centre for Language and Communicative Development. In this post I look back on the project I worked on, ‘Investigating the Quality of Language in Children’s YouTube Videos’.

I was given the opportunity to work closely with experts: Professor Padraic Monaghan from Lancaster University, and Dr. Joanna Kolak and Dr. Gemma Taylor, both from Salford University. From the get-go, they helped me understand how to break language down so that we could take a closer look at the kind of language used in YouTube videos aimed at children. They had already compiled a list of 188 videos, and my job was to go through each one, transcribe it, split the transcription into utterances, and analyse it.

Once the videos were transcribed, Dr. Joanna Kolak and I tackled the challenge of breaking the transcriptions into chunks called "utterances." This wasn't just a simple cut-and-paste job. It involved carefully watching the videos and figuring out where one piece of talking ended and the next one began. Think of it like finding the natural pauses in a chat. Dr. Kolak and I worked closely together, making sure we were both on the same page. To make sure our approach was reliable, we each segmented a handful of videos independently and then compared our versions using RStudio. This let us check whether we had good interrater reliability. Luckily, our work matched up pretty well, giving us confidence in our method.

With the segmentation locked in, we moved on to the next step: grammatical coding. This was like putting labels on the different types of sentences. Dr. Kolak generously shared her knowledge of linguistics, helping me understand the ins and outs of categorising each utterance. We assigned each utterance to one of six categories: Fragment, Question, Imperative, Copula, Subject-Predicate, and Complex. We then checked the reliability of this coding as well. While I was still applying it to the list of videos, Professor Padraic Monaghan, who was monitoring my progress, helped me learn more about RStudio. He demonstrated how the analysis would be performed once all the data had been transcribed, split into utterances, and coded. Dr. Gemma Taylor was also monitoring my progress and was always willing to help when I had technical difficulties with organisation and coding in Excel.
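For readers curious about what a reliability check on this kind of coding can look like, here is a minimal sketch. It rests on several assumptions: the post does not say which statistic was used, so Cohen's kappa is shown only as one common choice for two coders; the example is written in Python rather than R; and the function name `cohen_kappa` and the two short label lists are made up for illustration, not taken from the project data.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labelled the same utterances.

    Illustrative helper only; not the project's actual analysis code.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of utterances.")

    n = len(coder_a)

    # Observed agreement: share of utterances given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: for each label, the product of the two coders'
    # marginal proportions, summed over all labels.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up labels for eight utterances, using the six categories from the post.
coder_a = ["Fragment", "Question", "Imperative", "Copula",
           "Complex", "Question", "Fragment", "Subject-Predicate"]
coder_b = ["Fragment", "Question", "Imperative", "Copula",
           "Subject-Predicate", "Question", "Fragment", "Subject-Predicate"]

print(round(cohen_kappa(coder_a, coder_b), 3))  # 0.846
```

In this toy example the coders agree on seven of eight utterances (87.5%), but kappa comes out lower because some of that agreement would be expected by chance alone. That correction is why kappa is often reported alongside, or instead of, raw percentage agreement.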

Overall, this experience was unlike anything I have done before. It has left an indelible mark, shaping my understanding of linguistics and honing my practical skills. From transcribing spoken language to the meticulous process of segmenting and coding, I've gained hands-on expertise. Collaborating with Dr. Joanna Kolak not only strengthened my teamwork skills but also gave me a real-world application of linguistic analysis. Beyond language studies, working in R and Excel significantly improved my technical capabilities. This journey was more than a learning opportunity; it was a transformative experience that equipped me with invaluable skills and prepared me for future challenges in both language analysis and technical work.

