Verb and Preposition SRL Annotations for the CHILDES Corpus
Christos Christodoulopoulos, Cindy Fisher, Sandra Franco, Lori Moon, and Dan Roth
This corpus contains gold-standard annotations of all verb and preposition senses, with their corresponding semantic role labels, in adult-child dialogues. It is derived from the Adam files in the Brown corpus (Brown 1973), which is part of the CHILDES corpora. The derived corpus contains annotations for every sentence in those files that contains a verb or preposition. The semantic role labels and senses for verbs follow the PropBank guidelines (Kingsbury & Palmer 2002; Gildea & Palmer 2002; Palmer et al. 2005), and those for prepositions follow Srikumar & Roth (2011).
The corpus was annotated by two annotators. The overall average kappa score for inter-annotator agreement is 0.92. The derived corpus is available here in XML format.
Notes on how the information appears in XML and CHAT format are given here.
If you use this corpus, please cite Moon et al. (submitted 2018) and follow the citation guidelines for CHILDES and CHAT on TalkBank.
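The inter-annotator agreement figure reported above is a kappa score. As a rough illustration of how such a score is obtained for two annotators, the sketch below computes Cohen's kappa over a toy list of role labels. It assumes Cohen's formulation, which this page does not specify; the function name `cohen_kappa` and the label sequences are hypothetical and are not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: the page does not say which kappa variant was used;
    this is the standard two-annotator formulation.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal proportions, summed over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy data (hypothetical, not from the corpus): one disagreement in six items.
annotator_1 = ["A0", "A1", "A0", "AM-LOC", "A1", "A0"]
annotator_2 = ["A0", "A1", "A0", "AM-LOC", "A0", "A0"]

print(round(cohen_kappa(annotator_1, annotator_2), 2))  # prints 0.71
```

Here observed agreement is 5/6 and chance agreement is 15/36, giving a kappa of 5/7. An "overall average" such as the one reported for this corpus would presumably average scores of this kind over several label sets or files, though the exact procedure is not described here.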
References
- Brown, R. 1973. A First Language: The Early Stages. Cambridge: Harvard University Press.
- Gildea, Dan and Martha Palmer. 2002. The necessity of parsing for predicate argument recognition. Proceedings of the ACL 2002. Philadelphia, Pennsylvania.
- Kingsbury, Paul and Martha Palmer. 2002. From Treebank to Propbank. Proceedings of the 3rd International Conference on Language Resources and Evaluation (LREC-2002). Las Palmas, Spain.
- MacWhinney, Brian. 2000. The CHILDES Project: Tools for Analyzing Talk. Lawrence Erlbaum Associates, Inc., 3rd edition.
- Moon, Lori, Christos Christodoulopoulos, Cindy Fisher, Sandra Franco, and Dan Roth. 2018. CHILDES Corpus Annotation of Verb and Preposition Semantic Roles. Manuscript.
- Palmer, Martha, Dan Gildea, and Paul Kingsbury. 2005. The Proposition Bank: A corpus annotated with semantic roles. Computational Linguistics, 31(1).
- Srikumar, Vivek, and Dan Roth. 2011. A joint model for extended semantic role labeling. Proceedings of EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing. 129-139.