Releases and Downloads
The entire corpus consists of ~10K questions (~6K multiple-sentence questions).
We release about 60% of this data as training/dev data.
The rest of the data is reserved for evaluation. Every few months,
we will add new, previously unseen evaluation data to CodaLab.
The purpose of this is to prevent unintentional overfitting
over time as many evaluations are run. Here is our current expected release plan: