iLexIR

NLP Consultancy

CLC FCE Dataset

The CLC FCE Dataset is a set of 1,244 exam scripts written by candidates sitting the Cambridge ESOL First Certificate in English (FCE) examination in 2000 and 2001.

The scripts are extracted from the Cambridge Learner Corpus (CLC), developed as a collaborative effort between Cambridge University Press and Cambridge Assessment.

For each exam script, the CLC FCE Dataset includes the original text written by the candidate (transcribed and anonymised, but otherwise unmodified), together with the marks awarded, error annotation, and essential demographic details, including the candidate’s first language and age bracket.
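As a rough illustration of the per-script information listed above, one way to hold a script in memory is sketched below. This is only a sketch under stated assumptions: the class names, field names, character-offset representation of errors, and the single numeric mark are all hypothetical and do not describe the dataset's actual file format. The sample sentence is invented for the example and is not taken from the dataset.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ErrorAnnotation:
    """One annotated error span (hypothetical layout, not the dataset's format)."""
    start: int                        # character offset where the span begins
    end: int                          # character offset just past the span
    error_type: str                   # annotation label, stored as given
    correction: Optional[str] = None  # suggested correction, if one is provided


@dataclass
class ExamScript:
    """Container for the details the description says accompany each script."""
    script_id: str                    # hypothetical identifier
    text: str                         # candidate's transcribed, anonymised text
    marks: Optional[float]            # assumed here to be a single number
    first_language: str               # candidate's first language
    age_bracket: str                  # candidate's age bracket
    errors: List[ErrorAnnotation] = field(default_factory=list)

    def erroneous_spans(self) -> List[str]:
        """Return the substrings of the text covered by each error annotation."""
        return [self.text[e.start:e.end] for e in self.errors]


# Invented example values, for illustration only.
script = ExamScript(
    script_id="example-0001",
    text="I am agree with you.",
    marks=None,
    first_language="(unspecified)",
    age_bracket="(unspecified)",
    errors=[ErrorAnnotation(start=2, end=10, error_type="example", correction="agree")],
)
print(script.erroneous_spans())
```

Keeping the candidate's text unmodified and storing errors as offsets into it mirrors the description above, where the original text is preserved and the annotation is supplied alongside it.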

Licence

The Dataset is released for non-commercial research and educational purposes under the following licence agreement:

  1. By downloading this dataset and licence, this licence agreement is entered into, effective this date, between you, the Licensee, and the University of Cambridge, the Licensor.
  2. Copyright of the entire licensed dataset is held by the Licensor. No ownership or interest in the dataset is transferred to the Licensee.
  3. The Licensor hereby grants the Licensee a non-exclusive non-transferable right to use the licensed dataset for non-commercial research and educational purposes.
  4. Non-commercial purposes exclude without limitation any use of the licensed dataset or information derived from the dataset for or as part of a product or service which is sold, offered for sale, licensed, leased or rented.
  5. The Licensee shall acknowledge use of the licensed dataset in all publications of research based on it, in whole or in part, through citation of the following publication: Yannakoudakis, Helen and Briscoe, Ted and Medlock, Ben, ‘A New Dataset and Method for Automatically Grading ESOL Texts’, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies.
  6. The Licensee may publish excerpts of less than 100 words from the licensed dataset pursuant to clause 3.
  7. The Licensor grants the Licensee this right to use the licensed dataset ‘as is’. Licensor does not make, and expressly disclaims, any express or implied warranties, representations or endorsements of any kind whatsoever.
  8. This Agreement shall be governed by and construed in accordance with the laws of England and the English courts shall have exclusive jurisdiction.

Download

You may download the CLC FCE Dataset if you agree to the licence.