ARCADE-ROMANSEVAL  
Data from the 1998 evaluation exercise

Edited by Jean Véronis, Université de Provence

   

 

Description
   
  This CD-ROM contains the data used in the ARCADE and ROMANSEVAL evaluation projects during the 1998 competition. ARCADE is a project funded by AUPELF-UREF, whose goal is the evaluation of parallel text alignment systems. ROMANSEVAL is part of the SENSEVAL exercise on word sense disambiguation, sponsored by SIGLEX (ACL Special Interest Group) and EURALEX, and is devoted to Romance languages.
   
More information
   
  Arcade: http://www.up.univ-mrs.fr/veronis/arcade

Romanseval: http://www.up.univ-mrs.fr/veronis/romanseval

   
Contact
   
  Jean Véronis, Professor and Director

Centre Informatique pour les Lettres et Sciences Humaines
Université de Provence
29, Avenue Robert Schuman  
13621 Aix-en-Provence Cedex 1, France  

tel: (+33) 4 42 95 31 37  
fax: (+33) 4 42 59 50 96  
e-mail: Jean.Veronis@up.univ-mrs.fr 

   
CD-ROM contents
   
 
   
Acknowledgements
   
  I would like to thank all the people without whom this CD-ROM would not have been possible:
  • the ARCADE participants, especially Philippe Langlais, Michel Simard and all the members of the RALI team, who invested a great deal of effort in ARCADE;
  • Joseph Mariani, the coordinator of the AUPELF-UREF Language Engineering ARCs, for his constant help and support;
  • Frédérique Segond, with whom I organised the ROMANSEVAL competition;
  • the Italian teams who prepared the Italian data for ROMANSEVAL (Istituto di Linguistica Computazionale del CNR, Universita' di Tor Vergata, CELI), under Nicoletta Calzolari's coordination;
  • Khalid Choukri and the ELRA team for their faith in this project and their help;
  • the many annotators who worked on the data, especially my students Valérie Houitte, Corinne Jean and Marie-Dominique Mahimon.
   
Disclaimer
   
  I am perfectly aware that the data provided on this CD-ROM is by no means perfect: it is likely that, despite our best efforts, some errors remain, and with more time (and the experience we have now gathered) we would have done a better job. However, the data was useful for the evaluation exercises as it is, and we hope that other teams will find it useful as well. In any case, the corpus is provided on an "as is" basis, and neither I nor Université de Provence, RALI, ELRA, AUPELF-UREF or any of the ARCADE and ROMANSEVAL participants are liable for any problems or damage that the use or misuse of the data could cause to anyone. I also assume that, before distribution, ELRA has obtained all necessary authorisations and cleared copyright issues concerning the data used. I would be extremely grateful if users could report bugs and problems, and provide comments, ideas and, why not, enhancements to the data, which could benefit all of us.