All the data in the datasets are covered by the following license.
Turin University Treebank (TUT)
by
Cristina Bosco, Leonardo Lesmo, Vincenzo Lombardo,
Alessandro Mazzei, Livio Robaldo
is licensed under a Creative
Commons Attribution-Noncommercial-Share Alike 2.5 Italy License.
In order to foster comparison among different paradigms and annotations, both the annotated data for development and training and the unannotated data for testing the participants' results will be the same for the Constituency and Dependency Parsing tracks.
Development data
Test data
The test set for the parsing task is composed of 300 new sentences and is balanced like the development
set: 150 sentences from legal texts, 75 from newspapers and 75 from Wikipedia.
NEW: November the 2nd
The annotated gold test set used for the evaluation of participant results is now available. You
can download it from the following links:
Any further updated version of the data, if available, will be announced to registered participants
and published on this site.
Requests for information and feedback about the data are welcome and can be addressed to
bosco[at]di.unito.it.
DEADLINE:
Results must be submitted no later than October 14th, 2011.
The submission deadline has been postponed to October 21st.
Each participant should submit results by sending to the organizers' email address
(bosco[at]di.unito.it) one file in the correct format for each track he/she wants to participate in.
ONE FILE:
Only one result for each track will be accepted for each participant.
DATA FORMAT:
Only results in the correct format will be accepted.
The correct format is the same as in the development data, and in particular consists of:
FILE NAME:
The name of the file containing the result must include the name of the task, the track, and
the participant's organization and surname, as in the following example:
EVALITA11_PAR_TRACK_Org-Participantname
where TRACK is the name of the track (COS for Constituency and DEP for Dependency Parsing), and
Org-Participantname is the name of the organization and the surname of the participant.
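The naming convention above can be sketched in a few lines of Python. This is only an illustrative sketch, not an official tool: the function names, the sample organization "UniTo" and the surname "Rossi" are placeholders, and the sketch assumes that neither the organization nor the surname contains spaces and that no file extension is required (the instructions above do not specify one).

```python
import re

# Track codes as given in the instructions above.
TRACKS = {"COS", "DEP"}

# Assumed pattern: organization without hyphens, underscores or spaces,
# then a hyphen, then a surname without spaces.
_NAME_PATTERN = re.compile(r"^EVALITA11_PAR_(COS|DEP)_([^-_\s]+)-(\S+)$")


def result_file_name(track, organization, surname):
    """Build a result file name following the EVALITA11_PAR_TRACK_Org-Participantname scheme."""
    if track not in TRACKS:
        raise ValueError(f"unknown track {track!r}; expected one of {sorted(TRACKS)}")
    return f"EVALITA11_PAR_{track}_{organization}-{surname}"


def is_valid_result_file_name(name):
    """Return True if the name matches the assumed pattern."""
    return _NAME_PATTERN.match(name) is not None


if __name__ == "__main__":
    # "UniTo" and "Rossi" are hypothetical placeholders.
    name = result_file_name("DEP", "UniTo", "Rossi")
    print(name, is_valid_result_file_name(name))
```

A participant entering both tracks would build two such names, one with COS and one with DEP, since one file per track is expected.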
After the submission deadline, the organizers will evaluate the submitted files and will send each participant
the score of his/her submission(s) as well as the gold-standard version of the test set(s).
Updated deadlines are available at the Evalita 2011 "Important dates" web page.