LINGUISTIC RESOURCES
Leonardo Lesmo, Vincenzo Lombardo, Cristina Bosco, Alessandro Mazzei, Livio Robaldo, Serena Villata
The linguistic resources currently available, most of which were extended and enhanced in 2003 on the basis of the approach developed in previous years, include the following:
- A Treebank represented in a Dependency format (see TUT)
- A Dependency Parser (see Dependency Parser)
- A morphological dictionary including about 23,000 word lemmas
Other lines of research:
- One line of research developed by the NLP group concerns formal methods for the representation of syntactic knowledge. Empirical psycholinguistic analyses show that incrementality is an important feature of human sentence processing. This work focuses on the possibility of introducing this characteristic into the grammars used by the syntactic unit of a natural language processor. Following this idea, we defined a new grammatical formalism, called DV-TAG (Dynamic Version of TAG), and studied its strong and weak generative power. DV-TAG enriches the Lexicalized Tree Adjoining Grammar formalism, a well-studied mildly context-sensitive formalism, by adding constraints on the derivation process. In this way we respect the strong incrementality hypothesis, which restricts incrementality to the case in which the parser-generator maintains a fully connected tree at each state.
- Substantial advances have been made on psycholinguistic models of sentence processing. New experiments have been carried out on the Penn Treebank, supporting the idea that the development of natural parsing strategies (in particular as regards the problem of prepositional attachment) is based on the available linguistic evidence. In particular, it has been shown that psycholinguistic criteria such as Minimal Attachment and Late Closure are supported by the corpus: the number of structures that respect the criteria is statistically significantly greater than the number of structures that do not conform to them.
- New results have been obtained on NLP models based on neural networks.
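As an illustration of the kind of corpus comparison mentioned above for Minimal Attachment and Late Closure, the following minimal sketch tests whether structures conforming to a criterion outnumber non-conforming ones by more than chance. It assumes a one-sided exact binomial (sign) test, which is not necessarily the test used in the experiments; the function name `binomial_sign_test` and the counts in the usage lines are hypothetical placeholders, not figures from the Penn Treebank study.

```python
from math import comb


def binomial_sign_test(conforming: int, nonconforming: int) -> float:
    """One-sided exact binomial p-value.

    Null hypothesis: a structure is equally likely to conform to the
    criterion or not (p = 0.5). Returns P(X >= conforming) for
    X ~ Binomial(n, 0.5), where n is the total number of structures.
    """
    if conforming < 0 or nonconforming < 0:
        raise ValueError("counts must be non-negative")
    n = conforming + nonconforming
    if n == 0:
        raise ValueError("no observations")
    # Sum the upper tail of the binomial distribution with exact integers,
    # then divide once to avoid accumulating floating-point error.
    tail = sum(comb(n, k) for k in range(conforming, n + 1))
    return tail / 2 ** n


# Hypothetical counts, for illustration only (not taken from the study).
p_value = binomial_sign_test(conforming=620, nonconforming=380)
print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value would indicate that the surplus of conforming structures is unlikely under the equal-likelihood assumption. Exact integer arithmetic is used here because it stays accurate for corpus-sized counts without external libraries.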
Bibliography on linguistic resources
Last updated: 05.2009