Natural Language Processing


DIPARTIMENTO DI INFORMATICA Università di Torino

Research Report Year 2003

Computer Science

Artificial Intelligence and Human-Computer Interaction


- People

Last and first name	Position	Email
Lesmo Leonardo	Full Professor	lesmo(at)di.unito.it
Lombardo Vincenzo	Associate Professor	vincenzo(at)di.unito.it
Boella Guido	Researcher	guido(at)di.unito.it
Damiano Rossana	Researcher	rossana(at)di.unito.it
Bosco Cristina	Research Assistant	bosco(at)di.unito.it
Mazzei Alessandro	PhD Student	mazzei(at)di.unito.it
Sauro Luigi	PhD Student	sauro(at)di.unito.it
Radicioni Daniele Paolo	PhD Student	radicioni(at)di.unito.it
Robaldo Livio	PhD Student	robaldo(at)di.unito.it

- Research activity in 2003

In 2003 the group carried out research activity on the following topics:
1. Syntactic analysis and robust methods for natural language processing.
2. Agent models.
3. Ontologies and Legal Knowledge.
4. Computer Music.

1) Syntactic analysis and robust methods for natural language processing.

1.a -- Linguistic resources.

The linguistic resources currently available, most of which were extended and enhanced in 2003 following the approach developed in previous years, include the following:
- A morphological dictionary of about 23,000 lemmas.
- A treebank of 1,800 Italian sentences (approximately 50,000 words), represented in a dependency format. More information on the treebank may be found at http://www.di.unito.it/~tutreeb/.
- A robust dependency parser (including a morphological analyzer and a part-of-speech tagger). The parser has been tested on a corpus of about 400,000 words. It analyzes about 120 words per second on a Pentium 4 processor under Linux in compiled LISP. Tests against the manually corrected treebank revealed an error rate (in terms of wrong attachments and wrong arc labels) of 16.7%. The parser also includes a preliminary treatment of traces, adopted to preserve the projectivity of the dependency trees.
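
As an illustration of the two notions used in the parser description, the following Python sketch shows one way an error rate over attachments and arc labels could be computed, and how the projectivity of a dependency tree can be checked. It is not the group's evaluation code: the token encoding, the function names and the arc labels (SUBJ, OBJ, TOP) are assumptions made for the example, and a word is counted as wrong when either its head or its label differs from the gold annotation.

```python
from typing import List, Tuple

# Assumed encoding: each word is (head_index, arc_label); words are numbered
# from 1 and head_index 0 stands for an artificial root.
Token = Tuple[int, str]


def error_rate(gold: List[Token], predicted: List[Token]) -> float:
    """Fraction of words whose head or arc label differs from the gold tree."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    wrong = sum(1 for g, p in zip(gold, predicted) if g != p)
    return wrong / len(gold)


def is_projective(heads: List[int]) -> bool:
    """True if no two arcs cross; heads[i] is the head of word i + 1."""
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for a, b in arcs:
        for c, d in arcs:
            # Two arcs cross when exactly one endpoint of the second
            # lies strictly inside the span of the first.
            if a < c < b < d:
                return False
    return True


# Hypothetical three-word sentence: the parser attaches the last word wrongly.
gold = [(2, "SUBJ"), (0, "TOP"), (2, "OBJ")]
predicted = [(2, "SUBJ"), (0, "TOP"), (1, "OBJ")]
rate = error_rate(gold, predicted)
```

In this toy case one word out of three is wrong, so the rate is one third; a tree with heads [3, 4, 0, 3] is reported as non-projective because the arcs 1-3 and 2-4 cross.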

1.b -- Psycholinguistic models of sentence processing.

Empirical psycholinguistic analyses show that incrementality is an important feature of human sentence processing. In particular, humans analyze language directionally, building the interpretation of a sentence incrementally. From an application point of view, incrementality also matters for language modeling, a key sub-task of speech recognition systems. This year, in cooperation with the psycholinguistics group of the University of Glasgow, we designed and performed experiments aimed at confirming the strong incrementality hypothesis, which concerns the full connectivity of the data structure used by the language processor during sentence analysis.
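
The connectivity requirement can be illustrated with a small Python sketch. It assumes a plain word-to-word dependency encoding (not the data structure actually used in the experiments) and checks, for each sentence prefix, whether the words read so far form a single connected structure using only the arcs among them. The function names and the example sentences are hypothetical.

```python
from typing import List, Optional


def prefix_is_connected(heads: List[int], k: int) -> bool:
    """Check whether words 1..k are connected by arcs among themselves.

    heads[i] is the head of word i + 1; 0 marks the root word.
    """
    if k <= 0:
        return True
    parent = list(range(k + 1))  # union-find forest over words 1..k

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for dep in range(1, k + 1):
        head = heads[dep - 1]
        if 1 <= head <= k:  # ignore arcs to the root or to unseen words
            parent[find(dep)] = find(head)
    return len({find(w) for w in range(1, k + 1)}) == 1


def first_disconnected_prefix(heads: List[int]) -> Optional[int]:
    """Length of the shortest prefix that is not connected, or None."""
    for k in range(1, len(heads) + 1):
        if not prefix_is_connected(heads, k):
            return k
    return None


# "the dog barks": the -> dog, dog -> barks, barks is the root.
connected_case = first_disconnected_prefix([2, 3, 0])
# "John often sleeps": both John and often depend on the unseen verb.
disconnected_case = first_disconnected_prefix([3, 3, 0])
```

In the second sentence the prefix of length two is disconnected under this encoding, which is the kind of configuration a strongly incremental processor has to handle by other means, for instance by predicting structure before the verb is read.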

1.c -- Formal methods for the representation of syntactic knowledge.

Taking into account the results of the experiments mentioned above, we refined the definition of the DV-TAG formalism. Lexicalized Tree Adjoining Grammar (L-TAG) is a well-studied formalism that has several advantages over context-free grammars in representing syntactic knowledge about natural languages. In particular, its "extended domain of locality" allows a simple description of several linguistic phenomena, such as traces and the sub-categorization frames of verbs. The Dynamic Version of L-TAG (DV-TAG) is a formalism defined with the aim of augmenting L-TAG along the lines suggested by the incrementality hypothesis.
We refined the definition of DV-TAG by formalizing the dependency tree structure, a data structure that resembles the derivation tree of L-TAG in some respects and the analysis produced by a dependency grammar in others. Using this structure we were able to derive some difficult linguistic analyses, such as those required for sentences with raising and bridge verbs, that standard L-TAG cannot describe correctly. In addition, we began formalizing a polynomial parser for DV-TAG using the paradigm of dynamic grammars. In particular, we use an Earley-style strategy modified to take strong incrementality into account.
Another line of research on L-TAG concerns the automatic extraction of a grammar from a linguistically annotated corpus. In particular, we designed a new extraction algorithm that aims to exploit the information carried by the dependency annotation of the TUT corpus.

2) Agent models

During 2003, research on the theory of autonomous agents focused on the formalization of normative multiagent systems and continued with the development and evaluation of planning strategies for agents that react to changes in their environment.
Regarding the formalization of normative multiagent systems, we proposed to model normative systems using the agent metaphor: the normative system, even though it is a social entity, is described by attributing to it mental attitudes such as beliefs, desires and intentions. In this way norms can be modelled as goals of the normative system, associated with the subgoals of considering as a violation any behavior inconsistent with the norm and of sanctioning violations. The research includes modeling the reasoning process of an agent subject to norms: the agent foresees whether it will be sanctioned by recursively modeling the decision taken by the normative system, considered as an agent. Further topics are the definition of permissions as exceptions to obligations, the structuring of normative systems with roles, and the application of the agent metaphor to virtual communities and groups. This work is carried out in cooperation with the CWI research institute in Amsterdam.
The second line of research on agents concerns planning strategies for updating the current intentions of an agent to face a newly arisen situation. The research builds on a decision-theoretic hierarchical planner. The replanning process makes the plan found by the planner more partial and then refines it again, without restarting the planning process from first principles. After implementing this algorithm, we ran an evaluation comparing the performance of the replanning strategy with that of a planner that has to build an entirely new plan. The evaluation confirmed the prediction that our incremental replanning algorithm performs better than the corresponding planning algorithm when dealing with a failure.

3) Ontologies and Legal Knowledge

The activity on ontologies has been carried out in cooperation with the Dipartimento di Scienze Giuridiche (Department of Legal Sciences) of the University of Torino. The activity of the NLP group has mainly concerned the development of an ontological model of norms and obligations, with special attention to its connections to BDI models. In this context, norms are seen as constraints on the activity of agents, which the agents take into account when they determine their intentions on the basis of their beliefs, desires, and goals (planning). This is obtained by modeling the rationality of agents subject to norms in terms of utility functions: a norm prohibiting a given action or state lowers the utility of the plans that include that action or lead to that state, so that an alternative line of behaviour may be chosen. The reduction in utility is obtained by associating a sanction with the norm. An analogous approach can be applied to model rewards instead of sanctions.
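
The effect of a sanction on plan choice can be sketched in a few lines of Python. This is only an illustration of the utility-based reading given above, not the group's model: the class names, the additive form of the penalty, and the example plans and values are all assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Plan:
    name: str
    actions: FrozenSet[str]
    base_utility: float  # utility of the plan when no norm applies


@dataclass(frozen=True)
class Norm:
    prohibited_action: str
    sanction: float  # utility lost when the prohibited action is performed


def utility(plan: Plan, norms: List[Norm]) -> float:
    """Base utility minus the sanctions of every norm the plan violates."""
    penalty = sum(n.sanction for n in norms
                  if n.prohibited_action in plan.actions)
    return plan.base_utility - penalty


def choose_plan(plans: List[Plan], norms: List[Norm]) -> Plan:
    """Pick the plan with the highest utility once sanctions are counted."""
    return max(plans, key=lambda p: utility(p, norms))


# Hypothetical plans and norm, with made-up utility values.
shortcut = Plan("shortcut", frozenset({"cross_private_land"}), 10.0)
detour = Plan("detour", frozenset({"use_public_road"}), 7.0)
no_trespassing = Norm("cross_private_land", sanction=5.0)

without_norm = choose_plan([shortcut, detour], [])
with_norm = choose_plan([shortcut, detour], [no_trespassing])
```

Without the norm the agent prefers the shortcut; once the sanction is taken into account the shortcut is worth 5 instead of 10, and the detour becomes the preferred line of behaviour.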
The study of some specific legal concepts (goods and fruits) has been extended and the associated ontological analysis has been integrated within a philosophically well-founded ontological framework (DOLCE, developed mainly by the Laboratory for Applied Ontology - ISTC-CNR, Trento, within the European project WonderWeb).
This research is also being carried out in partial cooperation with the Special Interest Group on Legal Ontologies of the European Network OntoWeb, in particular with groups located in Rome and Trento.

4) Computer Music

4.a -- Music performance

The goal of this area is the development of cognitive models of music performance. In particular we focus on the fingering problem, a difficult task for string instruments, where the same note can be played in several positions. Fingering is an essential component of sound production, since the character of a piece is the result of the interaction between the musician and the instrument. Our experiments rely on a physical model of the classical guitar, and we compare the predictions of the model with the performances of human experts on the same piece.
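
To make the source of the difficulty concrete, the following Python sketch enumerates the positions at which a given pitch can be played on a guitar. It is not part of the group's model: it assumes standard tuning expressed as MIDI note numbers, a fingerboard of 19 frets, and a hypothetical function name.

```python
from typing import List, Tuple

# MIDI pitches of the open strings in standard tuning,
# from string 6 (low E) to string 1 (high E).
OPEN_STRINGS = {6: 40, 5: 45, 4: 50, 3: 55, 2: 59, 1: 64}
NUM_FRETS = 19  # assumption: a typical classical guitar fingerboard


def positions_for_pitch(midi_pitch: int) -> List[Tuple[int, int]]:
    """Return every (string, fret) pair that produces the given pitch."""
    result = []
    for string, open_pitch in sorted(OPEN_STRINGS.items()):
        fret = midi_pitch - open_pitch  # one fret raises the pitch a semitone
        if 0 <= fret <= NUM_FRETS:
            result.append((string, fret))
    return result


# E4 (MIDI 64), the pitch of the open first string.
e4_positions = positions_for_pitch(64)
# E2 (MIDI 40), the lowest note in standard tuning.
e2_positions = positions_for_pitch(40)
```

Under these assumptions E4 can be played in five different positions, while E2 has only one; a fingering model has to choose among such alternatives for every note of a passage.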

4.b -- Algorithmic composition

In this area we explore the possibility of automating some aspects of the composition process with a formal model. In particular, we focus on composition with granular synthesis, a technique that extends the notion of music event from the standard note level down to the grain level of a sound waveform. We have proposed a two-level method for representing a music composition, based on graph navigation.

- 2003 Publications

· Carla Bazzanella and Cristina Bosco: Contextualization in spoken language corpora. In C.D. Pusch and W. Raible (eds.), Romanistische Korpuslinguistik. Korpora und gesprochene Sprache / Romance Corpus Linguistics. Corpora and Spoken Language. Tübingen, 2003, Gunter Narr Verlag.
· Francesca Biral, Vincenzo Lombardo, Rossana Damiano, Antonio Pizzo, Cyrano goes to Hollywood: a drama-based metaphor for information presentation, Proceedings of the Fourth Workshop Artificial Intelligence in Mobile Systems (AIMS 2003), Seattle, 2003.
· Cristina Bosco and Carla Bazzanella: Corpus linguistics and the modal shift in Old and Present-Day Italian: temporal pragmatic markers and the case of 'allora'. In C. Pusch (editor), Procs. of the 2nd Freiburg Workshop on Romance Corpus Linguistics – Corpora and historical linguistics, Freiburg im Breisgau, Germany, Draft, 2003.
· Guido Boella and Rossana Damiano: Empirical evaluation of a replanning algorithm. In Procs. of ICAPS Workshop on plan execution, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Access control in virtual communities: Prohibition, permission, authorization and delegation of power in the grid. In Procs. of Knowledge Grid and Grid Intelligence workshop at WI/IAT'03 (KGGI'03), 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Attributing mental attitudes to normative systems. In Procs. of AAMAS'03, Melbourne, 2003. ACM Press. (PostScript)
· Guido Boella and Leendert van der Torre: Attributing mental attitudes to groups: Cooperation in a qualitative game theory. In Procs. of Collaboration Agents: Autonomous Agents for Collaborative Environments at WI/IAT'03 (COLA'03), 2003. (PostScript)
· Guido Boella and Leendert van der Torre: BDI and BOID argumentation: Some examples and ideas for formalization. In Procs. of IJCAI Workshop on Computational Models of Natural Argument, Acapulco, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Decentralized control: Obligations and permissions in virtual communities of agents. In Procs. of ISMIS, Maebashi, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Division of powers in MAS control. In Procs. of AAMAS Workshop on Autonomy, Delegation and Control, Melbourne, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Game specification in the trias politica. In Procs. of BNAIC'03, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Local policies for the control of virtual communities. In Procs. of IEEE/WIC Web Intelligence Conference, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Norm governed multiagent systems: The delegation of control to autonomous agents. In Procs. of IEEE/WIC IAT Conference, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Obligations and permissions as mental entities. In Procs. of IJCAI Workshop on Cognitive Modeling of Agents and Multi-Agent Interactions, Acapulco, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Obligations as social constructs. In Procs. of the AI*IA Conference, Pisa (Italy), 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Permissions and obligations in hierarchical normative systems. In Procs. of ICAIL 03, pages 109-118, Edinburgh, 2003. ACM Press. (PostScript)
· Guido Boella and Leendert van der Torre: Permissions and undercutters. In Procs. of IJCAI Workshop on Non Monotonic Reasoning, Actions and Causality, Acapulco, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Policy management for virtual communities of agents. In Procs. of WOA'03 Workshop, 2003. (PostScript)
· Guido Boella and Leendert van der Torre: Rational norm creation. In Procs. of ICAIL 03, pp. 81-82, Edinburgh, 2003. ACM Press. (PostScript)
· Guido Boella and Leendert van der Torre: Your wish is my command: Sanction-based obligations in a qualitative decision theory. Draft, 2003. (PostScript)
· Cristina Bosco and Vincenzo Lombardo: A relation-based schema for treebank annotation. In A. Cappelli, F. Turini (eds.) Advances in Artificial Intelligence – LNCS 2829, Springer Verlag, Berlin, 2003, 462-473.
· Fabrizio Costa, Paolo Frasconi, Vincenzo Lombardo, Giovanni Soda: Towards Incremental Parsing of Natural Language using Recursive Neural Networks, Applied Intelligence 19 (1-2), 2003, 9-25.
· Rossana Damiano, Vincenzo Lombardo, Francesca Biral, Antonio Pizzo: Cyrano: a character-centered architecture for interactive presentations, Proceedings of the Symposium Human-Computer Interaction in Italy (HCI-Italy), Torino, 2003.
· Alessandro Mazzei: Formalizing a constituency based dynamic grammar. In Balder Ten Cate, editor, Proc. of the Eighth ESSLLI Student Session, Vienna, 2003, 181-190.
· Daniele Radicioni and Vincenzo Lombardo: Computational modeling of chord shapes in guitar fingering, Proceedings of the 3rd International Workshop on Gestural Analysis (GW2003), Genova, 2003.
· Patrick Sturt, Fabrizio Costa, Vincenzo Lombardo, Paolo Frasconi, Learning structural first-pass attachment preferences with dynamic grammars and recursive neural networks, Cognition 88, 2003, 133-169.
· Patrick Sturt and Vincenzo Lombardo: Grammatical theory and incremental processing, Proceedings of the 16th CUNY Conference on Sentence Processing, 2003.
· Andrea Valle and Vincenzo Lombardo: A two-level method to control granular synthesis, Proceedings of the 14th Colloquium on Musical Informatics (CIM 2003), Firenze, 2003, 136-140.

- Software Products

Dependency Parser
Downloadable from the NLP Group site.
Tested under Linux on Allegro Common LISP and CLISP.
Includes the Italian morphological dictionary, the morphological analyzer, the POS tagger, assorted documentation and a user manual.
Contact: Leonardo Lesmo

- Research grant

1. Project of Relevant National Interest (Progetto di Rilevante Interesse Nazionale) - COFIN 2003 - two years: 2003-2004
Title: Tecnologie informatiche per il supporto allo sviluppo di tassonomie ed ontologie giuridiche (Computer Science for Supporting the Development of Legal Taxonomies and Ontologies)
Scientific Lead: Leonardo Lesmo
National Coordinator: Gianmaria Ajani, Dipartimento di Scienze Giuridiche, Università di Torino
Funded by: MIUR
Total grant: 22,600 Euro



Administrator: wwwadm[at]di.unito.it	Last update: May 05, 2004
