Natural Language Processing - Dipartimento di Informatica


DIPARTIMENTO DI INFORMATICA Università di Torino

Research Report Year 1997

Artificial Intelligence


Natural Language Processing

People

Leonardo Lesmo	Associate Professor, Principal Investigator	lesmo(at)di.unito.it
Vincenzo Lombardo	Senior Researcher	vincenzo(at)di.unito.it
Guido Boella	Ph. D. student	boella(at)di.unito.it
Paola Rizzo	Ph. D. student	paola(at)di.unito.it
Liliana Ardissono	Research Collaborator	liliana(at)di.unito.it
Anna Goy	Research Collaborator	goy(at)di.unito.it
Cristina Barbero	Research Collaborator	barbero(at)di.unito.it
Dario Sestero	Research Collaborator	sestero(at)di.unito.it

Research activity in 1997

During 1997, research in the group's areas of interest produced the following results:

- Formal syntax and cognitive architectures for Natural Language Processing:

This research aims to devise formal systems for expressing the syntax of natural languages and to develop cognitive models of language processing under the operational constraints posed by the formal system.

The formalism for describing syntactic knowledge relies on the dependency paradigm. We have defined a formal system that is a first attempt to link dependency-based theories with a computational model. We have also devised a parsing algorithm, based on Earley's schema, that can parse the formalism in polynomial time. We are currently evaluating the expressive power of the formalism in order to compare it with other formalisms in the literature. On the implementation side, we have developed two prototype grammars, for Italian and English, and we are currently testing the parser on a wide range of syntactic phenomena. One of the most interesting tests for the formal system is the description of coordination constructs, which are currently under study.
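As an illustration of the Earley schema mentioned above, the following is a minimal sketch of the standard Earley recognizer for a plain context-free grammar. It is not the dependency-based algorithm developed in this research; the toy grammar, the lexicon, and all names (`earley_recognize`, `GRAMMAR`, `LEXICON`) are assumptions made for the example. The sketch also assumes that no rule has an empty right-hand side.

```python
from collections import namedtuple

# An Earley item: rule lhs -> rhs, position of the dot, and origin index.
Item = namedtuple("Item", "lhs rhs dot origin")

# Hypothetical toy grammar and lexicon, for illustration only.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("N",)],
    "VP": [("V", "NP"), ("V",)],
}
LEXICON = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "sees": {"V"}}


def earley_recognize(grammar, lexicon, start, words):
    """Return True if `words` can be derived from `start`.

    grammar: nonterminal -> list of right-hand sides (tuples of symbols).
    lexicon: word -> set of preterminal categories.
    Assumes no empty right-hand sides.
    """
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add(Item(start, rhs, 0, 0))

    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            item = agenda.pop()
            if item.dot < len(item.rhs):
                nxt = item.rhs[item.dot]
                if nxt in grammar:
                    # Predictor: expand the expected nonterminal at position i.
                    for rhs in grammar[nxt]:
                        new = Item(nxt, rhs, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
                elif i < n and nxt in lexicon.get(words[i], ()):
                    # Scanner: the next word has the expected category.
                    chart[i + 1].add(item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for item.lhs.
                for parent in list(chart[item.origin]):
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        new = parent._replace(dot=parent.dot + 1)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)

    return any(it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
               for it in chart[n])


print(earley_recognize(GRAMMAR, LEXICON, "S", "the dog sees the cat".split()))
print(earley_recognize(GRAMMAR, LEXICON, "S", "dog the sees".split()))
```

For context-free grammars this schema runs in time cubic in the sentence length, which is the kind of polynomial bound referred to above.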

From the linguistic point of view, we have linked the formal system to a hierarchy of subcategorization frames, which are at the core of the dependency paradigm. We have devised a formal definition of the hierarchy and have applied it successfully to the description of 113 Italian verbs. We are currently testing the validity of the hierarchy on a wide set of verbal frames, and we are devising a method for compiling the hierarchy and using it in natural language parsing.
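The idea of a hierarchy of subcategorization frames can be sketched as follows, assuming that each class inherits the dependents of its parent and adds its own. The class names, the dependent labels, and the verb assignments are hypothetical and are not taken from the actual hierarchy.

```python
# Hypothetical hierarchy: class -> (parent class, dependents added by the class).
HIERARCHY = {
    "verb": (None, ["subject"]),
    "transitive": ("verb", ["object"]),
    "ditransitive": ("transitive", ["indirect_object"]),
}

# Hypothetical assignment of Italian verbs to classes.
VERB_CLASS = {
    "dormire": "verb",        # to sleep
    "vedere": "transitive",   # to see
    "dare": "ditransitive",   # to give
}


def frame_of(cls):
    """Collect the full frame of a class by walking up to the root."""
    parent, own = HIERARCHY[cls]
    inherited = frame_of(parent) if parent else []
    return inherited + own


for verb, cls in VERB_CLASS.items():
    print(verb, frame_of(cls))
```

Compiling the hierarchy, in this simplified view, would amount to precomputing `frame_of` for every class so that the parser can look frames up directly.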

To build a cognitive architecture for the natural language parser, we have devised a flexible environment that can simulate several psycholinguistic hypotheses about the sentence processor. When the parser faces an ambiguity, it can pursue a serial approach, by selecting a single path, or a parallel approach, by carrying multiple paths forward over a window of the input sentence. We are currently studying the analysis of relative clauses in English and the consequences that these complex phenomena have for the tractability of the analysis. We then plan to apply the preference principles proposed in psycholinguistics for the human sentence processor.
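The contrast between the serial and the parallel strategy can be illustrated with a toy sketch that works on sequences of word categories rather than on real syntactic structures. Everything here is an assumption made for the example: the function `parse_with_window`, the preference order of the categories, and the table of allowed category pairs. A window of 0 corresponds to the serial strategy: the parser commits to the preferred reading of each word as soon as it is read.

```python
# Hypothetical lexical ambiguities, most preferred category first.
CANDIDATES = {
    "the": ["Det"],
    "old": ["Adj", "N"],
    "man": ["N", "V"],
    "boats": ["N"],
}

# Hypothetical local well-formedness check on adjacent categories.
ALLOWED = {("Det", "Adj"), ("Det", "N"), ("Adj", "N"), ("N", "V"), ("V", "Det")}


def parse_with_window(words, window):
    """Return the category sequences still alive at the end of the input.

    After reading word i, the parser commits to the preferred reading of
    word i - window and drops every path that disagrees with it.
    An empty result means that all pursued paths failed (a garden path).
    """
    paths = [()]
    for i, word in enumerate(words):
        extended = [
            path + (cat,)
            for path in paths
            for cat in CANDIDATES[word]
            if not path or (path[-1], cat) in ALLOWED
        ]
        if not extended:
            return []
        commit_pos = i - window
        if commit_pos >= 0:
            # Paths stay in preference order, so the first one is preferred.
            choice = extended[0][commit_pos]
            extended = [p for p in extended if p[commit_pos] == choice]
        paths = extended
    return paths


sentence = "the old man the boats".split()
print(parse_with_window(sentence, window=0))  # serial strategy
print(parse_with_window(sentence, window=2))  # parallel over a 2-word window
```

In this toy setting the serial strategy commits to the adjective reading of "old" and fails at the second "the", while a window of two words keeps the noun reading alive long enough to recover the correct analysis.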

In the area of cognitive modelling, we collaborate with the Centre for Cognitive Science of the University of Edinburgh (Dr. Patrick Sturt); this research has been partially funded by a CNR-BC joint research programme. Another collaboration involves the CNR Psychology Institute in Rome (Dr. Marica De Vincenzi): Vincenzo Lombardo and Marica De Vincenzi are currently editing a book on cross-linguistic perspectives on sentence processing.

- Lexical semantics:

The goal of this research is the development of a formalism for representing the meaning of words that enables the implementation of automatic procedures for the semantic analysis of texts. The approach is based mainly on Case-Frame Semantics, but it embeds a reinterpretation of some concepts belonging to the Generative Lexicon theory (qualia structure and selective binding, in particular). The research has been developed around the core idea that the semantic content of lexical entries for nouns, verbs, and adjectives contains prototypical information, encoded as default values, which can be overridden by contextual knowledge (whenever it is available) and which allows one to derive (defeasible) inferences.
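The core idea of prototypical information encoded as overridable defaults can be sketched as follows. This is only an illustration under simplifying assumptions: the `LexicalEntry` class, the entry for "uccello" (bird), and its features are hypothetical and do not reproduce the formalism developed in this research.

```python
from dataclasses import dataclass, field


@dataclass
class LexicalEntry:
    lemma: str
    # Prototypical information, stored as default feature values.
    defaults: dict = field(default_factory=dict)

    def interpret(self, context=None):
        """Merge defaults with contextual knowledge; context wins on conflict."""
        return {**self.defaults, **(context or {})}


# Hypothetical entry: birds fly and have feathers by default.
uccello = LexicalEntry("uccello", {"can_fly": True, "has_feathers": True})

print(uccello.interpret())                    # no context: defaults apply
print(uccello.interpret({"can_fly": False}))  # e.g. the text is about a penguin
```

An inference drawn from `can_fly` is defeasible in this sense: it holds only as long as the context provides no value that overrides the default.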

The research has focused on two main areas. The first concerns the analysis of a large class (more than 300 verbs) of communication verbs (dire - say, chiedere - ask, raccontare - tell, ...), for which we have developed a representation formalism that highlights their complex internal semantic structure. The second area concerns the study of some classes of adjectives, in particular: (1) Relational adjectives (agricolo - agricultural, statale - State, ...), for which lexical representations and compositional mechanisms have been defined. (2) Some dimensional adjectives (alto - high/tall, basso - low/short, ...), the analysis of which has shed some light on the interaction mechanisms between language and perceptual structures (visual ones, in particular). (3) Emotional adjectives (triste - sad, allegro - happy, divertente - amusing, ...), which have been classified on the basis of their different possible interpretations (depending on the modified noun) and for which suitable semantic representations, as well as composition mechanisms, have been defined. Moreover, this work led to the formulation of a hypothesis about the semantic representation of the lexical entries of nouns.

Altogether, the results of the research on lexical semantics enable the identification of close connections between semantics and pragmatics (in particular in the case of communication verbs) and between semantics and other cognitive structures, such as perceptual and emotional systems (in the case of adjectives). Moreover, the analysis of adjectives has highlighted some limits of the Generative Lexicon approach, leading to a reinterpretation of some of its mechanisms and to an extension of its formalism.

- Recognition of a speaker's plans in flexible user interfaces

In this research, a BDI (Belief, Desire, Intention) agent has been defined to model NL dialogue in cooperative environments. The main idea is that speakers are agents whose domain and communicative behaviour is goal-directed; moreover, linguistic behaviour has to be modelled uniformly with respect to domain behaviour, because agents use the same procedures to select and execute speech acts and domain actions. The agent has a two-level architecture, characterized by a declarative representation of the knowledge about acting: at the object level, the linguistic and domain actions represent the actions that the agent can perform to reach its goals; at the metalevel, the agent modelling plans describe the knowledge about how to build and execute plans out of the available object-level actions. The metalevel plans describe the behaviour of a reactive planner and take the object-level (linguistic and domain) actions as their objects. The declarative representation of actions and plans makes it possible to interpret the behaviour of an observed agent, as well as to generate the agent's own behaviour.
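The two-level organization can be illustrated with a small sketch in which object-level actions are plain declarative data and a single metalevel procedure builds and executes plans over them, treating speech acts and domain actions alike. The sketch uses simple backward chaining rather than the reactive planner described above, and the action names, the precondition and effect labels, and the functions `build_plan` and `execute` are all hypothetical.

```python
# Object level: declarative descriptions of linguistic and domain actions.
ACTIONS = {
    "ask_price": {"kind": "linguistic", "pre": set(), "eff": {"knows_price"}},
    "buy_ticket": {"kind": "domain", "pre": {"knows_price"}, "eff": {"has_ticket"}},
}


def build_plan(goal, state, actions=ACTIONS):
    """Metalevel: backward-chain from a goal to a list of action names.

    Returns None if no plan is found. The same procedure handles speech
    acts and domain actions. Assumes the action descriptions are acyclic.
    """
    if goal in state:
        return []
    for name, act in actions.items():
        if goal in act["eff"]:
            plan = []
            for pre in sorted(act["pre"]):
                sub = build_plan(pre, state, actions)
                if sub is None:
                    break  # this action cannot be enabled; try another one
                plan += sub
            else:
                return plan + [name]
    return None


def execute(plan, state, actions=ACTIONS):
    """Metalevel: apply the effects of each planned action in order."""
    state = set(state)
    for name in plan:
        state |= actions[name]["eff"]
    return state


plan = build_plan("has_ticket", set())
print(plan)
print(execute(plan, set()))
```

Because the actions are data rather than code, the same descriptions could in principle be read in the other direction, to hypothesize which goal an observed action serves.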

This agent architecture has been used to study several linguistic behaviours analyzed in Computational Linguistics research. In particular, dialogue is modelled within the architecture in that each interactant is represented as an instance of an agent, and his behaviour is interpreted in terms of the (metalevel and object-level) plans he is carrying out. Communication involves shared goals set forth by the interactants, and the coherence of a dialogue (as well as that of a generic interaction) is recognized if the turns of each interactant contribute to the satisfaction of some previously established contextual goals. The coherence relations used in this framework derive from the Goal Adoption notion introduced by Castelfranchi and Parisi, and from the Plan Continuation relation used in the plan recognition literature to recognize the execution of the steps of a plan by an agent. The adoption of these relations in the dialogue model has made it possible to explain and interpret, in a single general framework, several apparently unrelated linguistic phenomena, such as adjacency pairs, insertion sequences, presequences, overanswering, grounding phenomena, acknowledgments, direct and indirect speech acts, clarification subdialogues, and repair turns to resolve misunderstandings and interpretation problems.
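A heavily simplified sketch of how a turn might be checked for coherence against the two relations is given below. It assumes that a plan is a flat list of action names and that each action is annotated with the goals it achieves; the function `classify_turn` and all example actions and goals are hypothetical, and the actual model reasons over metalevel and object-level plans rather than flat lists.

```python
def classify_turn(action, own_plan, steps_done, partner_goals, achieves):
    """Return the coherence relation linking `action` to the context, or None.

    own_plan:      steps of a plan the speaker is already carrying out
    steps_done:    number of steps of own_plan executed so far
    partner_goals: goals the other interactant has made explicit
    achieves:      action name -> set of goals that the action satisfies
    """
    # Plan Continuation: the action is the next step of a plan under way.
    if steps_done < len(own_plan) and own_plan[steps_done] == action:
        return "plan_continuation"
    # Goal Adoption: the action satisfies a goal of the other interactant.
    if achieves.get(action, set()) & partner_goals:
        return "goal_adoption"
    # No relation found: the turn is not coherent with this context.
    return None


ACHIEVES = {"tell_price": {"knows_price"}}

print(classify_turn("pay", ["ask_price", "pay"], 1, set(), ACHIEVES))
print(classify_turn("tell_price", [], 0, {"knows_price"}, ACHIEVES))
print(classify_turn("talk_about_weather", [], 0, {"knows_price"}, ACHIEVES))
```

In such a scheme, a turn for which no relation is found is a candidate for the misunderstanding and repair phenomena listed above.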

1997 Publications

L. Ardissono, G. Boella, and R. Damiano. A computational model of misunderstandings in agent communication. In Advances in Artificial Intelligence LNAI 1321, pages 48-59. Springer Verlag, Berlin, 1997.

L. Ardissono, G. Boella, and L. Lesmo. A plan-based formalism to express knowledge about actions. In Proc. 4th ModelAge Workshop: Formal Models of Agents, pages 255-268, Pontignano, Italy, 1997.

L. Ardissono, G. Boella, and L. Lesmo. Un'architettura di agente per la modellazione del dialogo in linguaggio naturale. In Atti dell'Incontro dei Gruppi di Lavoro dell'AI*IA su Apprendimento Automatico e Linguaggio Naturale, pages 110-113, Torino, 1997.

L. Ardissono, G. Boella, and R. Damiano. A plan-based model of misunderstandings in cooperative dialogue, Int. J. of Human-Computer Studies (to appear).

C. Barbero and V. Lombardo. Syntactic Classes in Representation and Processing. In Proc. AMLaP-97 Conference, "Architectures and Mechanisms for Language Processing", Edinburgh, 11-13 September 1997.

C. Barbero and V. Lombardo. Wide-coverage Lexicalized Grammars. In Advances in Artificial Intelligence LNAI 1321, pages 60-71. Springer Verlag, Berlin, 1997.

P. Barboni and D. Sestero. Choosing a response using problem solving plans and rhetorical relations. In Proc. of 1st International Workshop on Human-Computer Conversation, Bellagio, Italy, 1997 (to appear).

A. Goy and L. Lesmo. Una proposta per la rappresentazione semantica lessicale. In Atti dell'Incontro dei Gruppi di Lavoro dell'AI*IA su Apprendimento Automatico e Linguaggio Naturale, pages 91-94, Torino, 1997.

A. Goy and L. Lesmo. Integrating lexical semantics and pragmatics: the case of Italian communication verbs. In Proc. of the 2nd Int. Workshop on Computational Semantics, pages 81-93, Tilburg, 1997.

V. Lombardo and P. Sturt. Incremental processing and infinite local ambiguity. In Proc. 19th Meeting of the Cognitive Science Society, Stanford, 1997.

V. Lombardo and P. Sturt. Towards a convergence between psycholinguistic theories of sentence processing and efficient large scale parsers: a treebank study. In Atti dell'Incontro dei Gruppi di Lavoro dell'AI*IA su Apprendimento Automatico e Linguaggio Naturale, pages 89-93, Torino, 1997.

Software Products

Prototype of a POS (Part of Speech) Tagger for the Italian Language: it takes an ASCII file as input and produces a file in which each word is associated with syntactic information (including its category). It has been tested on over 10,000 manually tagged items.

Research grants

Title of project	Project leader	Funding Organization	Kind of grant
Pianificazione e Riconoscimento di Piani nella Comunicazione	L. Lesmo	CNR	Coordinated Project
Rappresentazione della conoscenza e meccanismi di ragionamento	A. Martelli (National Coordinator), P. Torasso (Local Coordinator)	MURST	ex 40%
Elaborazione Automatica del Linguaggio Naturale	L. Lesmo	Università di Torino	ex 60%
ILEX: Progetto per lo sviluppo di un lessico computazionale per la lingua Italiana.	L. Lesmo	IRST (Istituto per la Ricerca Scientifica e Tecnologica - Trento), Università di Venezia

Activity and role in the scientific community

L. Lesmo is the chairperson of the interest group on Natural Language Processing of AI*IA, the Italian Association for Artificial Intelligence.

Leonardo Lesmo, Liliana Ardissono and Anna Goy were members of the program committee of the AA-LN workshop, held in Torino on December 9th and 10th, 1997.

Liliana Ardissono chaired the NLP session at the 4th Int. Conference on User Modeling, held in Chia (Sardegna) in June 1997.

Anna Goy was a member of the program committee of the WLSS'97 workshop, held in Torino on April 21, 1997.

Oral Presentations in Congresses and Conferences

L. Ardissono, AA-LN workshop, Torino, 9-10 December, 1997.

G. Boella, 4th ModelAge Workshop on Formal Models of Agents, Certosa di Pontignano, January 15-17, 1997.

G. Boella, 5th Congress of the Italian Association for Artificial Intelligence, Rome, Italy, September 17-19, 1997.

C. Barbero, 5th Congress of the Italian Association for Artificial Intelligence, Rome, Italy, September 17-19, 1997.

A. Goy, AA-LN workshop, Torino, 9-10 December, 1997.

L. Lesmo, WLSS'97 First Workshop on Lexical Semantic Systems, Torino, April 21, 1997.

V. Lombardo, XIX Annual Meeting of the Cognitive Science Society, Stanford University (USA), 6-9 August, 1997.

V. Lombardo, Computer Science Department, Penn University, Philadelphia (USA), 20 August, 1997.

V. Lombardo, 3rd Conference on the Architectures and Mechanisms for Language Processing, Edinburgh (UK), 11-13 September, 1997.

V. Lombardo, AA-LN workshop, Torino, 9-10 December, 1997.



	Administrator: wwwadm[at]di.unito.it	Last update: May 17, 2018