Our framework is concerned with information-seeking dialogues in the domain of a CS Department [Ardissono et al. 1993b].
The general knowledge about actions and plans is stored in a plan library
structured on the basis of two main hierarchies: the Decomposition Hierarchy
(DH) and the Generalization Hierarchy (GH) [Kautz and Allen 1986]. The DH
describes how complex actions are executed. In particular, actions
may be elementary or complex and, in the latter case, one or more plans may be
associated with them. The GH makes explicit the
relation between more specific and more general actions and is
used for exploiting inheritance in their definition. Each action is
characterized by its preconditions, constraints, restrictions on the parameters,
postconditions and decomposition. The figure below represents a
portion of the plan library, concerning the PREP-AND-TAKE-CS-EX action. In the
figure, dashed arcs are specialization links between actions, while solid arcs
are used for identifying the preconditions, postconditions, constraints and
decompositions of actions. For the sake of simplicity, the definition of the
elementary actions has been omitted.
Figure: A portion of the Plan Library
In the GH, more specific actions may be related to more general ones by a subsumption relation among the types of their parameters. In the figure, USE-PC and USE-WORKST are more specific than USE-COMPUTERS, because they have an additional restriction on their second parameter: while ``comp'' is restricted to be a computer in USE-COMPUTERS, in the subsumed actions it is respectively a PC or a workstation. Moreover, actions may be more specific than others because they represent specific modalities of execution (corresponding to alternative decompositions). For example, talking to people by phone or face to face are two alternative ways of communicating.
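The representation described above can be sketched in code. The following is a minimal illustration, not the system's actual data structures: the `Action` class, the `SUBTYPES` table and the `subsumes` test are all invented for the example, with only the action names (USE-COMPUTERS, USE-PC, USE-WORKST) taken from the figure.

```python
from dataclasses import dataclass, field

# Hypothetical representation of a plan-library action: each action
# carries preconditions, constraints, restrictions on the parameters,
# postconditions and decompositions (DH), plus GH links to more
# specific actions.
@dataclass
class Action:
    name: str
    parameters: dict                                     # parameter name -> type restriction
    preconditions: list = field(default_factory=list)
    constraints: list = field(default_factory=list)
    postconditions: list = field(default_factory=list)
    decompositions: list = field(default_factory=list)   # DH: alternative plans (step lists)
    generalizes: list = field(default_factory=list)      # GH: more specific actions

use_computers = Action("USE-COMPUTERS", {"agent": "person", "comp": "computer"})
use_pc = Action("USE-PC", {"agent": "person", "comp": "pc"})
use_workst = Action("USE-WORKST", {"agent": "person", "comp": "workstation"})
use_computers.generalizes = [use_pc, use_workst]

# A toy type hierarchy: child type -> parent type.
SUBTYPES = {"pc": "computer", "workstation": "computer"}

def subsumes(general: Action, specific: Action) -> bool:
    """True if every parameter type of `specific` is a subtype of
    (or equal to) the corresponding type restriction in `general`."""
    def is_subtype(t, sup):
        while t is not None:
            if t == sup:
                return True
            t = SUBTYPES.get(t)
        return False
    return all(is_subtype(specific.parameters[p], t)
               for p, t in general.parameters.items())
```

Under this sketch, USE-PC and USE-WORKST are subsumed by USE-COMPUTERS because "pc" and "workstation" are subtypes of "computer", while the reverse subsumption fails.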
We assume that the system has complete knowledge about the actions that may be
performed in the domain and that the users' world models are consistent with the
system's (in general, they may be subsets). In particular, users are not
aware of any actions (or ways of executing actions) other than those represented
in the plan library (the appropriate query assumption,
[Pollack 1990]). So, the system is always able to find some plans associated
with their utterances. However, the system may not be informed
about the truth value of very specific facts (e.g. it may not know whether a
certain book is present in the library): in such cases, it is able to
suggest a plan for retrieving the information.
The interpretation framework is composed of four main modules: NLI, PR, UMC and RP.
Figure: The general architecture
The figure, which represents a schema of the architecture, shows
the information flow across the various modules. The semantic representation
of the input sentence (SR) is used by the PR module as the basis for the
identification of the users' plans (represented as CMs)
and it is also fed to the UMC component, which uses
it to acquire information about users on the basis of what they say.
The interaction between PR and UMC is the core of this paper: the information
in the UM is used (by means of PR requests to UMC) for identifying as
precisely as possible the users' plans. Conversely, UMC analyses the contents
of the active CMs (by means of requests to PR) in order
to make safe inferences in the UM. Finally, RP expands the active
CMs with the information
necessary to reply to the users' utterances with answers suitable to their
competence in the domain. The UM is then updated to take into account the new
supplied knowledge.
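The information flow just described can be sketched as a simple pipeline. Everything below is a schematic stand-in: the module names (NLI, PR, UMC, RP) follow the text, but every function body and signature is invented for illustration.

```python
def nli(utterance):
    # NLI: produce a semantic representation (SR); here just a word set.
    return set(utterance.lower().replace("?", "").split())

def pr(sr, active_cms, user_model):
    # PR: keep the hypotheses whose trigger words occur in the SR.
    # (The real module also consults the UM, via UMC, to rank hypotheses.)
    matching = [cm for cm in active_cms if cm["trigger"] & sr]
    return matching or active_cms

def umc(user_model, sr):
    # UMC: acquire information about the user from what was said.
    user_model["mentioned"] |= sr

def rp(cms, user_model):
    # RP: expand the selected CM into an answer suited to the user's
    # competence in the domain.
    detail = "detailed" if user_model["novice"] else "short"
    return f"{detail} answer about {cms[0]['plan']}"

def interpret(utterance, user_model, active_cms):
    sr = nli(utterance)
    cms = pr(sr, active_cms, user_model)
    umc(user_model, sr)
    answer = rp(cms, user_model)
    user_model["mentioned"] |= set(answer.split())  # record supplied knowledge
    return answer, cms
```

For instance, interpreting "I want to get the authorization to use the workstations" against a CM triggered by "workstations" selects that hypothesis and, for a novice user model, yields the detailed variant of the answer.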
In the following we introduce the PR and UMC modules; we will not
discuss the NLI and RP components, which are outside the scope of
the paper. The plan recognition module is based on the framework proposed in
[Carberry 1988], where CMs represent the system's hypotheses on the user's
plans collected during the dialogue. In her system, the ambiguity among the
hypothesized plans is solved by means of an evaluation of their plausibility
mainly based on the intentions explicitly stated during the dialogue.
However, as we already pointed out, knowledge about users is
very important in the disambiguation of their intentions; so, we use the
information collected in the UM to select the set of most promising hypotheses
built during a dialogue. If, at a certain point, some unsolvable ambiguities
arise that cannot be ignored because they affect the system's reactions to
the user's utterances, we start a clarification dialogue to investigate
the user's real plans.
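The disambiguation policy above can be sketched as follows: UM-based scoring first, then a clarification question only when the surviving ambiguity would change the system's reaction. The scoring criterion and all data structures here are invented for the example; the actual system's plausibility evaluation is richer.

```python
def score(cm, user_model):
    # Toy plausibility measure for a hypothesis: how many of its actions
    # the UM credits the user with knowing.
    return sum(1 for a in cm["actions"] if a in user_model["known_actions"])

def select_hypotheses(cms, user_model):
    # Keep only the most promising hypotheses according to the UM.
    best = max(score(cm, user_model) for cm in cms)
    return [cm for cm in cms if score(cm, user_model) == best]

def react(cms, user_model):
    survivors = select_hypotheses(cms, user_model)
    # The residual ambiguity is relevant only if the surviving plans
    # call for different answers; otherwise it can safely be ignored.
    answers = {cm["answer"] for cm in survivors}
    if len(answers) > 1:
        plans = ", ".join(sorted(cm["plan"] for cm in survivors))
        return f"clarification: do you mean {plans}?"
    return answers.pop()
```

Note how a clarification dialogue is started only as a last resort, in line with the trade-off discussed below between precise plan identification and keeping clarification dialogues short.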
We believe that an important aspect in the generation of the answer is
the trade-off between the need to identify with precision the users' plans
and goals and the requirement of keeping clarification dialogues short
to increase the acceptability of the system. This is why the use of the
information collected by means of user modeling techniques may have a strong
impact on the naturalness of dialogues.
In order to take into account
the fact that the system can provide an answer even in cases where its
knowledge of the domain is limited, we have adopted and refined the
notion of relevance of ambiguities introduced in [van Beek et al. 1993]
(see [Ardissono et al. 1993a]).
Figure: A piece of the Plan Library and an example of Context Model
In the construction of the UM, the system uses stereotypical information and
acquisition rules that infer users' beliefs and knowledge directly from their
utterances. The UM is then expanded by means of inferences on
the beliefs already contained in it.
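The acquisition process can be illustrated with a small sketch: a stereotype seeds the UM, acquisition rules fire on the semantic representations of the user's utterances, and an expansion phase draws inferences from the beliefs already acquired. The stereotypes, rules and belief encoding below are all invented for illustration.

```python
# Hypothetical stereotypes seeding the user model.
STEREOTYPES = {
    "novice": {"knows": set()},
    "cs-student": {"knows": {"USE-PC", "GET-LOGIN-ID"}},
}

# Each acquisition rule: (predicate on the SR, belief inferred from it).
ACQUISITION_RULES = [
    (lambda sr: "login" in sr, "knows:GET-LOGIN-ID"),
    (lambda sr: "workstations" in sr, "wants:USE-WORKST"),
]

def build_um(stereotype, utterances):
    um = {"beliefs": set(), "knows": set(STEREOTYPES[stereotype]["knows"])}
    # Acquire beliefs directly from what the user says.
    for u in utterances:
        sr = set(u.lower().split())
        for condition, belief in ACQUISITION_RULES:
            if condition(sr):
                um["beliefs"].add(belief)
    # Expansion phase: inferences over beliefs already in the UM.
    for b in set(um["beliefs"]):
        if b.startswith("knows:"):
            um["knows"].add(b.split(":", 1)[1])
    return um
```

In this toy version, a user who mentions a login id is credited with knowing the GET-LOGIN-ID action even if the stereotype did not predict it.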
The presence of a rich UM is very important in the resolution of the ambiguities
among the active hypotheses on a user's plans. Moreover, it enhances the
flexibility and the acceptability of the system, because it makes it possible to
build answers at different levels of detail, according to the user's
acquaintance with the domain concepts and actions. For this reason, we have
extended Carberry's
basic plan recognition algorithm so that, during the dialogue, CMs
are completed with information that adapts them to the detail level required
by the user's domain knowledge. In other words, CMs have a double role in
the representation of discourse: while their upper part represents the active
hypotheses on the users' intentions, the lower one, added during the
completion phase, specifies which information must be provided for
answering in a helpful way. Less detailed CMs are
built for expert users, while the actions hypothesized for novices include
more information (e.g. the steps of the plan for
executing a complex action are added to or omitted from a CM, according to
whether the user is assumed to already know how to perform the action or
to need an explanation of it). For example, referring to part (a) of the figure, if
an agent says: ``I want to get the authorization to use the workstations. What
should I do?'', the system identifies the GET-AUT-USE-WORKST action.
According to the user's degree of acquaintance with the domain, two different
answers may be given: the shortest one is to explain that s/he should get
a login id. However, if (on the basis of the previous part of the dialogue) the
UM suggests that the user knows very little about the CS Department, the
procedure for getting the login id could be explained, too. The two
situations correspond to different expansions of the CM representing the
user's plans. Part (b) of the figure shows the CM built in the second situation;
the brackets in it separate the part of the CM representing the
recognized plans from the part added for answering in the appropriate way.
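The completion phase for this example can be sketched as follows. The decomposition table is invented for illustration (only GET-AUT-USE-WORKST and GET-LOGIN-ID appear in the text); the idea is that steps the UM does not credit the user with knowing are expanded further, while known steps are left unexplained.

```python
# Hypothetical decompositions; the lower-level steps are invented.
DECOMPOSITION = {
    "GET-AUT-USE-WORKST": ["GET-LOGIN-ID"],
    "GET-LOGIN-ID": ["FILL-IN-FORM", "GIVE-FORM-TO-SECRETARY"],
}

def complete_cm(recognized_plan, user_model):
    # Upper part: the recognized plan (the active hypothesis).
    # Lower part: steps added so that the answer matches the user's
    # degree of acquaintance with the domain.
    cm = {"upper": [recognized_plan], "lower": []}
    frontier = list(DECOMPOSITION.get(recognized_plan, []))
    while frontier:
        step = frontier.pop(0)
        cm["lower"].append(step)
        if step not in user_model["known_actions"]:
            # The user presumably needs an explanation: expand further.
            frontier = DECOMPOSITION.get(step, []) + frontier
    return cm
```

For a user who already knows GET-LOGIN-ID, the lower part stops at that step (the short answer); for a user who knows very little about the CS Department, the procedure for getting the login id is unfolded as well.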