
The speech act recognition process

Communicative actions should be interpreted at three levels: the phatic level, which concerns the understanding of the individual words uttered by the speaker; the locutionary level, which concerns the comprehension of the meaning of the utterances; and the illocutionary level, which concerns the interpretation of the sentences as speech acts. We are not concerned with the phatic level; in our framework, the locutionary and illocutionary levels correspond to different phases of the analysis of the input sentences. First, an NL interpreter (Ardissono et al., 1991; Di Eugenio & Lesmo, 1987; Lesmo & Terenziani, 1988) carries out the syntactic-semantic analysis and produces the semantic representation (in the formalism of semantic nets). Then the speech act is identified (this is the main topic of our paper). Finally, the domain-related processing connects the sentence to the previous ones in a single picture of the overall domain plans and goals of the speaker (see Figure 3). These plans are represented by hierarchical structures based on the domain level of the plan library and are obtained by applying heuristic rules for action identification and focusing; these rules take contextual information into account to build coherent hypotheses about the speaker's goals and plans (Ardissono et al., 1993; Carberry, 1988).

The input to the second phase (see Figure 3) is a semantic representation of the input sentence, with contextual (e.g. anaphoric) references already resolved; its output is the recognized speech act, i.e. one of the roots of the hierarchies depicted in the figures. As a side effect of this second step, all "politeness indicators" are identified, so that only the "pure" propositional content of the input sentence is passed to the third step. At the same time, a degree of politeness is evaluated. The goal of this section is to describe how the second step extracts the politeness indicators; nothing will be said about the evaluation of the degree of politeness, which is currently obtained via a few simple heuristic rules that are not yet well developed.
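The interface of the second step can be pictured with a small Python sketch. The names `SpeechActResult` and `input_for_domain_analysis` are hypothetical and serve only to illustrate the description above; they are not taken from the system itself, and the degree of politeness is left unset because its evaluation is not described here.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class SpeechActResult:
    """Hypothetical container for the output of the second step."""
    speech_act: str                      # a root of a hierarchy, e.g. "request"
    content: Any                         # propositional content, politeness markers removed
    politeness_indicators: List[str] = field(default_factory=list)
    politeness_degree: Optional[float] = None  # heuristic evaluation, not modelled here


def input_for_domain_analysis(result: SpeechActResult) -> Any:
    """Return what the third step receives: the act and its pure content."""
    return (result.speech_act, result.content)
```

The politeness indicators stay in the result as a side product, while the third step sees only the recognized act and the stripped content.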

Figure 3: Schema of the interpretation process

The basic claim is that the whole process is governed by standard plan management procedures, namely the same procedures used in the third step for the well-known domain-dependent analysis of the user's plans and goals.

First, the semantic representation undergoes an action-identification phase. Since the interpreter is operating at the locutionary level, this phase does not return the main action involved in the input (as expressed by the main verb), but the surface speech act type (e.g. surface-yn-question). This seems reasonable since, at this level, the term 'act' must refer to locutionary acts. The surface type is used as an entry point into the hierarchy, since it must match one of the leaves. Then, starting from the leaf found, an upward-expansion procedure is applied. Again, this is the same procedure used within the focusing phase of the domain-level analysis (Ardissono et al., 1993; Carberry, 1990). Upward-expansion climbs the hierarchy along all possible paths, which can lead to ambiguities.
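The climb along all possible paths can be sketched as follows. This is a minimal Python illustration and not the system's implementation: the `PARENTS` table is a toy fragment invented for the example (the actual hierarchies are those shown in the figures), and the conditions attached to the nodes are ignored.

```python
# Hypothetical fragment of a speech-act hierarchy: node -> list of parent nodes.
# A node with no parents is treated as a root.
PARENTS = {
    "surface-yn-question": ["ask-if"],
    "ask-if": ["obtain-info", "request"],
    "obtain-info": [],
    "request": [],
}


def upward_expansion(leaf, parents=PARENTS):
    """Return every path from the given leaf up to a root."""
    complete = []
    pending = [[leaf]]
    while pending:
        path = pending.pop()
        ups = parents.get(path[-1], [])
        if not ups:
            complete.append(path)          # reached a root: one hypothesis
        for parent in ups:
            pending.append(path + [parent])  # follow every possible parent
    return complete
```

In this toy fragment, a surface yes/no question yields two complete paths, one ending in request and one ending in obtain-info, which is the kind of ambiguity the procedure can produce.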

The key point is the treatment of the wh conditions appearing in the nodes of the hierarchy. Most of them refer to standard tests, but two types deserve attention. The first is the check feature(sem). These tests are encoded in a very compact way in the figures; what actually happens is that each of them calls for the inspection of the current top-most node of the semantic representation. If the features mentioned in the test are found, the node is discarded (f-cancel) and the 'main' substructure remains as sem. With modal verbs, for instance, the main substructure is the one referring to the 'object' of the proposition: given a sentence like "May I ask you to ..." and its semantic representation "May(User, ask(User, System, ...))" (sem1), after a can1 test on the formula the remaining part is "ask(User, System, ...)", which corresponds to "User asks System to ..." (sem). So, as the hierarchy is climbed, the politeness markers disappear and, when one of the roots is reached, what remains is the propositional nucleus of the input sentence. The complete process may require that a root be reached more than once. In fact, the process stops only when a root has been reached and no further climbing is possible, but for nested levels of indirectness the root can be used as a new entry point into the hierarchy (see the bottom ask-if node in the figures). The process can also fail, namely when a non-root node has no parent whose wh conditions are met; in this case, the hope is that alternative paths remain open.
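The effect of a feature test followed by f-cancel on the "May I ask you to ..." example can be illustrated with a short Python sketch. It rests on several assumptions that are not part of the formalism: semantic representations are encoded as nested tuples, the set `POLITENESS_FUNCTORS` is invented for the example, and a simple loop stands in for the repeated tests met while climbing the hierarchy.

```python
# Hypothetical markers whose presence at the top of the representation
# counts as a politeness feature.
POLITENESS_FUNCTORS = {"May", "Can", "Could"}


def f_cancel(sem):
    """Inspect the top-most node; if it is a politeness marker, discard it.

    Returns (remaining representation, marker found or None).
    The 'main' substructure is assumed to be the last argument.
    """
    functor, *args = sem
    if functor in POLITENESS_FUNCTORS and args and isinstance(args[-1], tuple):
        return args[-1], functor
    return sem, None


def strip_politeness(sem):
    """Apply f_cancel until the top-most node carries no politeness feature."""
    markers = []
    while True:
        sem, marker = f_cancel(sem)
        if marker is None:
            return sem, markers
        markers.append(marker)


sem1 = ("May", "User", ("ask", "User", "System", "..."))
nucleus, markers = strip_politeness(sem1)
```

Here `nucleus` is the tuple for "ask(User, System, ...)" and `markers` records the discarded modal, mirroring the way the politeness indicators are collected as a side effect.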

Note, however, that more than one primary illocutionary act may be identified for a given speech act, so the upward activation of the actions in the speech-act hierarchy may generate alternative hypotheses. For example, sentence 1b can be interpreted either as a request to have the keys (the indirect interpretation) or as an attempt to obtain information about the capabilities of the hearer. The two interpretations correspond, respectively, to the activation of the request and obtain-info actions while moving upward in the speech-act hierarchy.

The second special test concerns the act-id predicate (see, for instance, the on-record-req node in Fig 1). This test prepares the ground for the third step (domain-level analysis). As stated above, the output of the speech-act analysis is the recognized speech act. However, some speech acts refer to an actual domain action; for instance, a request expresses the intention that the hearer do something, and that something is a domain action that must be encoded within the request (note that this is not the case for obtain-info). The speech-act hierarchy specifies this "type coercion" between levels: a surface imperative takes a semantic representation as its argument, while a request takes the corresponding domain action. Procedurally, this means that the usual action-identification procedure is executed, so that its role in the overall processing is made explicit in the hierarchy.
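The coercion performed by act-id can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the `DOMAIN_ACTIONS` table, the action name `give-object`, and the tuple encoding of semantic representations are all hypothetical, and the real action-identification procedure relies on the plan library rather than on a lookup table.

```python
# Hypothetical mapping from semantic predicates to domain-level actions.
DOMAIN_ACTIONS = {"give": "give-object"}


def act_id(sem):
    """Toy action identification: semantic representation -> domain action.

    Returns None when no domain action matches.
    """
    functor, *args = sem
    name = DOMAIN_ACTIONS.get(functor)
    if name is None:
        return None
    return (name, *args)


def climb_to_request(surface_argument):
    """Build a request whose argument is a domain action, not a semantic term."""
    action = act_id(surface_argument)
    if action is None:
        return None  # the act-id test fails, so this path is closed
    return ("request", action)
```

For example, `climb_to_request(("give", "System", "User", "keys"))` wraps the identified domain action in a request, whereas a representation with no matching domain action makes the test fail.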


Guido Boella Dottorando
Thu Oct 31 15:35:12 MET 1996