-
Introduction
This chapter covers the level of dialogue acts and its markup.
Dialogue acts are also referred to as moves or illocutionary acts. They
mark important characteristics of utterances, indicate the role or intention
of an utterance in a specific dialogue, and make relationships between
utterances more obvious.
The base units of dialogue acts are utterances or segments, which are treated
as sequences of words. Utterances and segments are subcomponents of turns,
the speaking units of the dialogue partners.
Dialogue act annotation is used to train and test NLP systems. Statistical
predictions about follow-up dialogue acts can be derived from previous
dialogue act annotations and from pattern recognition, which improves the
performance of recognisers. Furthermore, dialogue act language
models serve to study intonation and to analyse referring expressions.
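As a minimal sketch of how follow-up dialogue acts might be predicted from previous annotations, the Python fragment below counts which act follows which (a bigram model) and returns the most frequent successor. The tag names and the example dialogues are hypothetical and do not come from any of the schemes discussed here; real systems would typically use longer histories and smoothing.

```python
from collections import Counter, defaultdict


def train_bigram_model(dialogues):
    """Count how often each dialogue act is followed by each other act."""
    counts = defaultdict(Counter)
    for acts in dialogues:
        for prev_act, next_act in zip(acts, acts[1:]):
            counts[prev_act][next_act] += 1
    return counts


def predict_next(counts, prev_act):
    """Return (most frequent follow-up act, its relative frequency).

    Returns None if prev_act was never seen with a successor.
    """
    followers = counts.get(prev_act)
    if not followers:
        return None
    act, n = followers.most_common(1)[0]
    return act, n / sum(followers.values())


# Hypothetical tag sequences, one list per annotated dialogue.
annotated = [
    ["greet", "question", "answer", "thank"],
    ["greet", "question", "answer", "question", "answer"],
    ["question", "clarify", "answer"],
]
model = train_bigram_model(annotated)
prediction = predict_next(model, "question")
```

Here "question" is followed by "answer" three times and by "clarify" once, so the prediction is "answer" with a relative frequency of 0.75.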
In the following we summarise the results of the scheme comparison in D1.1
to reflect the state of the art on this level. Based on this information
we suggest best practice methods for scheme design which fulfil the requirements
derived from the comparison. Finally, we present an example markup and the
further schemes that are supported.
-
Existing Schemes
In D1.1 we compared 16 different
schemes, developed in the UK, Sweden, the US, Japan, the Netherlands and
Germany. Not all of these schemes can be annotated reliably or are suitable
for reuse. Selection criteria are therefore needed to determine which
schemes are appropriate:
Criteria (short version of D1.1):
1. Existence of coding book.
2. Number of annotators.
3. Level of annotator's expertise.
4. Number of dialogues annotated.
5. Evaluation of scheme.
6. Underlying task.
7. Languages applied to.
8. Used in NLP systems.
The table below gives a brief overview of the comparison; columns 1 to 8
correspond, in order, to the criteria listed above. More details
can be found in D1.1.
| Scheme / Criterion | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| Alparon | + | 3 | experts | 500 d. | 77% agreem. | DES | D | + |
| Chat | + | huge | experts | 160 MB | - | - | many | - |
| Chiba | + | 10 | experts | 22 d. | 0.57<a<0.68 | DIR,BA,TR | J | ? |
| Coconut | + | 2 | experts | 16 d. | + | FUR | E | + |
| Condon & Cech's | + | 5 | fairly exp. | 88 d. | 91% agreem. | TS | E | + |
| C-Star | + | 5 | experts | 230 d. | - | TR | E, J, K, I | + |
| DAMSL | + | 4 | experts | 18 d. | k = 0.56 | - | E | + |
| Flammia's | + | 7 | trained | 25 d. | k >= 0.6 | DES | E | ? |
| Janus | + | 4 | experts | many | 89% agreem. | BA | E | + |
| Linlin | + | 4 | experts | 140 d. | 97% agreem. | TR,TS | S | + |
| Map Task | + | 6 | experts | 128 d. | k = 0.83 | DIR | E | + |
| Nakatani's | + | 6 | naive | 72 d. | - | INSTR | E | + |
| SLSA | + | 7 | experts | 100 d. | + | COU | S | + |
| SWBD-DAMSL | + | 9 | experts | 1155 d. | 0.8<k<0.84 | - | E | + |
| Traum's | + | 3 | experts | 36 d. | + | - | E | + |
| Verbmobil | + | 3 | naive | 1172 d. | k = 0.84 | BA | E, J, G | + |
Overview of Scheme Comparison
Abbreviations:
| BA | business appointment |
| COU | courtroom interaction |
| DES | directory enquiry service |
| DIR | giving direction |
| FUR | furnishing rooms interaction |
| INSTR | giving instruction (e.g. about cooking) |
| TS | transport |
| TR | travel |

| D | Dutch |
| E | English |
| G | German |
| I | Italian |
| J | Japanese |
| K | Korean |
| S | Swedish |
The results of the comparison are:
-
Coding books are provided for all schemes so that they can be used by other
coders.
-
All schemes seem to be applicable as they were applied to corpora of reasonable
size by several annotators. Additionally, most of the schemes are used
in NLP systems.
-
Most schemes seem to be difficult to use, since their annotators were
either trained or experts.
-
Schemes which report inter-coder agreement show intermediate to good results
and can be considered reliable.
-
Most of the schemes are hard to reuse in a different context because they
are domain/task or language dependent.
The last item of the list is crucial. At the moment it is not possible
to come up with a general list of phenomena for the level of dialogue acts
which would suit everybody's requirements. Which phenomena a scheme models
depends on the research interests of the scheme developer, and these vary
to a great extent.
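The agreement figures in the comparison table are given either as percentage agreement or as a kappa value. As an illustration only, the sketch below computes both for two coders who tagged the same segments. It uses Cohen's kappa, which is an assumption: the schemes compared in D1.1 may have used a different kappa variant or more than two coders. The tag names and codings are hypothetical.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Fraction of segments on which both coders chose the same tag."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("codings must be non-empty and of equal length")
    same = sum(a == b for a, b in zip(coder_a, coder_b))
    return same / len(coder_a)


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    p_observed = percent_agreement(coder_a, coder_b)
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability that both coders pick the same tag
    # if each coder tags independently according to their own tag frequencies.
    p_expected = sum(freq_a[t] * freq_b[t] for t in freq_a) / (n * n)
    if p_expected == 1.0:
        # Both coders used one and the same tag throughout.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical codings of ten segments (q = question, a = answer, s = statement).
coder_a = ["q", "q", "q", "q", "a", "a", "a", "a", "s", "s"]
coder_b = ["q", "q", "q", "a", "a", "a", "a", "s", "s", "s"]
agreement = percent_agreement(coder_a, coder_b)
kappa = cohen_kappa(coder_a, coder_b)
```

In this example the coders agree on 8 of 10 segments (80%), while kappa is lower (about 0.70) because part of that agreement would be expected by chance.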
-
Best Practice for Dialogue Act Scheme Design
Instead of proposing one definitive scheme for the level of dialogue acts,
we point scheme developers to a best practice method
as considered by the Discourse Resource Initiative (DRI):
1. Make a list of all phenomena you are interested in and assign a
characteristic set of tags to each phenomenon.
2. Use this multi-dimensional scheme yourself: apply it to some training
corpora and test its reliability.
3. Flatten the multi-dimensional scheme to a single-dimensional one by merging
tags which always occur together and deleting those which were never used.
Avoid extremely small categories, as coders tend to overuse
them, and use natural, easily observable distinctions between tags that are
easy to remember.
4. Provide a coding book for the single-dimensional scheme, including tag
set definition, decision tree and example annotations. Further information
on how to write coding books can be found here.
5. Develop a mapping mechanism to convert multi-dimensional annotation to
single-dimensional annotation and the other way around.
6. Check the coding at random with reliability tests and improve your scheme(s)
if necessary.
7. Document all steps!
The advantage of this method is that the multi-dimensional scheme
supports reusability because it models the different phenomena best and
is thus easier for other scheme developers to understand. The single-dimensional
scheme, on the other hand, is easier to apply, which speeds up the annotation
process and hence makes annotation less expensive. For these reasons the
flattened scheme is much more appropriate for mass data annotation.
Of course, a flattened scheme is not strictly required: if coders
are happy with multi-dimensional coding, the time-consuming process of
flattening (step 3) can be omitted.
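The flattening and the two-way mapping described above can be sketched roughly as follows. This is a simplified illustration, not the DRI procedure itself: every combination of tags that actually occurs in the training annotations becomes one flat tag, so tags that always occur together are merged and unused combinations get no flat tag. The dimension names are taken from DAMSL; the tag values, the "+" naming convention and all function names are hypothetical.

```python
DIMENSIONS = (
    "information-level",
    "forward-looking-function",
    "backward-looking-function",
)


def combo_of(annotation):
    """Tags of one segment as a tuple ordered by dimension (None if absent)."""
    return tuple(annotation.get(dim) for dim in DIMENSIONS)


def build_mappings(annotations):
    """Derive flat tags from the tag combinations observed in the corpus."""
    to_flat = {}
    for annotation in annotations:
        combo = combo_of(annotation)
        if combo not in to_flat:
            parts = [tag for tag in combo if tag is not None]
            to_flat[combo] = "+".join(parts).upper() or "NONE"
    to_multi = {flat: combo for combo, flat in to_flat.items()}
    if len(to_multi) != len(to_flat):
        raise ValueError("two tag combinations collapsed to the same flat tag")
    return to_flat, to_multi


def flatten(annotation, to_flat):
    """Multi-dimensional annotation -> single flat tag."""
    return to_flat[combo_of(annotation)]


def unflatten(flat_tag, to_multi):
    """Single flat tag -> multi-dimensional annotation."""
    combo = to_multi[flat_tag]
    return {dim: tag for dim, tag in zip(DIMENSIONS, combo) if tag is not None}


# Hypothetical multi-dimensional annotations of three segments.
corpus = [
    {"information-level": "task", "forward-looking-function": "info-request"},
    {"information-level": "task", "forward-looking-function": "assert",
     "backward-looking-function": "answer"},
    {"information-level": "task", "forward-looking-function": "info-request"},
]
to_flat, to_multi = build_mappings(corpus)
flat_tags = [flatten(a, to_flat) for a in corpus]
```

Because both directions are derived from the same table, converting an annotation to its flat tag and back returns the original annotation. The mapping stays exact only as long as no tags are added to one scheme without a counterpart in the other.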
-
Markup
The approach in MATE is to reuse the DAMSL scheme as an example of
a multi-dimensional scheme and a variant of SWBD-DAMSL as its example flattened
counterpart. SWBD-DAMSL was derived from the original DAMSL scheme using
the techniques described above. Unfortunately some additional tags were
added, so that an exact mapping from one scheme to the other is no longer
possible. For this reason the MATE SWBD-DAMSL variant omits these additional
tags. The following graphic shows the relation between the tag sets, where
DAMSL tags are written in lower case and SWBD-DAMSL tags in
capital letters:
-
communicative-status
-
information-level
-
forward-looking-function
-
backward-looking-function

For descriptions of the tag sets, see the
coding books of DAMSL and SWBD-DAMSL.
All dialogue act tags are XML elements and are children of a basic dialogue
act unit, called segment. Such a unit is the amount of material to which
one dialogue act / illocutionary function can be attributed. Each element has
a unique identifier (id) as an attribute, and segment elements additionally
carry a reference (href) to the word level, the base level for dialogue
acts.
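To make this structure concrete, the following sketch builds one segment element with Python's standard `xml.etree.ElementTree` module. Only the element name `segment` and the attribute names `id` and `href` come from the description above. The child element name, the id values and the form of the `href` value are assumptions made for illustration and are not taken from the MATE DTDs.

```python
import xml.etree.ElementTree as ET


def make_segment(segment_id, word_href, act_tags):
    """Build a segment element whose children are dialogue act tags.

    Every element gets a unique id; only the segment carries an href
    pointing to the word level.
    """
    segment = ET.Element("segment", {"id": segment_id, "href": word_href})
    for index, tag in enumerate(act_tags, start=1):
        # Child ids are derived from the segment id (hypothetical convention).
        ET.SubElement(segment, tag, {"id": f"{segment_id}_{index}"})
    return segment


# Hypothetical segment covering words w1 to w4 of a word-level file.
seg = make_segment("s1", "words.xml#id(w1)..id(w4)", ["info-request"])
xml_text = ET.tostring(seg, encoding="unicode")
```

Serialising `seg` yields a `segment` element with the attributes `id` and `href` and a single empty child element that carries its own `id`.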
The mapping mechanism from the internal DAMSL scheme to its surface
counterpart, a variant of the SWBD-DAMSL scheme, is realised using MATE
stylesheets. Mapping can be performed in both directions. The relevant
stylesheets are presented in the appendix (internal structure to surface
structure stylesheet, surface structure to internal structure stylesheet).
-
Other Supported Schemes
Other schemes which will be supported in the MATE workbench are the
Map Task and the Verbmobil schemes. Both are well documented and have
been tested successfully for reliability. They are among the best-known schemes
and serve in the MATE workbench as examples of successfully used, task-oriented
and domain-dependent schemes. Details about the tag sets of both schemes
can be found in their coding books (Map Task coding book, Verbmobil coding
book) and in their coding modules (Map Task coding module, Verbmobil coding
module).
-
Summary
Drawing on the wide variety of schemes compared in D1.1, we presented
a method of coding scheme design that supports reusability of schemes and
also leads to reliable mass data annotation. We used DAMSL and a variant
of the SWBD-DAMSL scheme to exemplify our proposal. DTDs, example annotations,
and conversion stylesheets in the appendix supplement our approach. As
additional examples, the Map Task and Verbmobil schemes are implemented
in the workbench.