
Dialogue Act Markup


Marion Klein (DFKI)


Content

  1. Introduction
  2. Existing Schemes
  3. Best Practice for Dialogue Act Scheme Design
  4. Markup
  5. Other Supported Schemes
  6. Summary

  7. Appendix



  1. Introduction

    The scope of this chapter is the level of dialogue acts and its markup. Dialogue acts are also referred to as moves or illocutionary acts. They mark important characteristics of utterances, indicate the role or intention of an utterance in a specific dialogue, and make relationships between utterances more obvious.

    The base units of dialogue acts are utterances or segments, which are treated as sequences of words. Utterances and segments are subcomponents of turns, the speaking units of the dialogue partners.

    Dialogue act annotation is used for training and testing NLP systems. From previous dialogue act annotations and pattern recognition, one can derive statistical predictions about follow-up dialogue acts and use them to improve the performance of recognisers. Furthermore, dialogue act language models serve to study intonation and to analyse referring expressions.
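
    The idea of predicting follow-up dialogue acts from earlier annotations can be illustrated with a simple bigram model. This is only a minimal sketch: the tag names and the two example dialogues are hypothetical, and real systems typically use longer histories and smoothing.

```python
from collections import Counter, defaultdict


def train_bigram_model(dialogues):
    """Count how often each dialogue act is followed by each other act."""
    counts = defaultdict(Counter)
    for acts in dialogues:
        for prev_act, next_act in zip(acts, acts[1:]):
            counts[prev_act][next_act] += 1
    return counts


def predict_next(counts, prev_act):
    """Return (most frequent follow-up act, relative frequency), or None if unseen."""
    followers = counts.get(prev_act)
    if not followers:
        return None
    act, n = followers.most_common(1)[0]
    return act, n / sum(followers.values())


# Hypothetical annotated dialogues: one list of dialogue act tags per dialogue.
annotated = [
    ["greet", "greet", "question", "answer", "thank", "bye"],
    ["greet", "question", "answer", "question", "answer", "bye"],
]

model = train_bigram_model(annotated)
prediction = predict_next(model, "question")
```

    In this toy data every "question" is followed by an "answer", so a recogniser could use the prediction to narrow down the hypotheses for the next utterance.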

    In the following we summarize the results of the scheme comparison in D1.1 to reflect the state of the art on this level. Based on this information we suggest best practice methods for scheme design which fulfil the requirements derived from the comparison. Finally, an example markup and further supported schemes are presented.
     
     

  2. Existing Schemes

    In D1.1 we compared 16 different schemes, developed in the UK, Sweden, the US, Japan, the Netherlands and Germany. Not all of these schemes can be annotated reliably, and not all are suitable for reuse. Therefore some selection criteria are needed to find out which schemes are appropriate:

    Criteria (short version of D1.1):

    1. Existence of coding book.
    2. Number of annotators.
    3. Level of annotators' expertise.
    4. Number of dialogues annotated.
    5. Evaluation of scheme.
    6. Underlying task.
    7. Languages applied to.
    8. Used in NLP systems.


    The table below gives a brief overview of the comparison. More details can be found here.



     
    Columns 1 to 8 correspond to the criteria listed above.

    Scheme           1  2     3            4        5            6          7           8
    Alparon          +  3     experts      500 d.   77% agreem.  DES        D           +
    Chat             +  huge  experts      160 MB   -            -          many        -
    Chiba            +  10    experts      22 d.    0.57<a<0.68  DIR,BA,TR  J           ?
    Coconut          +  2     experts      16 d.    +            FUR        E           +
    Condon & Cech's  +  5     fairly exp.  88 d.    91% agreem.  TS         E           +
    C-Star           +  5     experts      230 d.   -            TR         E, J, K, I  +
    DAMSL            +  4     experts      18 d.    k = 0.56     -          E           +
    Flammia's        +  7     trained      25 d.    k >= 0.6     DES        E           ?
    Janus            +  4     experts      many     89% agreem.  BA         E           +
    Linlin           +  4     experts      140 d.   97% agreem.  TR,TS      S           +
    Map Task         +  6     experts      128 d.   k = 0.83     DIR        E           +
    Nakatani's       +  6     naive        72 d.    -            INSTR      E           +
    SLSA             +  7     experts      100 d.   +            COU        S           +
    SWBD-DAMSL       +  9     experts      1155 d.  0.8<k<0.84   -          E           +
    Traum's          +  3     experts      36 d.    +            -          E           +
    Verbmobil        +  3     naive        1172 d.  k = 0.84     BA         E, J, G     +

    Overview of Scheme Comparison
    Abbreviations:
     
    BA  business appointment
    COU courtroom interaction
    DES directory enquiry service
    DIR giving direction
    FUR furnishing rooms interaction
    INSTR giving instruction (e.g. about cooking)
    TS transport
    TR travel
    D Dutch
    E English
    G German
    I Italian
    J Japanese
    K Korean
    S Swedish
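
    Several entries for criterion 5 report a value k. Assuming that k denotes the kappa statistic, which corrects raw agreement for agreement expected by chance, the following sketch shows how such a value could be computed for two coders. It uses Cohen's two-coder variant purely for illustration; the schemes in the table may have been evaluated with other variants, and the tags and annotations below are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders tagging the same segments."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must tag the same non-empty list of segments")
    n = len(coder_a)
    # Observed agreement: share of segments where both coders chose the same tag.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement: chance that both pick the same tag, given each
    # coder's own tag distribution.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[tag] * freq_b[tag] for tag in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical tag throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of eight segments by two coders.
coder_a = ["q", "a", "q", "a", "s", "s", "q", "a"]
coder_b = ["q", "a", "q", "s", "s", "s", "q", "a"]
kappa = cohens_kappa(coder_a, coder_b)
```

    Here the coders agree on 7 of 8 segments (87.5%), but kappa is lower, roughly 0.81, because part of that agreement would be expected by chance. This is why percentage agreement and kappa values in the table are not directly comparable.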

    The results of the comparison are:


    The last item of the list is crucial. At the moment it is not possible to come up with a general list of phenomena for the level of dialogue acts which would suit everybody's requirements. Which phenomena a scheme models depends on the research interests of its developer, and these interests vary to a great extent.
     

  3. Best Practice for Dialogue Act Scheme Design

    Instead of proposing one definitive scheme for the level of dialogue acts, we would rather point scheme developers to a best practice method as considered by the Discourse Resource Initiative (DRI):
     
    1. Make a list of all phenomena you are interested in and assign a characteristic set of tags to each phenomenon.
    2. Use this multi-dimensional scheme by yourself, apply it to some training corpora and test reliability.
    3. Flatten the multi-dimensional scheme to a single-dimensional one by merging tags which always occur together and deleting those which were never used. Avoid extremely small categories, as coders tend to overuse them, and use natural, easily observable distinctions between tags that are easy to remember.
    4. Provide a coding book for the single-dimensional scheme, including tag set definition, decision tree and example annotations. Further information on how to write coding books can be found here.
    5. Develop a mapping mechanism to convert multi-dimensional annotation to single-dimensional annotation and the other way around.
    6. Randomly check the coding by reliability tests and improve your scheme(s), if necessary.
    7. Document all steps!
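
    Steps 3 and 5 can be sketched in code. The example below is a minimal illustration under stated assumptions, not the MATE procedure: the dimension names, the tags, and the naming convention for flat tags are all hypothetical. Every combination of tags that actually occurs in the training annotation becomes one flat tag, unused combinations never enter the flat scheme, and the resulting table is inverted to map back.

```python
# Hypothetical dimensions of a multi-dimensional scheme; "" means "no tag".
DIMENSIONS = ("forward", "backward")


def build_mappings(annotations):
    """Derive a flat tag for each observed tag combination, plus the inverse map."""
    combos = sorted({tuple(a[d] for d in DIMENSIONS) for a in annotations})
    # Hypothetical naming convention: join the non-empty tags in upper case.
    to_flat = {c: "_".join(t for t in c if t).upper() or "OTHER" for c in combos}
    to_multi = {flat: combo for combo, flat in to_flat.items()}
    if len(to_multi) != len(to_flat):
        raise ValueError("flat tag names collide; mapping would not be reversible")
    return to_flat, to_multi


def flatten(annotation, to_flat):
    """Multi-dimensional annotation -> single flat tag."""
    return to_flat[tuple(annotation[d] for d in DIMENSIONS)]


def unflatten(flat_tag, to_multi):
    """Single flat tag -> multi-dimensional annotation."""
    return dict(zip(DIMENSIONS, to_multi[flat_tag]))


# Hypothetical training annotation, one dict per segment.
training = [
    {"forward": "info-request", "backward": ""},
    {"forward": "assert", "backward": "answer"},
    {"forward": "", "backward": "accept"},
    {"forward": "assert", "backward": "answer"},
]

to_flat, to_multi = build_mappings(training)
flat_tags = [flatten(a, to_flat) for a in training]
round_trip = [unflatten(t, to_multi) for t in flat_tags]
```

    The explicit collision check reflects the requirement that the mapping work in both directions: as soon as two different combinations would receive the same flat tag, the conversion back to the multi-dimensional scheme is no longer exact.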


    The advantage of this method is that the multi-dimensional scheme supports reusability: it models the different phenomena best and is therefore easier for other scheme developers to understand. The single-dimensional scheme, on the other hand, is easier to apply, which speeds up the annotation process and hence makes annotation less expensive. For these reasons the flattened scheme is much more appropriate for mass data annotation.
    Of course, a flattened scheme is not necessarily required. If coders are happy with multi-dimensional coding, the time-consuming process of flattening (step 3) can be omitted.
     

  4. Markup

    The approach in MATE is to reuse the DAMSL scheme as an example of a multi-dimensional scheme and a variant of SWBD-DAMSL as its flattened counterpart. SWBD-DAMSL was derived from the original DAMSL scheme using the techniques described above. Unfortunately, some additional tags were added, so that an exact mapping from one scheme to the other is no longer possible. For this reason the MATE SWBD-DAMSL variant omits these additional tags. The following graphic shows the relation between the tag sets, where DAMSL tags are set in lower case and SWBD-DAMSL tags in capital letters:
     


    For a description of the tag sets we refer to the coding books of DAMSL and SWBD-DAMSL.

    All dialogue act tags are XML elements and are children of a basic dialogue act unit, called segment. Such a unit is the amount of material to which one dialogue act / illocutionary function can be attributed. Each element has a unique identifier (id) as an attribute; segment elements additionally carry a reference (href) to the word level, the base level for dialogue acts.
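
    To make this structure concrete, the sketch below parses a small annotation fragment and lists the dialogue act tags of each segment. Only the segment element and the id and href attributes are taken from the description above; the enclosing dialogue element, the tag element names, and the href syntax are hypothetical placeholders and should not be read as the actual MATE DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names other than "segment" and the href
# syntax are illustrative assumptions.
SAMPLE = """
<dialogue>
  <segment id="s1" href="words.xml#w1..w4">
    <info-request id="da1"/>
  </segment>
  <segment id="s2" href="words.xml#w5..w6">
    <assert id="da2"/>
    <answer id="da3"/>
  </segment>
</dialogue>
"""


def read_segments(xml_text):
    """Return one record per segment with its id, word-level href and act tags."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for segment in root.iter("segment"):
        # Each child element of a segment is one dialogue act tag.
        acts = [(child.tag, child.get("id")) for child in segment]
        records.append(
            {"id": segment.get("id"), "href": segment.get("href"), "acts": acts}
        )
    return records


segments = read_segments(SAMPLE)
```

    The second segment carries two tags at once, which is how a multi-dimensional scheme can be represented: one child element per dimension that applies.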

    The mapping mechanism from the internal DAMSL scheme to its surface counterpart, a variant of the SWBD-DAMSL scheme, is realised using MATE style sheets. Mapping can be performed in both directions. The relevant stylesheets are presented in the appendix (internal structure to surface structure stylesheet, surface structure to internal structure stylesheet).

  5. Other Supported Schemes

    Other schemes which will be supported in the MATE workbench are the Map Task and the Verbmobil scheme. Both schemes are well documented and have been successfully tested for reliability. They are among the best-known schemes and serve in the MATE workbench as successfully used, task-oriented and domain-dependent example schemes. Details about the tag sets of both schemes can be found in their coding books (Map Task coding book, Verbmobil coding book) and in their coding modules (Map Task coding module, Verbmobil coding module).
     
  6. Summary

    Based on the great variety of schemes we compared in D1.1, we presented a method of coding scheme design that supports reusability of schemes and also leads to reliable mass data annotation. We used DAMSL and a variant of the SWBD-DAMSL scheme to exemplify our proposal. DTDs, example annotations, and conversion stylesheets in the appendix supplement our approach. As additional examples, the Map Task and Verbmobil schemes are implemented in the workbench.
Last Modification: 21.4.1999 by Marion Klein