Communication Problems Coding Module


Guideline Coding Module

Name: Guidelines.

Coding purpose: Records the different generic and specific guidelines, the violation of which typically leads to communication problems in a dialogue.

Coding level: Communication problems.

Data sources: List of generic and specific guidelines for co-operative dialogue design.

Module references: None.

Markup declaration:

ELEMENT aspect

ELEMENT guideline
ATTRIBUTES
  aspect: REFERENCE(this, aspect)
  gricean: ENUM (yes|no)
  subsumed_by: REFERENCE(this, guideline)
  abbreviation: TEXT

Description: Two elements are used to annotate the guidelines.  The first, aspect, indicates a grouping of the guidelines.  For example, the 24 guidelines in Figure 1 are divided into seven groups or aspects.  The element aspect has no explicit attributes.  The mandatory attribute id, which is a unique identifier, is always generated automatically for all elements.

The second element, guideline, marks up a particular guideline.  It has four attributes.

aspect is mandatory.  It is a reference to the aspect to which the guideline belongs.  The aspect indicated for a specific guideline must always equal the aspect indicated for the generic guideline by which it is subsumed.

gricean is mandatory for guidelines which are the same as Grice's maxims [Grice 1975]; for these the value yes must be used.  For guidelines which are not maxims gricean is optional; if it is indicated, the value no must be chosen.

subsumed_by should always be used for a specific guideline to indicate the generic guideline by which it is subsumed.  subsumed_by cannot be used for generic guidelines.

abbreviation is optional but recommended.  It provides an abbreviated form of the guideline which carries the essential meaning and may be easier to remember than the "canonical" expression of the guideline.

Examples:
 

<aspect id="1">Informativeness</aspect>
...
<aspect id="5">Partner asymmetry</aspect>

<guideline id="GG1" aspect="#1" gricean="yes" abbreviation="Say enough">
  Make your contribution as informative as is required (for the current purposes of the exchange).
</guideline>
<guideline id="SG1" aspect="#1" subsumed_by="#GG1" abbreviation="State commitments explicitly">
  Be fully explicit in communicating to users the commitments they have made.
</guideline>
<guideline id="SG2" aspect="#1" subsumed_by="#GG1" abbreviation="Provide immediate feedback">
  Provide feedback on each piece of information provided by the user.
</guideline>
<guideline id="GG2" aspect="#1" gricean="yes" abbreviation="Don't say too much">
  Do not make your contribution more informative than is required.
</guideline>
...
<guideline id="GG10" aspect="#5" abbreviation="Highlight asymmetries">
  Inform the users of important non-normal characteristics which they should take into account
  in order to behave co-operatively in spoken interaction.  Ensure the feasibility of what is
  required of them.
</guideline>
<guideline id="SG4" aspect="#5" subsumed_by="#GG10" abbreviation="State your capabilities">
  Provide clear and comprehensible communication of what the system can and cannot do.
</guideline>
...
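
The constraints stated in the description can be checked mechanically.  The following Python sketch is only an illustration and not part of the coding module: the function name check_guidelines and the wrapping <guidelines> root element are assumptions made here so that an abridged copy of the examples parses as one XML document.  It also assumes that a guideline carrying subsumed_by is a specific guideline and that one without it is generic.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the examples above.  The <guidelines> root element is a
# hypothetical wrapper, added only so the fragment parses as one document.
SAMPLE = """<guidelines>
  <aspect id="1">Informativeness</aspect>
  <aspect id="5">Partner asymmetry</aspect>
  <guideline id="GG1" aspect="#1" gricean="yes" abbreviation="Say enough"/>
  <guideline id="SG1" aspect="#1" subsumed_by="#GG1" abbreviation="State commitments explicitly"/>
  <guideline id="GG10" aspect="#5" abbreviation="Highlight asymmetries"/>
  <guideline id="SG4" aspect="#5" subsumed_by="#GG10" abbreviation="State your capabilities"/>
</guidelines>"""


def check_guidelines(xml_text):
    """Return a list of messages for violated constraints (empty if none)."""
    root = ET.fromstring(xml_text)
    aspects = {a.get("id") for a in root.iter("aspect")}
    guidelines = {g.get("id"): g for g in root.iter("guideline")}
    errors = []
    for gid, g in guidelines.items():
        # aspect is mandatory and must refer to an aspect in the same file.
        aspect = g.get("aspect")
        if aspect is None or aspect.lstrip("#") not in aspects:
            errors.append(f"{gid}: missing or unknown aspect")
        # gricean, if indicated, must be yes or no.
        if g.get("gricean") not in (None, "yes", "no"):
            errors.append(f"{gid}: gricean must be yes or no")
        # Assumption: a guideline with subsumed_by is a specific guideline.
        parent_ref = g.get("subsumed_by")
        if parent_ref is None:
            continue
        parent = guidelines.get(parent_ref.lstrip("#"))
        if parent is None:
            errors.append(f"{gid}: subsumed_by refers to unknown guideline")
        elif parent.get("subsumed_by") is not None:
            errors.append(f"{gid}: subsumed_by must refer to a generic guideline")
        elif parent.get("aspect") != aspect:
            errors.append(f"{gid}: aspect differs from that of {parent_ref}")
    return errors


print(check_guidelines(SAMPLE))
```

The check cannot tell whether gricean is missing on a guideline that is in fact one of Grice's maxims; that judgement remains with the coder.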

Coding procedure:

The guidelines for co-operative dialogue design are part of the coding module for communication problems defined below.  However, they may also be reused in other coding modules for communication problems.  A user who defines a new communication problems module and wants to build on a different set of guidelines may well be able to reuse the coding module for guidelines defined here.  Encoding a set of guidelines with the present coding module is not very complicated, and the following procedure is recommended as sufficient:

1. Encode by coder 1.
2. Check by coder 2.

Creation notes:
Authors: Hans Dybkjær and Laila Dybkjær.
Version: 1 (25 November 1998), 2 (19 June 1999).
Comments: None.
Literature: [Bernsen et al. 1998, Dybkjær 1999].


Violation Types Coding Module

Name: Violation_types.

Coding purpose: Records the different ways in which generic and specific guidelines are violated in a given corpus, i.e. types of problems found in the corpus.  The corpus is implicitly given by a communication problems coding file referring to the problem type coding file as well as to a transcription.

Coding level: Communication problems.

Data sources: List of types of violations of generic and specific guidelines for co-operative dialogue design.  The list is generated during analysis of a corpus with respect to communication problems.

Module references: Module Guidelines.

Markup declaration:
ELEMENT vtype
ATTRIBUTES
  instance_of: REFERENCE(Guidelines, guideline)
  alternative_instances: REFERENCE(Guidelines, guideline+)

Description: Each description of a violation type is annotated by the element vtype. This element has two attributes.

The attribute instance_of is mandatory.  instance_of is a reference to a particular guideline in a file which contains the guidelines for co-operative dialogue.

alternative_instances is optional.  Guidelines overlap, and in some cases the coder may be in doubt whether one or the other guideline was violated.  The attribute alternative_instances allows the coder to express this doubt by indicating one or more (this is what ‘+’ means) guidelines other than the one referred to by instance_of.

The body of vtype contains the description of the actual type of violation.

Example:
 

<vtype id="SG4-1" instance_of="Guidelines-1999#SG4">
  Too little said on what system can and cannot do: BA often missing;
  time-table enquiries always missing.
</vtype>
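
Judging from the examples, a REFERENCE value has the form file#id, with the file part left out when the target is in the same file (as in aspect="#1").  The following Python sketch splits such values; the names Ref, parse_ref and parse_refs are hypothetical.  The declaration does not say how several references in alternative_instances are separated, so the whitespace separation used here is an assumption.

```python
from typing import List, NamedTuple, Optional


class Ref(NamedTuple):
    file: Optional[str]  # None when the reference points into the same file
    ident: str           # id of the referenced element


def parse_ref(value: str) -> Ref:
    """Split a reference such as 'Guidelines-1999#SG4' or '#GG1'."""
    file, sep, ident = value.partition("#")
    if not sep or not ident:
        raise ValueError(f"not a reference: {value!r}")
    return Ref(file or None, ident)


def parse_refs(value: str) -> List[Ref]:
    """Split a list of references (assumed to be whitespace-separated)."""
    return [parse_ref(part) for part in value.split()]


print(parse_ref("Guidelines-1999#SG4"))
print(parse_refs("Guidelines-1999#SG1 Guidelines-1999#SG2"))
```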

Coding procedure: Each communication problem is seen as a certain type of violation of a guideline.  The violation types are highly task dependent.  The file containing these types is built in parallel with the analysis and markup of communication problems.  This file is unusual in that its contents, i.e. the text, and its markup are created at the same time, both by the coder.  The contents are textual descriptions of the violation types.  We recommend using the same coding procedure for violation types as for markup of communication problems since the two actions are tightly connected.  As a minimum the following procedure should be followed:

1. Encode by coders 1 and 2.
2. Check and merge codings (performed by coders 1 and 2 until consensus).

Creation notes:
Authors: Hans Dybkjær and Laila Dybkjær.
Version: 1 (25 November 1998), 2 (19 June 1999).
Comments: None.
Literature: [Bernsen et al. 1998, Dybkjær 1999].



Communication Problems Coding Module

Name: Communication_problems.

Coding purpose: Records the different ways in which generic and specific guidelines are violated in a given corpus.  The communication problems coding file refers to a problem type coding file as well as to a transcription.

Coding level: Communication problems.

Data sources: Dialogue corpora.

Module references: Module Basic_orthographic_transcription; Module Violation_types.

Markup declaration:

ELEMENT comprob
ATTRIBUTES
  vtype: REFERENCE(Violation_types, vtype)
  wref: REFERENCE(Basic_orthographic_transcription, (w,w)+)
  uref: REFERENCE(Basic_orthographic_transcription, u+)
  caused_by: REFERENCE(this, comprob)
  temp: TEXT

ELEMENT note
ATTRIBUTES
  wref: REFERENCE(Basic_orthographic_transcription, (w,w)+)
  uref: REFERENCE(Basic_orthographic_transcription, u+)

Description: In order to annotate communication problems caused by inadequate systems design we use the element comprob. It refers to some kind of violation of one of the guidelines listed in Figure 1. The comprob element may be used to mark up any part of the dialogue which caused the communication problem.  Thus it may be used to annotate one or more words, an entire utterance or even several utterances in which a communication problem was detected.  The comprob element has five attributes.

The attribute vtype is mandatory.  vtype is a reference to a particular description of a guideline violation in a file which contains the different kinds of violations of the individual guidelines.

Either wref or uref must be indicated.  Both these attributes refer to an orthographic transcription. wref delimits the word(s) which caused a communication problem, and uref refers to one or more entire utterances which caused a problem.

The attribute caused_by is optional.  In some cases a communication problem in a dialogue will be caused by a problem which occurred earlier in that dialogue. caused_by is used to refer to a communication problem which was found elsewhere in the dialogue and which led to the present communication problem.
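
Since a problem referred to by caused_by may itself have a caused_by attribute, the links can form a chain back to an original problem.  The following Python sketch follows such a chain; the function name root_cause and the problem ids are hypothetical, and the caused_by values are assumed to have been collected into a dictionary beforehand.

```python
def root_cause(problem_id, caused_by):
    """Follow caused_by links from problem_id back to the earliest problem.

    caused_by maps the id of a comprob to the id of the comprob that caused
    it; problems without a caused_by attribute do not appear as keys.
    """
    seen = set()
    current = problem_id
    while current in caused_by:
        if current in seen:
            raise ValueError(f"circular caused_by chain at {current!r}")
        seen.add(current)
        current = caused_by[current]
    return current


# Hypothetical ids: problem 7 was caused by 5, which was caused by 3.
links = {"7": "5", "5": "3"}
print(root_cause("7", links))
```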

temp is an optional attribute.  It indicates a temporary markup.  It usually takes a few dialogues before the coder gets a good grasp of the types of guideline violations which tend to occur in the corpus and what caused them.  Often logfile inspection will be needed to make an exact diagnosis.  Moreover, some problems become easier to detect when comparing a few dialogues.  Thus temp is mainly for use during initial markup of a corpus but may also be used later if it is practical to make some temporary notes before making the final diagnosis.  The vtype attribute overrides whatever communication problems the attribute temp indicates.

At the beginning of the analysis the vtype attribute may be left open and the temp attribute filled in to describe the kind of guideline violation identified.  Very soon, however, a file containing the violation types should be established, and in most cases the temp comments can simply be moved to this file and possibly modified to provide a violation type description.  Note that because of this, and because the coding procedure requires at least two coders, the violation type references in the vtype attribute are likely to be re-classified eventually.

The note element can be used anywhere in a corpus to comment on whatever the user wants.  It refers to one or more words or one or more utterances in the same way as the comprob element.  The body of the note element contains text.

Example:
The following example communication problems markup assumes this snippet of a transcription from the Sundial corpus and refers to the example in the violation types coding module:
 

<u id="S1:7-1-sun" who="S">
  flight information british airways good day can I help you
</u>
<comprob id="3" vtype="Sundial_problems#SG4-1" uref="Sundial#S1:7-1-sun"/>
<note id="2" uref="Sundial#S1:7-1-sun">
  The system provides too little information about its capabilities and limitations.
  It is of course an ideal that little information is necessary.
  However, the risk is that the user will be misled and assume stronger or weaker system
  capabilities than are actually present.  Designers should look out for symptoms to this effect.
  The present introduction suggests that users can ask about anything to do with British Airways flights.
  No current system is likely to be able to do that.
  Another interpretation of the system's introduction is that it is owned by British Airways but
  can answer any question about flights.  The former interpretation seems the most natural one.
  So the system's opening probably should not be deemed ambiguous.
</note>
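
The attribute rules for comprob can be checked in a similar way.  The following Python sketch is an illustration only: the function name check_comprobs and the wrapping <problems> root element are assumptions.  It reads "either wref or uref must be indicated" as "at least one of them", and it accepts a missing vtype as long as temp is filled in, following the remark that vtype may be left open at the beginning of the analysis.

```python
import xml.etree.ElementTree as ET

# The comprob element from the example above.  The <problems> root element
# is a hypothetical wrapper so that the fragment parses as one document.
SAMPLE = """<problems>
  <comprob id="3" vtype="Sundial_problems#SG4-1" uref="Sundial#S1:7-1-sun"/>
</problems>"""


def check_comprobs(xml_text):
    """Return a list of messages for violated constraints (empty if none)."""
    root = ET.fromstring(xml_text)
    comprobs = list(root.iter("comprob"))
    ids = {c.get("id") for c in comprobs}
    problems = []
    for c in comprobs:
        cid = c.get("id")
        # Assumption: at least one of wref and uref must be present.
        if c.get("wref") is None and c.get("uref") is None:
            problems.append(f"{cid}: neither wref nor uref indicated")
        # Assumption: temp may stand in for vtype during initial markup.
        if c.get("vtype") is None and c.get("temp") is None:
            problems.append(f"{cid}: neither vtype nor temp indicated")
        # caused_by must refer to a comprob in the same file.
        cause = c.get("caused_by")
        if cause is not None and cause.lstrip("#") not in ids:
            problems.append(f"{cid}: caused_by refers to unknown comprob")
    return problems


print(check_comprobs(SAMPLE))
```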

Coding procedure: We recommend using the same coding procedure for markup of communication problems as for violation types since the two actions are tightly connected.  As a minimum the following procedure should be followed:

1. Encode by coders 1 and 2.
2. Check and merge codings (performed by coders 1 and 2 until consensus).

Creation notes:
Authors: Hans Dybkjær and Laila Dybkjær.
Version: 1 (25 November 1998), 2 (19 June 1999).
Comments: For guidance on how to identify communication problems and for a collection of examples the reader is referred to [Dybkjær 1999].
Literature: [Bernsen et al. 1998].