We analyzed five existing schemes in preparation for this proposal. Although MATE generally chose to review only schemes that had been proven reliable, reliability tests for coreference schemes were rare or informal enough that this constraint was somewhat relaxed. The five schemes reviewed were the MUCCS scheme developed for MUC-7 (Hirschman, 1997), the DRAMA scheme (Passonneau, 1996), the Lancaster University UCREL scheme (Fligelstone, 1992), the scheme developed by Bruneseaux and Romary (1997), and the MapTask annotation of landmarks. These schemes are discussed in MATE deliverable D1.1.
The MUCCS scheme is the best known and most widely used of the existing
coreference schemes, the most modest in scope (it concentrates on identity
relations between NPs), and the only one whose reliability has been systematically
tested. However, this scheme was designed for text, so it provides no
instructions for dealing with dialogue-specific problems such as disfluencies
or misunderstandings. Nor does it cover references to the visual situation,
which are common, e.g., in the MapTask corpus and in multimodal applications,
and which we hypothesize can be reliably annotated. Also, its syntactic constraint
on markables is designed only for English. The DRAMA scheme was designed
for dialogues and therefore does include instructions for dealing with
some difficult problems of markable identification in dialogues, but still
relies on English-specific syntactic constraints to keep the
annotation task manageable. DRAMA also includes instructions for
dealing with bridging references (whose reliability has yet to be
ascertained), but not for references to the visual situation. Finally,
the Lancaster scheme was also designed for texts, and in certain ways is
more ambitious than any of the other schemes discussed here in that it also
contains instructions for annotating elliptical references. We are not aware of
any study of the reliability of this scheme.