Signbank

The Auslan Corpus annotation files

At present, 357 movies in the Auslan Corpus have annotation files containing annotations at various levels of detail. Annotations are being added to the corpus all the time. The current annotation files have one or more of the following types of annotations:

identification and IDglossing of nouns and verbs only
sign tokenization and IDglossing for all signs
tagging for sign grammatical class ("part of speech")
identification of gaze direction during points
identification of palm orientation during points
identification of clause boundaries
identification of verb arguments
tagging of verb arguments for macro-roles and semantic roles
tagging for the presence or absence of spatial modification
identification of periods of constructed action ('role shift')
free translation
literal translation.

Annotating signed language texts takes an enormous amount of time, and it is anticipated that it will take many years before the Auslan archive is annotated richly enough (and hence machine-readable) to qualify as a true linguistic corpus.

Adding value to the movies in the archive through annotation is time-consuming and expensive. These annotation files are not publicly available but will be made available to fellow researchers on request on a data-sharing and data-enrichment basis (i.e., access to existing annotation files will be granted on condition that enriched annotation files are returned to the corpus). Research collaboration is also encouraged.

Click here for a copy of the guidelines used to create the annotations for the Auslan Corpus as it now exists. (Last updated November 2024.)