Aligning Text to Audio and Video Using ELAN 1

Andrea Berez

Christopher Cox

https://web.archive.org/web/20100615075404/http://logos.uoregon.edu/infield2010/workshops/aligning-text-elan1/index.php

Course Information

This is a two-part workshop on the use of ELAN software, which enables the creation of archival-quality time-aligned transcriptions of audio and video. The Level 1 workshop is an introduction to the basic functions of ELAN, including conceptualizing transcripts and step-by-step instructions for building annotation files of different levels of complexity. The Level 2 workshop will show students how to integrate ELAN into typical language documentation workflows, including transforming time-aligned texts into user-friendly presentation formats suitable for language learners and speakers, such as subtitled DVDs and web-based formats produced with tools like CuPED.

Instructor(s) Bio

Andrea Berez is a doctoral candidate in the linguistics department at the University of California, Santa Barbara. She is a descriptive and documentary linguist who works primarily with speakers of Ahtna and Dena’ina, two endangered Athabascan languages of south-central Alaska. Her linguistic interests include intonation, spatial cognition and discourse-functional approaches to grammar. She is also interested in the development of the technological infrastructure to support language documentation and archiving.

Christopher Cox is a doctoral student in the Department of Linguistics at the University of Alberta.  His research centres on language documentation and description and corpus and computational linguistics, concentrating upon the collaborative development of permanent collections of language resources for both community and academic use.  He has been involved in language documentation efforts with speakers of Tsuut'ina, an endangered language of southern Alberta, and of Plautdietsch, the traditional language of the Dutch-Russian Mennonites, and currently serves as the moderator of the "Plautdietsch-L" community listserv.

About ELAN

ELAN is a professional tool for the creation of complex annotations of audio and video resources. Creating transcripts in ELAN has several advantages for the language documentation workflow. First, it creates an archival XML document that links your annotations (text) to the timeline of the media in a way that is long-lasting and not reliant on proprietary software for recovery (meaning that your transcription will be available well into the future). Second, ELAN is flexible enough to be used whether you have a recording of one speaker, five speakers, or several languages at once. Third, ELAN allows import from and export to a range of other popular linguistic software and formats (such as Transcriber, Toolbox, and CHILDES). Fourth, ELAN files can be used in the creation of pedagogical and presentation products for language maintenance and revitalization. ELAN is highly specialized software, and it can take a while to learn how to set up your files. This workshop will help you climb over the learning curve, and you’ll see that ELAN isn’t difficult once you know how to use it!
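To make the first point concrete, here is a minimal sketch of how time-aligned annotations can be read back out of an ELAN-style XML document using only Python's standard library, with no ELAN installation needed. The sample document is hand-written and heavily simplified: the element and attribute names follow the general shape of ELAN's annotation files as we understand it, real files contain additional header and tier-type information, and the tier name, times, and text are invented for illustration. The function name `read_aligned_annotations` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified stand-in for an ELAN annotation file.
# Tier name, time values (milliseconds), and text are made up.
SAMPLE_EAF = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="3200"/>
  </TIME_ORDER>
  <TIER TIER_ID="SpeakerA-transcription">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>first utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
                            TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>second utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_aligned_annotations(eaf_text):
    """Return (tier_id, start_ms, end_ms, text) tuples from ELAN-style XML."""
    root = ET.fromstring(eaf_text.strip())

    # Map each time-slot ID to its position on the media timeline.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }

    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((tier.get("TIER_ID"), start, end, text))
    return rows


for row in read_aligned_annotations(SAMPLE_EAF):
    print(row)
```

Because the file is plain XML, any general-purpose XML tool can recover the text and its timing, which is what makes the format suitable for long-term archiving.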

**The Level 1 workshop assumes students have very little or no experience with ELAN. The Level 2 workshop assumes students have at least a little experience with ELAN (but the Level 1 workshop is not required).**

Course overview

In this class we will learn how to use the ELAN software to link transcriptions to audio and video media. We will talk extensively about how to conceptualize tiers (layers of text, linguistic analysis, and translations) so that you can create annotations that are flexible enough for a range of configurations of speakers and languages.

Day-by-day schedule for the Level 1 workshop:

Day 1 (Tuesday June 22)

  • Introduction: a tour of ELAN

  • Starting up, file naming, file storage

  • Creating a single-language annotation file with one speaker

  • Creating a single-language annotation file with two speakers

Day 2 (Wednesday June 23)

  • Review of yesterday

  • Creating a two-language simple annotation (sentence-level)

Day 3 (Thursday June 24)

  • Creating a two-language complex annotation (full interlinearized glossed text)

Day 4 (Monday June 28)

  • Working with videos

  • Some advanced features of ELAN: Working with templates, gaps, silence

  • Beyond ELAN: Importing from and exporting to other useful file types

  • Brainstorming for your own projects

File storage during the class

There is no central shared server space where you can store your files, so you will need to keep everything on your thumb drive. Please be sure to save all your files and keep them, because you may want to see them later.
