Import and Export options

Export as

ELAN offers various export options. To export, click File > Export As and choose one of the options.

Apart from these export options for single files, ELAN also supports multiple file exporting options. More details regarding these options can be found here: Multiple file export options

How to select tiers

Figure 1.32. Tier Selection panel in most of the dialogs



Different ways to select tiers:

  • By Tier Names

    Select the tiers by checking the boxes before each tier name.

  • By Type

    This tab shows a list of the tier types available in the current transcription. Select the types by checking the boxes before each type name. Selecting a type selects all tiers of that type. To modify the selected tiers, switch back to By Tier Names.

  • By Participant

    This tab lists all the participants in the transcription. Select the participants by checking the boxes before each participant name. Selecting a participant selects all tiers of that participant. To modify the selected tiers, switch back to By Tier Names.

  • By Annotators

    This tab lists all the annotators in the transcription. Select the annotators by checking the boxes before each annotator name. Selecting an annotator selects all tiers of that annotator. To modify the selected tiers, switch back to By Tier Names.

  • By Languages

    This tab lists all the languages in the transcription. Select the language(s) by checking the boxes before each language name. Selecting a language selects all tiers of that language. To modify the selected tiers, switch back to By Tier Names.

Note

To select multiple tiers, press Shift and click on successive tiers, or click and drag the mouse along the tiers.

Other options:

  • To change the order of the tiers, use the up and down buttons to move the tiers up and down in the table.
  • Show only root tiers: check this option to show only the root tiers in the transcription.
  • Select All: click this button to select all the boxes in the current tab.
  • Select None: click this button to de-select all the boxes in the current tab.
  • OK: click OK to confirm the tier selection.
  • Close: click to close the dialog or cancel the changes.

Toolbox file (UTF-8)

Similar to exporting a document to Shoebox (see Shoebox file), ELAN data can be exported to a Toolbox document with a UTF-8 encoding. This export provides more options for output customization.

To export a file into Toolbox, do the following:

  1. Click on File menu.
  2. Click on Export as > Toolbox File (UTF-8)...

    The Toolbox Export dialog box appears:

    Figure 1.33. Toolbox Export dialog window



    For ELAN tier names containing an @, only the part to the left of the @ is identified as the tier marker for Toolbox. These markers form a block in the exported file. The part to the right of the @ is identified as the participant name and is exported with the marker ELANParticipant (see the figure below):

    Figure 1.34. ELAN file and exported Toolbox file



    If you use a Shoebox *.typ file to specify the Toolbox database type, ELAN extracts the database type name from the first line of the type file (e.g. the database type name Text in \+DatabaseType Text) and puts it in the first line of the exported file (e.g. \_sh v3.0 400 Text).

    When there is only one root tier (a tier without a parent tier) in the transcription (e.g. ref), it is used as the record marker by default. When there are multiple root tiers, "\block" is added as the record marker. In both cases it is possible to specify a custom record marker instead.

    Some options not covered in Figure 1.33, “Toolbox Export dialog window”:

    • By first selecting a tier (see How to select tiers) and then selecting Insert blank line after this marker, you insert a blank line after the selected marker every time the marker is printed in the exported file. The tier name is colored blue in the dialog box.
    • By selecting Wrap block you can let ELAN wrap a whole block if one of the lines in a block is longer than a specified number of characters (default is 80 characters). A block in this context refers to the markers that are part of the interlinearization.
    • When Wrap blocks is selected it is also possible to select Wrap lines. This applies to long marker lines that are not part of the interlinearization. There are 2 variants: when Wrap to next line is selected the line is split into 2 or more lines that immediately follow each other, regardless of their position in the record. When Wrap to end of block is selected everything beyond the first wrap is placed at the end of the record. Note that wrapped interlinearization blocks are grouped as much as possible.
    • When Include empty markers is selected all markers will be printed in each record, whether there is content or not. When this option is not selected a marker will not be printed in a record when it has no content.
    • By selecting Add master media time offset to annotation times you can add to the annotation times the time offset from the master media that originated from the synchronization of media files (see Synchronizing video files).

    Make a choice and click on OK to continue.

  3. Specify the name and directory of the exported file.
  4. Click Save to export the file; otherwise click Cancel to exit the dialog box without exporting the file.

    The file is exported as a *.txt, *.sht or *.tbt file.

    If there already exists a file of the same name, ELAN will ask you whether or not it should overwrite the existing file.

  5. Open the exported file in Toolbox.

    It contains the following information:

    1. All tiers and annotations.

      Each ELAN parent annotation (including all its referring annotations) corresponds to one Toolbox record. E.g., in the illustration below, the ELAN parent annotation “CLLDCh3R02S01.001” corresponds to the Toolbox record “CLLDCh3R02S01.001”.

    2. The time code information for each parent annotation.

      Each ELAN parent annotation (i.e., each Toolbox record) contains the additional field markers \ELANBegin and \ELANEnd (i.e., the begin and end time of the parent annotation).

      This time code information allows you to import the Toolbox file back into ELAN, without having to manually re-align the file (see Shoebox file).
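The record layout described above can be inspected with a short script. The following Python sketch is illustrative only: the sample text, the record marker ref, the tier marker tx, the participant name and the time values are all made up, and the exact layout that ELAN writes may differ. It shows how a tier name is split at the @ into a marker and a participant, and how marker lines can be grouped into records that carry \ELANBegin and \ELANEnd.

```python
def split_tier_name(tier_name):
    """Split an ELAN tier name such as 'tx@SpeakerA' into (marker, participant)."""
    marker, sep, participant = tier_name.partition("@")
    return marker, (participant if sep else None)


def parse_records(text, record_marker="ref"):
    """Group backslash-marker lines into records; a new record starts at each record marker."""
    records = []
    current = None
    for line in text.splitlines():
        if not line.startswith("\\"):
            continue  # this sketch skips blank lines and wrapped continuation lines
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            current = {}
            records.append(current)
        if current is not None:  # lines before the first record (the header) are ignored
            current[marker] = value.strip()
    return records


# Hypothetical sample; the time values are placeholders, not ELAN's exact format.
sample = "\n".join([
    r"\_sh v3.0 400 Text",
    "",
    r"\ref CLLDCh3R02S01.001",
    r"\ELANBegin 0.000",
    r"\ELANEnd 1.250",
    r"\ELANParticipant SpeakerA",
    r"\tx hello",
])

records = parse_records(sample)
print(split_tier_name("tx@SpeakerA"))
print(records[0]["ref"], records[0]["ELANBegin"], records[0]["ELANEnd"])
```

If you specified a custom record marker in the export dialog, pass it as record_marker instead of ref.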

FLEx files

ELAN allows you to export your project to the SIL FieldWorks Language Explorer software, also referred to as FLEx. The data exchange is realized through .flextext files, a file type that defines several container elements and attributes (see below) onto which ELAN's tiers (via their tier type) and annotations have to be mapped. These mappings are configured in the multi-step export window described below. Configuration is less complicated if the .eaf file was created by importing a FLEx .flextext file: on import, some FLEx attributes are "encoded" in the names of tiers, and on export these attributes are reconstructed by "decoding" the tier names. To better understand the options in the user interface, a simplified representation of the structure of a .flextext file follows here.

              <interlinear-text>
                <item lang="" type="">...</item>
                <paragraph>
                  <phrase>
                    <item lang="" type="">...</item>
                    <word>
                      <item lang="" type="">...</item>
                      <morph type="">
                        <item lang="" type="">...</item>
                      </morph>
                    </word>
                  </phrase>
                </paragraph>
              </interlinear-text>
            

All elements can occur multiple times, e.g. there can always be multiple item child elements for any parent element.
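To make the nesting more concrete, the following Python sketch builds a tiny tree with the same shape as the simplified representation above. It is only an illustration of the container and item hierarchy: the lang and type values and the texts are hypothetical, and a real .flextext file contains more structure than this simplified view.

```python
import xml.etree.ElementTree as ET


def add_item(parent, lang, item_type, text):
    """Append an <item lang="..." type="..."> child holding one annotation value."""
    item = ET.SubElement(parent, "item", {"lang": lang, "type": item_type})
    item.text = text
    return item


# All lang/type values and texts below are hypothetical examples.
root = ET.Element("interlinear-text")
add_item(root, "en", "title", "Sample text")

paragraph = ET.SubElement(root, "paragraph")
phrase = ET.SubElement(paragraph, "phrase")
add_item(phrase, "en", "gls", "the dogs")

word = ET.SubElement(phrase, "word")
add_item(word, "en", "txt", "dogs")

morph = ET.SubElement(word, "morph", {"type": "root"})
add_item(morph, "en", "txt", "dog")

ET.indent(root)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(root, encoding="unicode"))
```

Each call to add_item corresponds to one ELAN annotation that has been mapped to an item of the enclosing container element.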

Note

If your .eaf file contains multiple participants, make sure you have given each participant a name value. You can set a participant value under Tier > Change Tier Attributes....

Choosing File > Export as > FLEx file … will give you the following screen:

Figure 1.35. Export FLEx file step 1



In this screen you can specify:

  • which tier type corresponds to which FLEx element
  • which tiers should be included in the export
  • with the Export interlinear-text tier option, if there is a tier corresponding to the interlinear-text element and, if so, which tier it is. This determines whether a tier and its dependent tiers provide the contents for item child elements of interlinear-text.
  • with the Export paragraph tier option, if there is a tier corresponding to the paragraph element. If so, its segmentation is used for grouping phrase child elements, if not, each phrase will be embedded in its own paragraph element.

Figure 1.36. Export FLEx file step 2


The second screen allows you to:

  • map tier types to the item child element of the correct, corresponding container element
  • specify which tiers should be exported as that item
  • specify with the Select a tier type for 'morph-type' tiers option, which tier type provides the value for the type attribute of the .flextext morph element. This should be a valid FLEx morph type. If this option is deselected each morph element will be exported with attribute type="root".

Figure 1.37. Export FLEx file step 3


The third screen allows you to customize the output of the FLEx lang (language) and type attributes:

  • The upper part of the screen contains a table and two radio buttons. The buttons let you switch between tiers mode and tier types mode (the latter is preferred). The contents of the table are updated after a change in choice. The value of each cell in the type and language columns can be selected from a pull-down menu.
  • The lower part of the screen allows you to edit the list of values selectable in those pull-down menus. The type and language radio buttons determine which list is updated when you add new values or remove existing ones. The list for type is based on a FLEx controlled vocabulary, which could be out of date at the time of use; therefore new values can be added manually. The list of languages is currently based on "decoding" the tier names and on the content languages of the tiers. This list can be empty, in which case it should be filled manually.

    Note

    FLEx requires that for languages that have both a two letter ISO 639-1 code and a three letter ISO 639-3 code, the two letter code should be used. This is not enforced by the export function.

  • For more information on the structure of FLEx, see Figure 1.67, “FLEx to ELAN structure”.
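Because the export function does not enforce the language code rule from the note above, you may want to check your language values yourself before exporting. The following Python sketch is one possible way to do that. The mapping table is a small hand-picked subset of ISO 639-3 to ISO 639-1 pairs chosen for illustration, and the function name is hypothetical; it is not part of ELAN or FLEx.

```python
# Small illustrative subset of ISO 639-3 -> ISO 639-1 pairs; a complete table is much larger.
ISO_639_3_TO_1 = {
    "eng": "en",
    "deu": "de",
    "nld": "nl",
    "fra": "fr",
    "spa": "es",
}


def preferred_language_code(code):
    """Return the two-letter code when the table has one, otherwise the code unchanged."""
    code = code.strip().lower()
    return ISO_639_3_TO_1.get(code, code)


for code in ["eng", "nld", "tpi", "en"]:
    print(code, "->", preferred_language_code(code))
```

Codes that are not in the table are returned unchanged, so a three-letter code without a two-letter equivalent is left as it is.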

The final screen allows you to save the file as a .flextext file, so it can be used in FLEx.

Note

On the third-party resources page of ELAN (https://tla.mpi.nl/tools/tla-tools/elan/thirdparty/), you can find a teaching set which covers importing from FLEx to ELAN and exporting back to FLEx.

CHAT files

  1. Choosing File > Export as > CHAT file … will give you the following screen:

    Figure 1.38. Export Chat file



  2. Fill in the necessary fields.

    Note

    CHAT labels must be preceded by * (for root tiers) or % (for dependent tiers). Root tier labels must contain exactly 3 characters, while dependent tier labels can have up to 7 characters.

  3. Click on Export…
  4. Enter a CHAT file name and choose Save.
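The label rules from the note in step 2 can be checked with a few lines of code before you fill in the dialog. The Python sketch below is a hypothetical helper, not something ELAN provides, and it assumes that the character counts exclude the leading * or %.

```python
def check_chat_label(label, is_root):
    """Return a list of problems with a CHAT tier label; an empty list means it passes."""
    problems = []
    prefix = "*" if is_root else "%"
    if label.startswith(prefix):
        name = label[1:]
    else:
        problems.append(f"label must start with '{prefix}'")
        name = label.lstrip("*%")
    # Assumption: the length limits apply to the part after the prefix character.
    if is_root and len(name) != 3:
        problems.append("root tier label must have exactly 3 characters")
    if not is_root and not 1 <= len(name) <= 7:
        problems.append("dependent tier label must have 1 to 7 characters")
    return problems


print(check_chat_label("*CHI", is_root=True))
print(check_chat_label("%mor", is_root=False))
print(check_chat_label("CHILD", is_root=True))
```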

Tab-delimited text file

All documents can be exported into a tabular format for purposes of further analysis and/or printing. This includes documents that were created by ELAN itself (see Creating a new document and Opening an existing document) as well as documents that were imported into ELAN from any of the supported formats. Do the following:

  1. Click on File menu.
  2. Click on Export as > Tab-delimited Text ….

    The Export as tab-delimited text dialog window is displayed, e.g.:

    Figure 1.39. Export as tab-delimited text dialog window



    1. Select the tiers to be exported (see How to select tiers).
    2. Select to export a selected time interval only.
    3. Add the time offset from the master media to the annotation times.
    4. Include header lines with media file location information, and exclude the tier and/or participant names from the output file.
    5. Annotations sharing the same begin and end time are exported in the same row.
    6. Select to include the description of the controlled vocabulary.
    7. Select the time information and format.
    8. Add an extra time format expressed in hours, minutes, seconds and frames.

  3. By default, ELAN exports all annotations, but it is possible to restrict the export process to selected annotations. The following three options are available:
    1. Export only those annotations that correspond to a selected time interval. Do the following:
      1. In the ELAN window, select the desired time interval (see Making a selection on an independent tier).
      2. In the Export as tab-delimited text dialog window, click in the box to the left of Restrict to selected time interval. A check mark appears indicating that this option has been selected.
    2. Export only those annotations that are contained on particular tiers. Do the following:

      In the Export as tab-delimited text dialog window, select those tiers that you want to export. A check mark appears next to any selected tier.

    3. Export only those annotations that (a) correspond to a particular time interval and (b) are contained on particular tiers. To do this, combine the steps described under options 1 and 2 above.

    By default, the output contains one annotation per row, with the tier name in one of the columns, time information in several following columns and then the annotation value.

  4. By selecting Add master media time offset to annotation times you can add to the annotation times the time offset from the master media that originated from the synchronization of media files (see Synchronizing video files).
  5. The option Include header lines containing media file information allows you to add the media-file path information for each media file to the header of the exported file.
  6. The option Separate column for each tier gives each tier its own column in the export file. Annotations that have the same begin time and the same end time are exported in the same row, i.e. the same tab-delimited line. The following options also place annotations in the same row if they are not fully aligned but do overlap. As a consequence, each annotation can appear in the output more than once, making annotation counts unreliable.
    • If you check Repeat values of annotations spanning other annotations the spanning annotation is put in each row containing an annotation it spans. The spanning annotation is not in a row by itself.
    • The option Only repeat within annotation hierarchies limits the previous option. An annotation is only repeated if it is on one of the ancestor tiers in the annotation hierarchy.
    • The option Sliced annotation output showing temporal co-occurrences is an alternative way to repeat annotation values based on overlaps. In this export all unique begin and end times of all annotations in the export are placed in one list, creating new intervals (between each two successive time values). Each interval is exported if there is at least one annotation overlapping that interval and in the column of each tier the value of the overlapping annotation, if any, is exported.
    • The option Include the annotation id appends the annotation identifier between brackets to the annotation value (e.g. [a13]). This makes it possible to distinguish annotations in the output, which is hard to do in the case of repeated values.
  7. Select the time markers you want to export (begin time, end time and/or duration of every annotation unit).
  8. Choose the time format (hh:mm:ss.ms, ss.msec, milliseconds and/or SMPTE time code)

    Note

    If you choose the SMPTE (hh:mm:ss.ff) format, the selected video standard (PAL or NTSC) just indicates the way seconds and milliseconds are converted to frame numbers. This is independent of the actual video standard of the associated video(s).

  9. Click OK to start the export process; otherwise click Cancel to exit the dialog box without exporting the annotations.
  10. Finally you will see a save dialog window. In the Encoding drop-down box a text encoding can be selected (ISO-latin, UTF-8 or UTF-16). In the file format box there are two options: *.txt saves a tab-delimited text file, and *.csv saves the annotations in a comma-separated values file, placing all text values between double quotes. Make an appropriate choice and click on Save.

    Note

    Some Mac applications, like TextEdit, have difficulty loading UTF-8 encoded files. This is most noticeable for “special” characters, e.g. IPA. Using UTF-16 is recommended in that case.

    A message appears to inform you that the file has been exported.

    The contents and the layout of the exported file depend on the selected options. It can be opened with any program that can handle tab-delimited or comma-separated texts, e.g., Microsoft Excel.

    Figure 1.40. Tab-delimited text



    Note

    Some versions of Excel seem to have problems importing tab-separated files (white rectangles are shown instead of the column borders). As a workaround you can open the text file first in a text editor (e.g. Notepad) and copy and paste the content into Excel.
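The Sliced annotation output showing temporal co-occurrences option described in step 6 can be pictured with a small script. The Python sketch below follows the description given there (collect all unique begin and end times, form intervals between successive values, and export an interval only if at least one annotation overlaps it). It is an illustration under that reading, not ELAN's implementation. The tier names and annotation data are made up, and if several annotations of one tier overlap an interval, this sketch simply keeps the last one.

```python
def sliced_rows(annotations):
    """annotations: list of (tier, begin_ms, end_ms, value) tuples.

    Returns (tiers, rows); each row is (begin_ms, end_ms, {tier: value}).
    """
    tiers = sorted({a[0] for a in annotations})
    # All unique begin and end times, in ascending order.
    times = sorted({t for _, b, e, _ in annotations for t in (b, e)})
    rows = []
    for start, end in zip(times, times[1:]):
        cells = {}
        for tier, b, e, value in annotations:
            if b < end and e > start:  # the annotation overlaps this interval
                cells[tier] = value
        if cells:  # intervals without any overlapping annotation are skipped
            rows.append((start, end, cells))
    return tiers, rows


# Hypothetical example data.
annotations = [
    ("K-Spch", 0, 1000, "yeah"),
    ("W-Spch", 500, 1500, "and then"),
    ("W-Spch", 2000, 2500, "okay"),
]
tiers, rows = sliced_rows(annotations)
for start, end, cells in rows:
    print("\t".join([str(start), str(end)] + [cells.get(t, "") for t in tiers]))
```

In this example the annotation "yeah" appears in two rows, which illustrates why annotation counts become unreliable when values are repeated.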

Tiger XML

If your ELAN annotations contain syntactic elements, it is possible to export these to Synpathy[2] (see https://tla.mpi.nl/tools/tla-tools/older-tools/synpathy/). This function is available via File > Export as > Tiger-xml…

First select, from the candidate tiers, the one you want to export. Afterwards, map the tiers onto the correct description ("word" or "pos"). Finally, enter the name of the file (*.tig).

Interlinear text file

This function (File > Export as > Interlinearized Text...) is very similar to ELAN’s printing system; more information can therefore be found in Previewing the printed pages. The main difference is that the width of the exported text depends in this case on the number of characters that fit on one line.

Figure 1.41. Maximum line width



After selecting an appropriate layout, click on Save as and choose a location and file name. These files can afterwards easily be edited with any text editor (preferably using a fixed-width font). Optionally tick the Insert tabs between annotations box if you prefer the white space between annotations to be filled with tabs instead of spaces (especially useful when importing a text file into Word). If Insert tabs between annotations is selected, you can also tick the Tabs Instead of Spaces box to have a single tab instead of multiple white spaces.

HTML file

Similarly to the export to interlinear text (see Interlinear text file), you can also export annotations to an HTML file through the File > Export as > HTML... menu.

Figure 1.42. Export as HTML



The only extra option for the HTML export is:

  • Play media: check this option to be able to play the media file from the exported HTML file.

    Note

    Playing the media requires HTML 5. The exported HTML file must be placed in the same location as the media file in order to play the media from the HTML export.

Traditional transcript files

In some situations a straightforward list of the annotation units, one after another, can be handy. For that purpose an export option to a “traditional transcript text” has been added to ELAN. In its simplest form it creates a text file containing the successive annotations of several tiers, in chronological order. This feature can be found under File > Export as > Traditional Transcript Text....

Figure 1.43. Export Transcript Text


'Restrict to the selected time interval' allows you to export only the data that is currently selected (see Making a selection on an independent tier).

'Wrap lines' sets a maximum number of characters before the line gets wrapped.

'Merge annotations on the same tier...' makes it possible to merge annotations on the same tier if the gap between them is less than a specified number of milliseconds.
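The merging rule can be sketched as follows. This Python example is only an illustration of the idea: the annotation data is made up, and joining the merged values with a single space is an assumption, since the way ELAN combines the values is not described here.

```python
def merge_close_annotations(annotations, max_gap_ms):
    """Merge neighbouring annotations of one tier when the gap between them is below max_gap_ms.

    annotations: list of (begin_ms, end_ms, value), sorted by begin time.
    """
    merged = []
    for begin, end, value in annotations:
        if merged and begin - merged[-1][1] < max_gap_ms:
            prev_begin, prev_end, prev_value = merged[-1]
            # Assumption: merged values are joined with a single space.
            merged[-1] = (prev_begin, max(prev_end, end), prev_value + " " + value)
        else:
            merged.append((begin, end, value))
    return merged


tier = [(0, 1000, "so"), (1050, 2000, "we go"), (3000, 3500, "left")]
print(merge_close_annotations(tier, max_gap_ms=100))
```

With a maximum gap of 100 ms the first two annotations (50 ms apart) are merged, while the third one (1000 ms later) stays separate.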

You can number the annotations, each wrapped line, and include or exclude tier labels or participant labels in the export.

One of the options enables you to include silences with a minimal duration. The figure shows a silence of 0.2 seconds between 'yeah' on the tier K-Spch and 'and then you go the other ...' on the tier W-Spch: the first annotation ends at 00:00:04.400 and the next annotation begins at 00:00:04.600. If this silence were shorter than the minimal silence duration entered in the export dialog window (20 ms in the figure), it would not be included in the exported file. The silence duration can be shown with 1, 2 or 3 digits after the decimal point.
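The silence calculation from this example can be reproduced with a few lines of Python. The sketch below is a simplified illustration, not ELAN's code: it takes the annotations of all exported tiers in chronological order, and the begin time of 'yeah' and the end time of the second annotation are invented because only the 04.400 and 04.600 boundaries are given above.

```python
def silences(annotations, min_silence_ms, decimals=1):
    """Return (gap_start_ms, duration_text) for every gap of at least min_silence_ms.

    annotations: (begin_ms, end_ms, value) tuples in chronological order.
    decimals: number of digits after the decimal point (1, 2 or 3).
    """
    result = []
    latest_end = None
    for begin, end, _ in annotations:
        if latest_end is not None and begin - latest_end >= min_silence_ms:
            seconds = (begin - latest_end) / 1000
            result.append((latest_end, f"{seconds:.{decimals}f}"))
        latest_end = end if latest_end is None else max(latest_end, end)
    return result


# Begin of the first and end of the second annotation are hypothetical values.
transcript = [(3900, 4400, "yeah"), (4600, 6000, "and then you go the other ...")]
print(silences(transcript, min_silence_ms=20))
print(silences(transcript, min_silence_ms=20, decimals=3))
print(silences(transcript, min_silence_ms=250))
```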

Empty lines after each annotation (block) can also be included or excluded in the generated output file. Lastly, you can set a fixed width (in number of characters) for the tier labels.

The option to use Jefferson-style alignment based on "[" characters in overlapping annotations can change the position of parts of annotations by vertically aligning corresponding "[" characters. (Alignment of matching "]" characters is not supported yet.)

Time-aligned Interlinear Text

This export function (File > Export as > Time-aligned Interlinear Text...) produces interlinear output but, unlike standard Interlinear Gloss, the formatting is based on time alignment. This is achieved by using a monospaced (fixed-width) font in combination with a customizable factor for the number of milliseconds per character. As a consequence, depending on this factor, the export might cut off part of the annotation value.

The export offers a few text styling options (underline, bold, italic) and the output format is (simple) HTML.

Figure 1.44. Export settings


The output can be customized in various ways:

  • In the top right area of the window is the usual Tiers selection panel, with additional columns that allow you to specify a style per tier. The font style options are underline, bold and italic.
  • The remainder of the right area of the window, the "How" panel, contains options to further customize the output:
    • Time Unit: the value entered here determines the number of milliseconds one character represents.
    • Block Space: the width of the text block in number of characters. This does not include the margin.
    • Left Margin: the number of characters for the tier labels.
    • Font Size: the font size to use for the output.
    • Restrict to selected time interval: exports only the selected fragment instead of the entire transcript.
    • Use Reference Tier: when a reference tier is selected, the annotations of this tier are exported, together with overlapping annotations on other selected tiers.
    • Wrap Within One Block: when a reference tier is used, this option determines whether or not line wrapping is performed within a block. Without wrapping, the block width may exceed the specified block space.
    • Display annotation values left aligned: by default annotations are exported right aligned; with this option the output is left aligned.
    • Show annotation boundaries: with this option the begin and end boundaries of annotations are marked with "[" and "]" characters.
    • Show time and timeline: with this option a kind of timeline, in text, is added to the output.
  • The left half of the screen shows a preview of the output based on the current settings.

After changes in settings, the Apply Changes button updates the preview. The Save As... button starts the actual export; currently HTML is the only supported format.
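The effect of the Time Unit value can be illustrated with a simplified layout routine. The Python sketch below is not ELAN's algorithm. It assumes that each annotation starts at the column given by its begin time divided by the time unit, that the text is placed left aligned within that slot, and that anything that does not fit is cut off. The annotation data is invented.

```python
def layout_line(annotations, ms_per_char, show_boundaries=False):
    """Place the (begin_ms, end_ms, value) annotations of one tier on a single text line."""
    width = max(end for _, end, _ in annotations) // ms_per_char
    line = [" "] * width
    for begin, end, value in annotations:
        start = begin // ms_per_char
        room = end // ms_per_char - start  # number of characters available
        text = value
        if show_boundaries and room >= 2:
            inner = room - 2
            text = "[" + value[:inner].ljust(inner) + "]"
        # Values longer than the available room are cut off.
        for offset, ch in enumerate(text[:room]):
            line[start + offset] = ch
    return "".join(line)


tier = [(0, 1000, "hello"), (1500, 2000, "goodbye")]
print(layout_line(tier, ms_per_char=100))
print(layout_line(tier, ms_per_char=100, show_boundaries=True))
```

With 100 ms per character the second annotation only has room for five characters, so "goodbye" is truncated. Choosing a smaller time unit gives each annotation more room at the cost of wider output.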

Figure 1.45. Different export settings


Praat TextGrid file

When you wish to work with your annotations in Praat, ELAN enables you to export them to a Praat TextGrid. To do this, click File > Export as > Praat TextGrid.... In the dialog window that appears you can select the tiers you wish to export (see How to select tiers) and specify whether you want to restrict the output to the selected interval.

After clicking OK, you can enter a file name and select an encoding. In addition to TextGrid files in the default encoding for the operating system, ELAN supports Praat TextGrid files with UTF-8 and UTF-16 encoding. Finally click on Save.
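If you process the exported TextGrid with your own scripts, the chosen encoding matters when reading the file. The following Python sketch shows one cautious way to decode the file contents. It assumes that UTF-16 files start with a byte order mark and otherwise falls back to a default encoding that you can override; it does not describe how ELAN or Praat detect encodings.

```python
import codecs


def decode_textgrid(data, fallback="utf-8"):
    """Decode the raw bytes of a TextGrid file, using a byte order mark when present."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")
    if data.startswith(codecs.BOM_UTF8):
        return data.decode("utf-8-sig")
    return data.decode(fallback)


first_line = 'File type = "ooTextFile"'
print(decode_textgrid(first_line.encode("utf-16")))
print(decode_textgrid(first_line.encode("utf-8")))
```

To use it on a real file, read the file in binary mode and pass the bytes to decode_textgrid.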

WebAnnotation JSON

The preliminary export function File > Export as > WebAnnotation JSON... stores annotations according to the W3C Web Annotation Data Model specifications. This model and format are intended to enable sharing and reuse of annotations across applications and platforms.

Figure 1.46. Export settings and JSON preview


The export window offers a few options to customize the output. Apart from the possibility to select the tiers to export and to only export the selected interval, there are a few format specific options which determine which information is included and how it is structured. After changing settings, the Update button applies the settings and updates the preview on the left side of the window. The Export button initiates the actual export to a .json text file.
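To give an impression of what an annotation looks like in the W3C Web Annotation Data Model, the Python sketch below builds one annotation with a textual body and a time fragment of a media file as its target. The property names come from the W3C model, but the way ELAN actually structures its output depends on the export options and may differ. The media URL, the annotation text and the function name are hypothetical.

```python
import json


def web_annotation(value, media_url, begin_ms, end_ms):
    """Build a minimal Web Annotation for a time-aligned text annotation."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": value},
        "target": {
            "source": media_url,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                # Media fragment with begin and end time in seconds.
                "value": f"t={begin_ms / 1000:.3f},{end_ms / 1000:.3f}",
            },
        },
    }


anno = web_annotation("yeah", "http://example.org/media/session1.mp4", 3900, 4400)
print(json.dumps(anno, indent=2))
```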

Alphabetical list of words

Sometimes it can be very useful to have an alphabetical list of (unique) words from one or more tiers. ELAN offers a way to generate such lists. Go to File > Export as > List of Words ... and select the tiers (see How to select tiers) from which you want to extract the words. The annotations of the selected tiers will be tokenized (split into words) using either a default set of delimiters or a user-definable set. Check Count occurrences if you want the list to include the number of occurrences for each token. The Include overall totals in the export file option adds some basic overall statistics at the end of the file. The Include frequency percentages in the export option adds another column to the output, containing the percentage of the total word count that each unique word (or annotation) represents. After selecting tiers (or better, deselecting unwanted tiers) you can click OK and choose a file name. Clicking Save will save the word list.
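The processing behind such a word list can be sketched in a few lines of Python. The example is an approximation for illustration only: the delimiter set is an assumption and not ELAN's actual default, and the annotation values are invented.

```python
import re
from collections import Counter

# Assumed delimiter set for this sketch; ELAN's default set may differ.
DEFAULT_DELIMITERS = " \t\n.,!?;:\"()[]"


def word_list(annotation_values, delimiters=DEFAULT_DELIMITERS):
    """Return (rows, total); rows are (word, count, percentage) in alphabetical order."""
    pattern = "[" + re.escape(delimiters) + "]+"
    counts = Counter()
    for value in annotation_values:
        counts.update(token for token in re.split(pattern, value) if token)
    total = sum(counts.values())
    rows = [(word, n, 100.0 * n / total) for word, n in sorted(counts.items())]
    return rows, total


rows, total = word_list(["the dog", "the cat, the dog!"])
for word, count, percentage in rows:
    print(f"{word}\t{count}\t{percentage:.1f}")
print("total", total)
```

The count column corresponds to Count occurrences, the percentage column to Include frequency percentages in the export, and the final line to a simple overall total.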

SMIL clip

ELAN supports export to SMIL[3]-compliant clips. With a suitable player this enables you to view media files and the associated annotations as a subtitled movie.

Export SMIL for Real Player

  1. Select the File > Export As > SMIL > Real Player... menu. This will bring up this dialog box:

    Figure 1.47. Export SMIL Real Player



  2. Select the tiers you want to export (see How to select tiers).
  3. Check Restrict to selected time interval if you only want to export the current selection. Otherwise the whole media file and associated annotations will be exported.
    • Check Recalculate the begin time of the selected annotations to start from zero if you want the times of the exported selection to start from zero.
  4. Check Add master media time offset to annotation times to add to the annotation times the time offset from the master media that originated from the synchronization of media files (see Synchronizing video files).
  5. Check Minimal duration per subtitle (in ms.) to specify the minimal display duration of a subtitle. For instance, if an annotation is only 0.3 seconds long, but you want to display a subtitle for at least 0.5 seconds, enter 500 (ms).
  6. Click on Edit Font and Display settings... button. This will bring up this dialog box:

    Figure 1.48. Change subtitle text settings



    • To set the background color and text color of the subtitle text, click on the respective Browse... button and select the color from the dialog displayed.
    • To set the font of the text, click on the respective Browse... button and select a font from the font list.
    • Font size and the alignment of the subtitle text can be selected from their respective lists.
    • Click the Default button to restore the default settings.
    • Click on the Apply button to apply the new settings.

  7. Choose OK to export the clip.
  8. Click on the suggested file name to change the location where the SMIL clip will be saved.
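The timing options in steps 3 to 5 can be summarized in a small calculation. The Python sketch below is a simplified model of those options, not ELAN's implementation. In particular, it assumes that an annotation is exported when it overlaps the selection at all and that a subtitle extended to the minimal duration may run into the next one. The annotation data is invented.

```python
def subtitle_times(annotations, selection=None, from_zero=False,
                   offset_ms=0, min_duration_ms=0):
    """Apply selection, zero-based recalculation, media offset and minimal duration.

    annotations: list of (begin_ms, end_ms, text); selection: optional (start_ms, end_ms).
    """
    result = []
    for begin, end, text in annotations:
        if selection is not None:
            sel_start, sel_end = selection
            if end <= sel_start or begin >= sel_end:
                continue  # outside the selected time interval
            if from_zero:
                begin, end = begin - sel_start, end - sel_start
        begin, end = begin + offset_ms, end + offset_ms
        begin = max(begin, 0)
        if end - begin < min_duration_ms:
            end = begin + min_duration_ms  # lengthen short subtitles
        result.append((begin, end, text))
    return result


annotations = [(1000, 1300, "hi"), (5000, 6000, "bye")]
print(subtitle_times(annotations, selection=(500, 2000), from_zero=True,
                     min_duration_ms=500))
```

In this example the 0.3 second annotation is shifted to start at 500 ms relative to the selection and lengthened to the minimal duration of 500 ms.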

Export SMIL for Quick Time

Exporting SMIL for QuickTime is very much the same as exporting SMIL for Real Player (see Export SMIL for Real Player). To export SMIL for QuickTime, go to File > Export As > QuickTime.... This will bring up a dialog box very similar to the one for Real Player. The only extra option, which is not available for Real Player, is Merge tiers into one QuickTime text file. If selected, all tiers are merged into one file; if not selected, a separate text file is generated for each tier. It is also possible to set a transparent background for the subtitles. This is done by selecting Transparent background in the dialog (see Figure 1.48, “Change subtitle text settings”) which pops up when you click the Edit Font and Display Settings... button. Finally click on OK to export.

QuickTime Text

Another format you can export to from ELAN is QuickTime subtitle text. To do this, go to File > Export As > QuickTime Text.... Select the tiers (see How to select tiers) you want to be included in the subtitles. Optionally specify the following options:

  • Restrict to selected time interval: restrict the subtitles to the current selection.
    • Recalculate the begin time of the selected annotations to start from zero: recalculates the times of the current selection to start from zero.
  • Add master media time offset to annotation times: add to the annotation times the time offset from the master media that originated from the synchronization of media files (see Synchronizing video files).
  • Minimal duration per subtitle (in ms.): specify the minimal display duration of a subtitle. For instance, if an annotation is only 0.3 seconds long, but you want to display a subtitle for at least 0.5 seconds, enter 500 (ms).
  • Merge tiers into one QuickTime text file: if not selected, a separate text file will be generated for each tier.
  • Edit Font and Display Settings...: (see Figure 1.48, “Change subtitle text settings”).
  • Reuse last custom display settings: when ticked, the last used custom font and display settings are automatically applied to the exported text.

Finally click on OK. By default the subtitles are stored in a QTtext .txt file. If you enter a file name with the extension .xml the subtitles are stored in a TeXML - tx3g formatted XML file (the merge tiers option is ignored in that case).

Subtitle Text

Besides the QuickTime subtitle text (see QuickTime Text), ELAN can export annotations to a few other subtitle formats: SubRip (.srt), Spruce (.stl), Timed Text Markup Language (TTML) (.xml) and LRC (.lrc). Click on File > Export As > Subtitle Text... and select the tiers (see How to select tiers) you want to include in the subtitle file. Specify whether the subtitles should be restricted to annotations in the selected time interval, whether the times of the selected interval should be recalculated from zero and whether the master media time offset should be added to the annotation times. A further option lets you specify the minimal display duration of a subtitle. For instance, if an annotation is only 0.3 seconds long, but you want to display a subtitle for at least 0.5 seconds, enter 500 (ms).

Figure 1.49. Export as Subtitles text

Export as Subtitles text


After you have selected tiers and specified the options, click on OK. Enter a file name in the next window and click on Save.
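For readers who want to check an exported file, the following Python sketch shows the general layout of a SubRip (.srt) file: a running index, a time line in the form HH:MM:SS,mmm --> HH:MM:SS,mmm, the text and an empty line. The function names are made up for this example and the exact output written by ELAN may differ in details.

```python
def srt_timestamp(ms):
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(annotations):
    """Turn (begin_ms, end_ms, text) tuples into SRT-formatted text."""
    blocks = []
    for index, (begin, end, text) in enumerate(annotations, start=1):
        time_line = f"{srt_timestamp(begin)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_line}\n{text}\n")
    # Blocks are separated by an empty line.
    return "\n".join(blocks)


print(to_srt([(0, 500, "so from here"), (1200, 2500, "next subtitle")]))
```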

Tiers for recognizers

Tiers for the recognizers are exported in the AVATech tier format. For more information on the AVATech tier format see https://tla.mpi.nl/projects_info/avatech/. Files can be exported as .txt, .csv and .xml.

  1. Select File > Export As > Tiers for Recognizer... menu. This will bring up this dialog box:

    Figure 1.50. Tiers for AVATech recognizers

    Tiers for AVATech recognizers


  2. Check Show only root tiers to show only the top level tiers.
  3. Select the tiers you want to export. Keep CTRL pressed and click to select multiple tiers, press Shift and click to select multiple successive tiers.
  4. Check Restrict to selected time interval if you want to export the current selection. Otherwise the whole media file and associated annotations will be exported.
  5. Check new format to output the tiers to a new, more extensive xml format that supports a separate output scheme of overlapping tiers.
  6. Click OK to export the tiers, then enter a file name for the export and choose the format you want, e.g. txt, csv or xml.

Media clip using script

ELAN supports any command line tool that can extract clips from a video (or audio) file. For that purpose it uses a script file named "clip-media.txt" which can be found in the folder where ELAN is installed. In most cases some configuration needs to be performed in the script file, e.g. which command line tool to use, before clipping can succeed. Therefore ELAN first checks its data folder (see Special ELAN data folder) for the presence of the "clip-media.txt" file, before trying the file in its installation folder. By copying the customized "clip-media.txt" file to the data folder, the changes are accessible to all versions of ELAN.

Mac OS users will have a default execution line in "clip-media.txt" looking like this:

osascript ./scripts/qtp_clip_10_7_export.scpt $in_file $out_file $begin(sec.ms) $end(sec.ms)

This means that an AppleScript script in the "scripts" folder will be executed when clipping media. There is also a pdf file in the ELAN installation folder to help Mac OS users with editing the syntax.

Windows users can e.g. put a copy of ffmpeg.exe (or ffmbc.exe for clipping mp4 files) in the folder where ELAN is installed (or modify the execution line such that the full path to ffmpeg is included). You can find ffmpeg and ffmbc online.

If you want to use the syntax for ffmpeg, remove the # in front of the line starting with 'ffmpeg.exe -i ...'. If you want to use the syntax for ffmbc, remove the # in front of the line starting with 'ffmbc.exe -vcodec copy ...'. Make sure the syntax you do not want to use has a # in front of it; this comments the line out.

The syntax for ffmpeg can be: ffmpeg.exe -i $in_file -vcodec copy -acodec copy -ss $begin(sec.ms) -t $duration(sec.ms) $out_file

ffmpeg.exe : the path of the application

$in_file : specifies the input file

$out_file : output file

-vcodec copy -acodec copy : copy both the video and the audio stream without re-encoding

$begin(sec.ms) : specifies the begin time frame of the clip

$duration(sec.ms) : the duration of the clip.

Look in the script file for more explanation and examples. If it is not possible to edit the script file due to file permissions, copy "clip-media.txt" to the Special ELAN data folder (and modify it to use an absolute path to the clipping application).

A few examples for command line tools are:

C:\ffmpeg.exe -i $in_file -vcodec copy -acodec copy -ss $begin(sec.ms) -t $duration(sec.ms) $out_file

C:\ffmbc.exe -vcodec copy -acodec copy -ss $begin(hour:min:sec.ms) -t $duration(hour:min:sec.ms) -i $in_file $out_file
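To make the role of the placeholders more concrete, the following Python sketch fills in an execution line for a given time selection. It only illustrates the idea and is not how ELAN itself processes "clip-media.txt"; the function names and the exact number formatting (three decimals, zero-padded hours, minutes and seconds) are assumptions.

```python
def format_sec_ms(ms):
    """Format milliseconds as seconds.milliseconds, e.g. 12.500."""
    return f"{ms // 1000}.{ms % 1000:03d}"


def format_hms_ms(ms):
    """Format milliseconds as hour:min:sec.ms, e.g. 00:00:12.500."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def expand_clip_command(template, in_file, out_file, begin_ms, end_ms):
    """Replace the placeholders of an execution line with actual values."""
    duration_ms = end_ms - begin_ms
    values = {
        "$in_file": in_file,
        "$out_file": out_file,
        "$begin(sec.ms)": format_sec_ms(begin_ms),
        "$end(sec.ms)": format_sec_ms(end_ms),
        "$duration(sec.ms)": format_sec_ms(duration_ms),
        "$begin(hour:min:sec.ms)": format_hms_ms(begin_ms),
        "$duration(hour:min:sec.ms)": format_hms_ms(duration_ms),
    }
    command = template
    for placeholder, value in values.items():
        command = command.replace(placeholder, value)
    return command


line = (r"C:\ffmpeg.exe -i $in_file -vcodec copy -acodec copy "
        r"-ss $begin(sec.ms) -t $duration(sec.ms) $out_file")
print(expand_clip_command(line, "in.mp4", "out.mp4", 12500, 15750))
```

The sketch does not quote file names; paths containing spaces would need quoting before the line is passed to a shell.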

To clip a media file first make a time selection and choose File > Export As > Media Clip using Script.... A dialog will appear in which you can set the file name and the location to save the clipped file to. You can specify more options for clipping in the Preferences dialog, see Editing preferences.

Note

If you have more media files to be clipped, typing a file name with an extension in the 'Save as' dialog will use the same extension for all the files that will be clipped. If you want the clipped files to keep the extension of the original media files, do not type an extension after the file name in the 'Save as' dialog.
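The extension rule in this note can be summarized in a few lines of Python. This is only an illustration of the rule as described; the function name is made up and the sketch says nothing about how ELAN names the individual clip files.

```python
from pathlib import PurePath


def clip_extension(typed_name, original_media):
    """Return the extension a clipped file gets under the rule above."""
    typed_suffix = PurePath(typed_name).suffix
    # A typed extension wins; otherwise the original extension is kept.
    return typed_suffix if typed_suffix else PurePath(original_media).suffix


print(clip_extension("clip.mp4", "session.mpg"))
print(clip_extension("clip", "session.wav"))
```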

Image from ELAN Window

To export an image from the ELAN window (i.e. to make a screenshot):

  1. choose File > Export As > Image from ELAN Window...
  2. Enter a file name and an extension (*.jpg, *.jpeg, *.png or *.bmp)
  3. click on Save.

    Note

    If you are using Windows, it sometimes happens that ELAN’s video window is black in the picture created using this function. This can be solved by temporarily disabling the hardware video acceleration:

    1. Right-click on the desktop
    2. Choose Properties
    3. Select the Settings tab
    4. Click on the Advanced… button
    5. Select the Troubleshooting tab
    6. Move the Hardware Acceleration slider to None

    Don’t forget to re-enable the hardware acceleration afterwards, because this has a strong effect on the system’s graphical performance.

Filmstrip Image

Figure 1.51. An exported filmstrip image

An exported filmstrip image


To export a Filmstrip Image first select the time segment you want the filmstrip of. Then click File > Export As > Filmstrip Image.... In the dialog window (see Figure 1.52, “Exporting to a filmstrip image”) you can define the width of each video frame, which frames to include and whether ELAN must add a time code in each frame. Moreover, ELAN can add the waveform, with or without a ruler, and specify the height. You can also specify whether the stereo channel should be displayed separately or merged or blended. Click on OK to generate the image. Finally select a destination folder, enter a file name and click on Save.

An example of an exported filmstrip image can be seen in Figure 1.51, “An exported filmstrip image”.

Figure 1.52. Exporting to a filmstrip image

Exporting to a filmstrip image


Annotation Density Plot Image

This option allows you to save an image of a graphical representation of the density of annotations on selected tiers. This is the same functionality, with the same customization options, as in View > Annotation Density Plot... (see Annotation Density Plot).

Shoebox file

All Shoebox files that were imported into ELAN (see Shoebox file) can be exported back into Shoebox. In this case, the time code information is kept.

To export a file into Shoebox, do the following:

  1. Click on File menu.
  2. Click on Export as > Shoebox file ….

    The Shoebox Export dialog box appears. Make a choice and click on OK to continue.

    Figure 1.53. Shoebox Export dialog window

    Shoebox Export dialog window


    • By selecting Wrap block you can let ELAN wrap a whole block if one of the lines in the block is longer than a specified number of characters (default is 80 characters).
    • By selecting Add master media time offset to annotation times you can add to the annotation times the time offset from the master media that originated from the synchronization of media files (see Synchronizing video files).
  3. Specify the name and directory of the exported file, e.g.:

    Figure 1.54. Name and directory of exported file

    Name and directory of exported file


  4. Click Save to export the file; otherwise click Cancel to exit the dialog box without exporting the file.

    The file is exported as a *.txt | *.sht | *.tbt file.

    If there already exists a file of the same name, ELAN will ask you whether or not it should overwrite the existing file, e.g.:

    Figure 1.55. File Exists

    File Exists


  5. Open the exported file in Shoebox.

    It contains the following information:

    1. All tiers and annotations.

      Each ELAN parent annotation (including all its referring annotations) corresponds to one Shoebox record. E.g., in the illustration below, the ELAN parent annotation “Ligya-001” corresponds to the Shoebox record “Ligya-001”.

    2. The time code information for each parent annotation.

      Each ELAN parent annotation (i.e., each Shoebox record) contains the additional field markers \ELANBegin and \ELANEnd (i.e., the begin and end time of the parent annotation).

      This time code information allows you to import the Shoebox file back into ELAN, without having to manually re-align the file (see Shoebox file).

    Figure 1.56. ELAN file and exported file

    ELAN file and exported file
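If you want to inspect the time code fields of an exported record outside of Shoebox, a few lines of Python are enough. The sketch below is a simplified reader, not a full Shoebox parser: it assumes one field per line and ignores wrapped lines. The record marker \ref, the field \tx and the time values are made up for this example; the actual format of the \ELANBegin and \ELANEnd values is not described here, so they are kept as plain strings.

```python
def parse_record(lines):
    """Collect the fields of one record as a marker -> value mapping."""
    fields = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("\\"):
            continue  # wrapped continuation lines are ignored in this sketch
        marker, _, value = line[1:].partition(" ")
        fields[marker] = value.strip()
    return fields


record = parse_record([
    "\\ref Ligya-001",      # hypothetical record marker
    "\\ELANBegin 1.200",    # made-up time value
    "\\ELANEnd 3.450",      # made-up time value
    "\\tx some text",       # hypothetical transcription field
])
print(record["ELANBegin"], record["ELANEnd"])
```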

Import from

ELAN supports importing files from the formats described in the following sections.

There are also options in ELAN available to import multiple files at once. More details regarding these options can be found here: Multiple file import options

Toolbox file

ELAN supports the import of documents from Toolbox, allowing you to link transcribed and/or interlinearized documents to the time axis of media files. In order to import from Toolbox, you need at least the following two files:

  • the Toolbox file (*.txt, *.sht, *.tbt);
  • the media file(s) (*.mpg, *.mov, *.wav etc.);

Optionally you can use the corresponding Toolbox database type file (*.typ). If this is not available, one has to provide a list with field markers (= tier names).

Note

If you do not know the Toolbox database type file, do the following:

  1. Open the Toolbox *.txt |*.sht |*.tbt file in Toolbox. Make sure it is the active window (click on it to activate it).
  2. Click on Database menu.
  3. Click on Properties …. The Database Type Properties dialog box appears. The name of the database type is displayed in the header, e.g.:

    Figure 1.57. Database type properties dialog window

    Database type properties dialog window


  4. Locate the directory of the database type file (e.g., “texts.typ” in the above illustration). It is probably located in the directory “My Shoebox Settings”.

Importing Toolbox files with a TYP file

To import a Toolbox file into ELAN, do the following:

  1. Click on File > Import > Toolbox File. The Import Toolbox dialog box appears.
  2. Specify the name and directory of the two files, e.g.:

    Figure 1.58. Import Toolbox file

    Import Toolbox file


  3. Like *.eaf documents, the Toolbox file and the media file(s) do not necessarily need to have the same name, and they do not need to be in the same directory (see Basic Information).

    If the Toolbox file contains both aligned (i.e. containing time information) and non-aligned records, the aligned ones will maintain the timing, whereas the location of the non-aligned records will be interpolated automatically.

  4. Click OK to import the file; otherwise click Cancel to exit the dialog box without importing the file.

An ELAN window containing the imported Toolbox file appears.

Importing Toolbox files without a TYP file

Instead of using a Toolbox database type file (*.typ), you can define the field markers yourself when importing a Toolbox file.

  1. Select the Set field markers option and click on the button in the import dialog. The following window appears:

    Figure 1.59. Set Shoebox/Toolbox field markers

    Set Shoebox/Toolbox field markers


  2. Now fill in a field marker as used in the Shoebox/Toolbox *.txt|*.sht |*.tbt file
  3. Optionally select a parent marker (see Basic Information: Annotations, tiers and tier types)
  4. Optionally select a stereotype (symbolic subdivision or association, see Basic Information: Annotations, tiers and tier types)
  5. Choose a character set (Latin-1, SIL IPA or UTF-8) for the tier (only available with Shoebox import! Toolbox charset is UTF-8)
  6. Click on Add.
  7. Repeat steps 2-6 for all field markers.
  8. If the selected marker designates a participant, check the Participant Marker checkbox. If you don’t want the selected marker to be imported, tick Exclude from import.
  9. Finally choose Close and click on OK in the import Shoebox file dialog.

Note

Some markers are already 'built-in' in ELAN and do not need to be set: ELANBegin, ELANParticipant, ELANEnd.

Loading and storing Markers

Once you have manually created a set of field markers, you might want to reuse them later on. ELAN provides support for this:

  • To save a set of field markers, select the Store Markers… button. This will display a save dialog. Enter a file name and click Save.
  • The same way you can open a stored field marker set by clicking on Load Markers…

Figure 1.60. Store markers

Store markers


Connecting the transcription to a media file

Once the import has succeeded, you can add a reference to a media file via the Edit > Linked Files… menu, as described in Changing the links to media files. If the imported Toolbox file was exported from ELAN before, you won’t need to establish the link to the media file(s) again, as in that case the location information is stored in the file.

About the import process

ELAN imports Toolbox files according to the following conventions:

  1. The Toolbox field markers are imported as ELAN tiers. The tier label is identical to that of the field marker, except for the added extension @‘Speaker-ID’.

    This addition is necessary because ELAN and Toolbox differ in how they code information about multiple speakers:

    • In ELAN, each speaker is coded on a separate tier.
    • In Toolbox, all speakers are coded using the same field, and their identity is specified in a separate field.

    Figure 1.61. Toolbox field markers and ELAN tiers

    Toolbox field markers and ELAN tiers


    When importing texts by multiple speakers, ELAN splits each Toolbox field into several ELAN tiers (one for each speaker) and adds the speaker-ID to the tier label.

    If speaker information is not specified in the Toolbox file, the extension @unknown is added.

    The following screenshot illustrates how ELAN treats texts by multiple speakers:

    Figure 1.62. Multiple speakers in ELAN

    Multiple speakers in ELAN


Note that ELAN can only read speaker information if:

  • A marker is defined as a Participant marker in the Set field marker dialog (see Importing Toolbox files without a TYP file above), or if:
  • It is coded in a Toolbox field labelled \EUDICOp or \ELANParticipant (see illustration above). If this field is not present, or if speaker information is coded in a different field, ELAN will assume that there is only one speaker. I.e., if you have multiple speakers and if you want ELAN to assign them to separate tiers, do the following:
    1. For every Toolbox record, add the field marker \EUDICOp.
    2. For every Toolbox record, enter the relevant speaker-ID into this field.
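The way field markers and speaker information combine into tier names can be sketched in Python as follows. This is an illustration of the naming convention only, not ELAN's import code. The markers ref and tx and the speaker-ID K are made up for this example, and time code markers such as ELANBegin and ELANEnd are left out for brevity.

```python
PARTICIPANT_MARKERS = ("ELANParticipant", "EUDICOp")


def group_by_tier(records):
    """Group field values into tiers named marker@speaker."""
    tiers = {}
    for record in records:
        # Use the first participant field that is present and not empty.
        speaker = next(
            (record[m] for m in PARTICIPANT_MARKERS if record.get(m)),
            "unknown",
        )
        for marker, value in record.items():
            if marker in PARTICIPANT_MARKERS:
                continue  # the participant field itself is not a tier
            tiers.setdefault(f"{marker}@{speaker}", []).append(value)
    return tiers


records = [
    {"ref": "test 001", "tx": "hello", "EUDICOp": "K"},
    {"ref": "test 002", "tx": "hi"},  # no speaker information
]
for name, values in group_by_tier(records).items():
    print(name, values)
```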

Note

When the file is exported back to Toolbox (see Toolbox file(UTF-8)), the extension @‘Speaker-ID’ is automatically dropped from the field marker, and the Toolbox records are sorted according to their record marker (e.g., in the above illustration, “test 001” is sorted before “test 002” etc.)

  2. Based on the information contained in the Toolbox database type file, the tiers are brought into a hierarchical relationship and are assigned to tier types (see Basic Information: Annotations, tiers and tier types for details of tier hierarchies and tier types). For every tier name a corresponding tier type with the same name is created. All of these tier types are connected with a stereotype in such a way that it fits with the original Toolbox structure.
    • The Toolbox record marker is assigned to the stereotype None, i.e., it is an independent, time-alignable parent tier.
    • The transcription and parsing fields of Toolbox are assigned to the stereotype Symbolic Subdivision, i.e., they are referring tiers that can be subdivided into smaller units.
    • All other fields are assigned to the stereotype Symbolic Association, i.e., they are referring tiers that cannot be subdivided into smaller units.

If you define the markers yourself, then there also is the possibility to choose the Time Subdivision stereotype. For example:

Figure 1.63. Time Subdivision

Time Subdivision


  3. If you import a Shoebox record, all SIL IPA characters are converted into Unicode characters during import. If you export the file back into Shoebox (see Shoebox file), the Unicode characters will be converted back into SIL IPA characters. This does not apply to Toolbox records.
  4. Initially, unless it already contains time code information, the imported Toolbox file does not contain information about timing. Instead, ELAN automatically assigns each Toolbox record to a three-second time interval, as in the following illustration:

    Figure 1.64. Fixed time intervals

    Fixed time intervals
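The fixed intervals can be pictured with a short Python sketch. It assumes that the records are placed one after another starting at time zero, which is what the illustration suggests but is not stated explicitly; the function name is made up.

```python
def default_intervals(record_count, interval_ms=3000):
    """Return consecutive (begin_ms, end_ms) intervals of equal length."""
    return [(i * interval_ms, (i + 1) * interval_ms)
            for i in range(record_count)]


print(default_intervals(3))
```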


The time alignment has to be done manually for each Toolbox record. Do the following:

  1. Activate the Bulldozer mode: Click on Options > Propagate Time Changes > Bulldozer Mode (see Activating and deactivating the Bulldozer mode or Shift mode for the three available modes).

    Note

    If you do not activate the Bulldozer mode, you will inadvertently overwrite and thereby delete existing annotations. Make sure that Bulldozer Mode is enabled in the Options > Propagate Time Changes menu.

  2. Click on the first annotation on the parent tier (i.e., the first Shoebox record). It appears in a dark blue frame.
  3. Modify the boundaries of that annotation, so that they are aligned with the correct time interval (see Changing the boundaries of an existing selection and annotation for ways of modifying boundaries).
  4. Press CTRL+ENTER to apply the new time interval.

    The parent annotation (together with all its referring annotations) is assigned to the new time interval. All other parent annotations are moved to the right.

  5. Repeat steps 2 to 4 for each parent annotation.

The following screenshot illustrates steps 1 to 4:

Figure 1.65. Time alignment

Time alignment


After you have done the time-alignment, you can export the file back to Toolbox – in this case, the time code information will be kept (see Toolbox file(UTF-8)). If you then re-import the file back into ELAN, ELAN automatically assigns the Shoebox records to their correct time intervals.

An imported Toolbox file can be saved as an ELAN file (see Re-open recently accessed files), exported back into Shoebox (see Toolbox file(UTF-8)), or exported as a tab-delimited text (see Tab-delimited text file).

Fieldworks Language Explorer (FLEx) file

ELAN can import documents from the SIL Fieldworks Language Explorer (FLEx). This involves a few steps:

  1. Click File > Import > FLEx File.... Select the .flextext file and relevant media files by clicking the ...-buttons.
  2. In the import window select the .flextext file exported from FLEx. Optionally also add media files here (if not already in your .flextext file). There are options to exclude the interlinear-text and paragraph elements from the import, as well as the option to import participant information. When the word element is selected as the smallest time-alignable element, the time alignment for that level will be lost when exporting to FLEx again, because in .flextext time alignment is stored on the phrase level.
  3. It is possible to have tier types created simply for all major elements (phrase, word, morph etc.) or, more fine-grained, for each combination of major element plus item type up to a combination of major element, the type and the language.
  4. Finally, set a duration per phrase element in milliseconds. This has to be set if the FLEx export files do not contain timestamps. When importing a FLEx file that was edited in ELAN before and exported as a .flextext file, time duration information has already been stored in the file.

Figure 1.66. Import FLEx file

Import FLEx file

Figure 1.67. FLEx to ELAN structure

FLEx to ELAN structure

The tier structure created after import in ELAN is roughly like in the example above. The mapping of the FLEx structure onto ELAN tiers follows the schema <Speaker>_<element>-<item-type>-<language>, where the Speaker prefix is a generic label (A, B, C, ...).
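As an illustration of this naming schema, the following Python sketch builds a tier name from its parts. The function names are made up and the language code "en" is only an example value; the sketch supports at most 26 speakers.

```python
import string


def speaker_label(index):
    """Return the generic speaker label for a zero-based index (A, B, C, ...)."""
    return string.ascii_uppercase[index]


def flex_tier_name(speaker, element, item_type, language):
    """Build a tier name following <Speaker>_<element>-<item-type>-<language>."""
    return f"{speaker}_{element}-{item_type}-{language}"


print(flex_tier_name(speaker_label(0), "word", "txt", "en"))
```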

FLEx tiers and their representation in .flextext:

Word          <word>   <item type="txt">
Morphemes     <morph>  <item type="txt">
Lex. Entries  <morph>  <item type="cf">
              <morph>  <item type="hn">
Lex. Gloss    <morph>  <item type="gls">
Lex. Gram.    <morph>  <item type="msa">
Word Gloss    <word>   <item type="gls">
Word Cat.     <word>   <item type="pos">

Note

On the third-party resources page of ELAN (https://tla.mpi.nl/tools/tla-tools/elan/thirdparty/ ), you can find a teaching-set which covers the aspects of importing from FLEx to ELAN and back to FLEx.

CHAT file

It is possible to import CHAT files (used in e.g. the CHILDES project) into ELAN:

  1. Select File > Import > CHAT File …
  2. Select the Chat file
  3. Click on Open

Some remarks about this import feature:

  • supported are old CHAT files and CHAT-UTF8, not XML CHAT
  • existing media alignment in %snd tiers is maintained in ELAN:
    • when no media alignment is present at all, each CHAT utterance gets a default interval of 1 second assigned
    • when partial media alignment is present, the time interval is equally distributed over preceding unaligned utterances
    • overlapping utterances of the same participant are corrected as well as possible
    • CHAT dependent tier names are mapped to ELAN Tier Types
    • ELAN tier names are either CHAT participant labels or CHAT tier names, followed by '@participantName'

Remaining issues:

  • '<' and '>' characters in CHAT cause parsing errors when the imported file is saved as EAF file

Transcriber files

The feature to import Transcriber annotation files into ELAN works as follows:

  1. Choose File > Import > Transcriber File …
  2. Select the transcriber file (*.trs) and click on Open
  3. If the associated sound file cannot be found, a dialog will be shown asking you to locate it. When this request is cancelled, one can choose to open the annotation file without the sound, or to stop the whole import process.

The transcriber tiers will be mapped on the ELAN equivalents:

CSV / Tab-delimited Text files

A CSV (Comma Separated Values) or Tab-delimited Text (or Tab Separated Values) file is a text file in which one can identify rows and columns. Rows are represented by the lines in the file and the columns are created by separating the values on each line by a specific character, like a comma or a tab. CSV or Tab-delimited Text files can be compared to spreadsheets like the ones in Microsoft Excel in that they also have rows and columns. Note that .csv files can be created by Excel.

Take a look at Figure 1.68, “Tab-delimited Text”. The first row represents the event of a person saying 'so from here'. The first value (as well as the first column of the complete file) represents the tier name, the second and third represent the begin time in different formats, the fourth and fifth represent the end time, the sixth and seventh represent the duration and the last value represents the annotation.

Figure 1.68. Tab-delimited Text

Tab-delimited Text


You are able to import CSV or Tab-delimited Text files in ELAN: File > Import > CSV / Tab-delimited Text File.... In the dialog window browse to and select a file that contains CSV or Tab-delimited data and click Open.

The second dialog window contains two sections (see Figure 1.69, “Import CSV / Tab-delimited Text”). The upper section shows a sample table containing data from the selected file. Both rows and columns are numbered. The lower section enables you to specify which columns to include and what data type they represent. This means that the format of the files is flexible: it is not prescribed what data is expected nor how it is formatted. The numbers of the columns in the Import Options section correspond to the numbers of the columns in the sample table. The data types you can select are:

  • Annotation
  • Tier
  • Begin time
  • End time
  • Duration

Select at least one column with data type 'Annotation'. If you select a column for begin time, end time and duration, the latter will be ignored in the import process.

Figure 1.69. Import CSV / Tab-delimited Text

Import CSV / Tab-delimited Text

The option Specify first row of data enables you to exclude a header by excluding the first few lines. The option Specify delimiter lets you specify the delimiter if ELAN did not guess the correct delimiter. The delimiters supported by ELAN are comma, tab, colon, semi-colon and the vertical line (vertical bar).

If you enable the option Default annotation duration ELAN creates all annotations from the selected file with durations equal to the number of milliseconds specified. This option works only if there is no time data or only the begin or end times.

Default annotation duration will create annotation units with the specified duration.
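The interplay between begin time, end time, duration and the default annotation duration can be sketched in Python as follows. This is one plausible reading of the rules described above, not ELAN's actual code; the function name is made up, and the case without any time data is left to the caller because the placement of such annotations is not described here.

```python
def resolve_interval(begin=None, end=None, duration=None,
                     default_duration=1000):
    """Derive (begin_ms, end_ms) from whatever time columns are available."""
    if begin is not None and end is not None:
        # Begin and end are both known: a duration column is ignored.
        return begin, end
    length = duration if duration is not None else default_duration
    if begin is not None:
        return begin, begin + length
    if end is not None:
        return end - length, end
    return None  # no time data at all


print(resolve_interval(begin=1000, end=2500, duration=9999))
print(resolve_interval(begin=1000, duration=400))
print(resolve_interval(end=2000, default_duration=500))
```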

Skip empty cells will leave out the cells in the csv that are empty. Different tiers can be imported with different segmentations with this option.

Finally click OK to import the data. If a transcription document was open when starting the import, the imported tiers and annotations will be added to the already open document, otherwise a new transcription document is created with the imported annotations as its contents.

Another example

To demonstrate that the format of the imported file can be flexible, take a look at the following tab-delimited text:

Figure 1.70. Tab-delimited text, different orientation

Tab-delimited text, different orientation


In this example each column represents a tier with the tier names in the first row and the annotation in the other rows. This file can be imported by selecting the following import options:

Figure 1.71. Import CSV / Tab-delimited Text

Import CSV / Tab-delimited Text


Note that the Specify first row of data option is set to 2. As a consequence ELAN starts importing annotations from row 2 instead of row 1. Furthermore, ELAN tries to extract tier names from the first line of the file if the column they are part of is specified as 'annotation'. In this example this results in two tiers: K-Spch and W-Spch.
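To see how such a column-oriented file maps onto tiers, consider the following Python sketch. It mimics the options described above (tier names from the first row, a configurable first data row, skipping empty cells) but is not ELAN's import code. The function name and the sample annotations are made up for this example.

```python
import csv
import io


def read_tier_columns(text, delimiter="\t", first_data_row=2, skip_empty=True):
    """Read a file in which every column is a tier named in the first row."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    names = rows[0]
    tiers = {name: [] for name in names}
    # Row numbers are 1-based, as in the import dialog.
    for row in rows[first_data_row - 1:]:
        for name, cell in zip(names, row):
            if skip_empty and not cell.strip():
                continue
            tiers[name].append(cell)
    return tiers


sample = "K-Spch\tW-Spch\nso from here\t\n\tyes\n"
print(read_tier_columns(sample))
```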

To merge a CSV file with an existing *.eaf file, open the *.eaf file first and then choose Import CSV/Tab-delimited Text File. For information on merging a CSV file that has been imported into a new document with an existing *.eaf file, please see Merging transcriptions.

Subtitle / Audacity Label file

It is possible to import subtitles that are stored in the SubRip *.srt format: File > Import > Subtitle / Audacity Label File.... HTML and similar formatting tags are filtered out and multiple speakers are merged into one. The correct encoding of the file has to be specified in the import window.

Audacity Label files are a specific kind of tab-delimited text (*.txt) files. They can be imported here without the configuration step that is part of the general Import CSV/Tab-delimited Text File import.

If this import is started when a document is already open, the imported content is added to that transcription. Otherwise a new transcription document is created.
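An Audacity Label file contains one label per line, with the start time and end time in seconds and the label text separated by tabs. The following Python sketch reads such content into millisecond intervals. It is meant as an illustration of the file format, not of ELAN's import; the function name is made up.

```python
def parse_audacity_labels(text):
    """Parse Audacity label lines into (begin_ms, end_ms, label) tuples."""
    annotations = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        try:
            begin = float(parts[0])
            end = float(parts[1])
        except ValueError:
            continue  # skip lines that do not start with two numbers
        label = parts[2] if len(parts) > 2 else ""
        annotations.append((round(begin * 1000), round(end * 1000), label))
    return annotations


sample = "0.500000\t1.250000\thello\n2.000000\t2.000000\tpoint\n"
print(parse_audacity_labels(sample))
```

A label with identical start and end time is a point label in Audacity; the sketch keeps it as a zero-length interval.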

Praat TextGrid file

ELAN offers the possibility to import a Praat TextGrid file: click on File > Import > Praat TextGrid File.... In the dialog window that now appears, you can browse to the file you wish to import. You are also able to include Praat PointTiers. When selecting this option, specify the default PointTiers annotation duration in milliseconds. Finally, check Skip empty intervals / annotations if you want to do so.

If an annotation document is already open in ELAN, the imported TextGrid is added to the document in one or more new tiers. If no annotation document is open, a new document consisting of the TextGrid data is generated.

In addition to TextGrid files in the default encoding for the operating system, ELAN supports Praat TextGrid files with UTF-8 and UTF-16 encoding.

WebAnnotation JSON file

It is possible to import a WebAnnotation JSON file via File > Import > WebAnnotation JSON File.... The file extension is .json or .jsonld. There are no configuration options. The contents of the file should comply with the W3C Web Annotation Data Model specifications, even though the import function only supports a subset of those specifications (those elements that map quite naturally to ELAN elements).
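For orientation, the Python sketch below contains a small annotation that follows the W3C Web Annotation Data Model, with a textual body and a media fragment selector for the time interval, and shows how the interval can be read from it. The identifiers and URLs are placeholders. Whether ELAN's import accepts exactly this shape is an assumption, since the supported subset is not listed here.

```python
import json

SAMPLE = """{
  "@context": "http://www.w3.org/ns/anno.jsonld",
  "id": "http://example.org/anno1",
  "type": "Annotation",
  "body": {"type": "TextualBody", "value": "so from here"},
  "target": {
    "source": "http://example.org/media/session.mp4",
    "selector": {
      "type": "FragmentSelector",
      "conformsTo": "http://www.w3.org/TR/media-frags/",
      "value": "t=1.5,2.75"
    }
  }
}"""


def annotation_interval(anno):
    """Return (begin_ms, end_ms, text) for an annotation with a t= fragment."""
    value = anno["target"]["selector"]["value"]
    if not value.startswith("t="):
        raise ValueError("not a temporal media fragment")
    begin, end = value[2:].split(",")
    text = anno["body"]["value"]
    return round(float(begin) * 1000), round(float(end) * 1000), text


print(annotation_interval(json.loads(SAMPLE)))
```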

Tiers from recognizer

Importing tiers from recognizers will import the tiers into a new file if there is no file currently open in ELAN. If a file is open, the tiers will be added to the currently open file. To import the tiers from recognizers, go to File > Import > Tiers from Recognizer.... Selecting this option first prompts for the import file. If no file is open, the tiers are directly imported into a new file. If a file is already open, a 'Create tiers from segments' dialog appears. For more information about this dialog see Figure 2.14, “Silence Recognizer”.

Shoebox file

Importing a document from Shoebox is very much the same as importing a document from Toolbox (see Toolbox file). As with the Toolbox import, information about the tier relations can be provided by means of a .typ file or by using a marker file.

When reconstructing the vertical alignment of words on interlinearized markers, the position is recalculated based on the number of bytes per character. But in some files this leads to incorrect alignment, therefore this recalculation can be turned off by unchecking Correct alignment based on the number of bytes per character. This import also tries to take non-spacing characters into account.



[2] Synpathy is a tool for annotating, analyzing, and graphically editing the syntactical structure of sentences (e.g. linguistically annotated text corpora), developed at the Max Planck Institute for Psycholinguistics. The application is based on the SyntaxViewer from the TIGER search project developed by the IMS (Institut für Maschinelle Sprachverarbeitung, University of Stuttgart).

[3] For a description of this standard and players see http://www.w3.org/AudioVideo/