Note: this recognizer runs on a web server; the video file will be uploaded using HTTP.
How to use the Hand Head Tracker from within ELAN
The Hand Head Tracking recognizer by the Fraunhofer Heinrich Hertz
Institute (HHI) allows you to analyze video recordings to support
sign and gesture annotation. It tracks hand and head movements and
outputs annotations about basic events.
- Input: A video recording in an FFmpeg-supported format; most
common video file formats are supported. Note that high-resolution
videos take longer to process, while features in low-resolution
files can be harder to recognize. Files should have a
stable background and view: avoid movements in the background
and camera movements.
- Input tiers: The recognizer supports skin tone and
range information input, which is often better than letting the
recognizer use automatic detection. A free Windows desktop tool
is available to let you pick the right range for your
video.
- Settings:
  - Change threshold influences how sensitive the
    recognizer is to changes in the picture
  - Logging level lets you pick the amount of log
    messages to produce: verbose messages can slow down
    processing but can help to debug problems
  - Speed threshold selects how large hand movements
    must be to be annotated as movements
  - Background image selects whether the background
    should be excluded from analysis (recommended)
  - Use rest position enables a non-background area
    around the lap for hands resting between movements
  - Video resolution allows you to influence the size
    of the generated output video, as it is often
    not necessary to have that in full resolution
  - Use single cluster influences how many movement
    clusters in the video should trigger processing
- Output: This recognizer creates a single bundle
of tiers, annotating various events in the input video. It
can also create an output video (*.mp4) to show the input
as 'seen' by the recognizer, with hand / head highlighting.
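To illustrate what a setting such as the speed threshold does, here is a minimal sketch in Python. It is not the HHI recognizer's algorithm: the function name, the input format (one hand position per frame) and the unit (pixels per second) are all assumptions made for this example. It only shows the general idea of turning per-frame movement into time-aligned annotations once a threshold is exceeded.

```python
from math import hypot


def movement_segments(positions, fps, speed_threshold):
    """Return (start_ms, end_ms) spans in which the hand moves fast enough.

    positions: list of (x, y) hand positions in pixels, one per frame
               (hypothetical input format).
    fps: frames per second of the video.
    speed_threshold: minimum speed in pixels per second (assumed unit).
    """
    segments = []
    start = None  # index of the frame at which the current movement began
    for i in range(1, len(positions)):
        (x0, y0), (x1, y1) = positions[i - 1], positions[i]
        # Distance covered between two frames, scaled to pixels per second.
        speed = hypot(x1 - x0, y1 - y0) * fps
        moving = speed > speed_threshold
        if moving and start is None:
            start = i - 1
        elif not moving and start is not None:
            segments.append((round(start * 1000 / fps),
                             round((i - 1) * 1000 / fps)))
            start = None
    if start is not None:
        # The movement lasts until the final frame.
        segments.append((round(start * 1000 / fps),
                         round((len(positions) - 1) * 1000 / fps)))
    return segments
```

In this sketch, raising the threshold means that only larger movements produce annotations, which matches the description of the setting above.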
AVATecH and AUVIS compatible recognizers have the following
categories of settings, input and output elements:
- input media: ELAN automatically uses the first suitable media
file of your current annotation session, but you can change
that to other supported files belonging to the session. Very
few recognizers expect multiple input media files or extra
input files in 'timeseries' or recognizer-specific formats.
- input tiers: Some recognizers need input in the form of an
annotation tier, for example to select timespans of interest.
For some recognizers, the input is expected to be the output
of another recognizer. This gives you a chance to edit and
correct data - often simply tiers - between the two steps.
- numerical input: Recognizers can be configured through
numerical 'knobs'. ELAN can show those as a slider or a field.
Recognizers often work well enough with their defaults.
- choice input: Recognizers can give you the option
to select settings from a pre-defined list. An example can
be 'verbose/normal/silent' messages or 'high/low' sensitivity.
ELAN shows drop-down selectors here. In special cases, a
recognizer can also have 'any text' configuration items.
- output: Recognizers often produce one or more annotation
tiers. ELAN will offer to add those to your annotation
session as new tiers. It is also possible for recognizers
to output timeseries (which ELAN can show as curves) or
even audio, video or other files. Most recognizers only
produce zero or more tiers (plus log messages) as output.
It is often possible to selectively skip some output steps.
- log: You can open a window showing general messages from
the recognizer, tagged by type (e.g. DEBUG, INFO, WARN,
ERROR, RESULT or PROGRESS). Messages of higher priority
also update the processing status display, so they can
be seen directly without having to review the log text.
- basic or advanced recognizer settings: ELAN gives you
the choice to either hide or show 'advanced' settings. Default
values will be used for those settings which are hidden.
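The handling of tagged log messages described in the list above can be sketched as follows. This is an illustration only, not ELAN's implementation: the 'TYPE: message' line format is an assumption, and so is the choice of which types count as 'higher priority' and update the status display (the text does not say which ones do).

```python
# Message types named in the description above.
LOG_TYPES = ("DEBUG", "INFO", "WARN", "ERROR", "RESULT", "PROGRESS")

# Assumption: these types are treated as 'higher priority' in this sketch.
STATUS_TYPES = {"WARN", "ERROR", "RESULT", "PROGRESS"}


def parse_log_line(line):
    """Split an assumed 'TYPE: message' line into (type, message).

    Lines without a known type tag are kept as INFO messages.
    """
    tag, sep, message = line.partition(":")
    tag = tag.strip().upper()
    if sep and tag in LOG_TYPES:
        return tag, message.strip()
    return "INFO", line.strip()


def process_log(lines):
    """Collect all messages and track the latest status-worthy one."""
    log = []
    status = None
    for line in lines:
        entry = parse_log_line(line)
        log.append(entry)  # every message ends up in the log window
        if entry[0] in STATUS_TYPES:
            status = entry[1]  # only some messages update the status display
    return log, status
```

For example, feeding the lines 'INFO: starting' and 'PROGRESS: 50%' to process_log would store both in the log, while only the second would change the status.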
Your default ELAN configuration invokes a CLAM REST web service
wrapper on catalog.clarin.eu to have your files analyzed.
In other words, your media files and, if applicable, input tiers will
be uploaded for processing and ELAN will process the downloaded (tier
or other) results as if you had done the processing locally. For
situations where a web service cannot be used (files too large or
no internet connection), you can also request a copy of the recognizer
for local installation on Linux or Windows, protected by a USB dongle.
For this and for general support with the use of this recognizer,
please contact auvis@mpi.nl or use
the ELAN and AUVIS forums on the website of
The Language Archive.
CLAM, ELAN and the client-side recognizer proxy are free open source
software under the GNU General Public License; however, some of the
recognizers can be proprietary closed source software. Licenses for academic use are
available on request. Use of the web services is free at the moment,
but may be limited to the academic community if it becomes necessary.