Abstracts are invited for a full-day workshop on sign language resources, to take place following the 2018 LREC conference. Recent technological developments allow sign language researchers to create relatively large video corpora of sign language use that were unimaginable twenty years ago. Several national projects are currently underway, and more are planned. This workshop aims to share experiences from current and past efforts: What problems were encountered, what solutions were created, and what linguistic decisions were taken?
For 2018, the hot topic is “Involving the Language Community”: When developing any language technology, it is important to involve the language community, as the potential user group of that technology, as early as possible. This is even more true for minority languages such as sign languages.
Most current sign language corpus projects have collected their own data, either by inviting language community members to their video studios or by visiting community members in their living rooms with video cameras. Recent approaches try to profit from the internet becoming a central part of most citizens’ lives, by using pre-existing sign language data on the web, or by using social media channels to get people involved, either for the main data collection or for second-wave collections to fill remaining gaps or to ask for votes. How do these approaches combine, and what are the implications for the corpora built? As both annotation and translation of sign language data are very time-consuming and thus expensive, it is tempting to get volunteers from the language community involved in these tasks as well. How far do you get with such crowd or community sourcing approaches? How can a meaningful and economical quality control be established? And what is the return on investment for the community?
Many corpus projects follow an evolutionary approach, publishing data in phases over the course of the project. How can these projects profit from user feedback on data already published, and what workflows are needed to deal with feedback in a timely manner? How do we establish “virtual focus groups” to discuss upcoming sign language technology user interfaces (such as avatar technology) with interested members of the language community?
Aside from the “hot topic”, the workshop will continue to cover progress on more general issues in research on sign language corpora and tools.
We invite abstracts for 20-minute papers or posters (with or without demonstrations) on the following topics:
INVOLVING THE LANGUAGE COMMUNITY
“Internet as a Corpus” for sign languages
Online voting systems for signers
Crowd and community sourcing for corpus work
Virtual focus groups
Future sign language technology user interfaces such as avatar technology
Ethical and legal aspects
GENERAL ISSUES ON SIGN LANGUAGE CORPORA AND TOOLS
Avatar technology as a tool in sign language corpora and corpus data feeding into advances in avatar technology
Experiences in building sign language corpora
Elicitation methodology appropriate for corpus collection
Proposals for standards for linguistic annotation or for metadata descriptions
Experiences from linguistic research using corpora
Use of (parallel) corpora and lexicons in translation studies and machine translation
Language documentation and long-term accessibility for sign language data
Video compression and streaming for sign language
Annotation and visualization tools
Linking corpora and lexicons and integrated presentation of corpus and dictionary contents
Computer recognition of sign language and steps towards automatic annotation
Sign language corpus mining
In the tradition of LREC, oral/signed presentations and poster presentations (with or without demonstrations) have equal status, and authors are encouraged to suggest the presentation format best suited to communicate their ideas. Papers (4, 6 or 8 pages) for both oral/signed presentations and poster presentations at this workshop will be published as workshop proceedings on the conference website.
Please submit your abstract through the LREC START system (URL tbc) no later than Jan 10th, 2018, indicating whether you prefer an oral/signed or a poster presentation. In the latter case, please also indicate whether you plan to combine the poster with a demo.
Please note that the submission deadlines for both abstracts and papers are substantially earlier than for previous LREC workshops. The earlier schedule is intended to give participants enough time to apply for travel funding.
IDENTIFY, DESCRIBE AND SHARE YOUR LRS!
Describing your LRs in the LRE Map is now a normal practice in the submission procedure of LREC (introduced in 2010 and adopted by other conferences). To continue the efforts initiated at LREC 2014 about “Sharing LRs” (data, tools, web-services, etc.), authors will have the possibility, when submitting a paper, to upload LRs in a special LREC repository. This effort of sharing LRs, linked to the LRE Map for their description, may become a new “regular” feature for conferences in our field, thus contributing to creating a common repository where everyone can deposit and share data.
As scientific work requires accurate citations of referenced work so as to allow the community to understand the whole context and also replicate the experiments conducted by other researchers, LREC 2018 endorses the need to uniquely identify LRs through the use of the International Standard Language Resource Number (ISLRN, www.islrn.org), a Persistent Unique Identifier to be assigned to each Language Resource. The assignment of ISLRNs to LRs cited in LREC papers will be offered at submission time.