When: | June 14 2022, 10:00-16:00 CEST |
Where: | Zoom |
Organisation: | DG CNECT; the European Language Resource Coordination (ELRC) consortium |
Please read the ELRC Data Protection Notice.
SMART 2019/1083 (CEF Automated Translation, Contracting Authority DG CNECT)
Live virtual workshop (Zoom), 14 June 2022
About the Workshop:
A language model (LM) is a tool that incorporates information useful for understanding a language, such as its vocabulary and how it expresses meaning. Using deep learning, the current state of the art in AI, academic or commercial organisations with heavy computing infrastructure build LMs from large amounts of data (text or speech). Other organisations can then take such a pre-trained LM as the basis for training additional models for their specific applications (e.g. automated translation, summarisation, dialogue interaction, speech recognition) or domains. “Specialising” a pre-trained LM in this way requires much less data and computing power than building the large model itself. As a result, this type of specialisation is having a large impact on the field of natural language processing.
This online workshop, organised by DG CNECT’s Multilingualism sector, will inform staff of EU institutions and Member State public administrations about various aspects of pre-trained LMs:
- how to make use of LMs available from repositories;
- how to specialise multilingual LMs, for instance for automated translation;
- how to leverage LMs in specific use cases, within public administrations and industry;
- how to take into account legal aspects of LMs.
The event is part of a series of technical workshops in the ELRC project (SMART 2019/1083, http://lr-coordination.eu), which supports the continued development of the eTranslation system and a wider deployment of the DG's services in terms of language resources and tools.
AGENDA
10:00 | Welcome by Tom Vanallemeersch (SMART 2019/1083 project representative) |
10:10 | Georg Rehm (DFKI, Berlin): Large Language Models and European Language Equality – Where do we stand and what do we need to do? |
10:50 | Nils Reimers (Hugging Face, Frankfurt): Cross-Lingual Semantic Search |
11:20 | Alexandra Chronopoulou (Ludwig Maximilian University, Munich): Improving Multilingual Machine Translation with Language-family Adapters |
11:50 | Denis Jouvet (MULTISPEECH, Inria/LORIA, Nancy): Language Models for Speech Recognition |
12:20 | Discussion |
12:35 | Lunch |
13:50 | Love Börjeson (KBLab, National Library of Sweden): Large-scale, Open-access AI Models for Swedish |
14:20 | Jakub Zavrel (Zeta Alpha, Amsterdam): A New Generation of Neural Search and Knowledge Discovery Tools |
14:50 | Khalid Choukri, Mickael Rigault (ELDA, Paris): Legal Aspects of Language Models |
15:20 | Discussion |
15:50 | Conclusion |
16:00 | End |