Michael Filhol


2023

Example-Based Machine Translation from Text to a Hierarchical Representation of Sign Language
Elise Bertin-Lemée | Annelies Braffort | Camille Challant | Claire Danet | Michael Filhol
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

This article presents an original method for Text-to-Sign Translation. It compensates for data scarcity using a domain-specific parallel corpus of alignments between text and hierarchical formal descriptions of Sign Language videos. Based on the detection of similarities present in the source text, the proposed algorithm recursively exploits matches and substitutions of aligned segments to build multiple candidate translations for a novel statement. This helps preserve Sign Language structures as much as possible before falling back on literal translations too quickly, in a generative way. The resulting translations are in the form of AZee expressions, designed to be used as input to avatar synthesis systems. We present a test set tailored to showcase its potential for expressiveness and generation of idiomatic target language, and observed limitations. This work finally opens prospects on how to evaluate this kind of translation.

Une grammaire formelle pour les langues des signes basée sur AZee : une proposition établie sur une étude de corpus [A Formal Grammar for Sign Languages Based on AZee: a Proposal Grounded in a Corpus Study]
Camille Challant | Michael Filhol
Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 2 : travaux de recherche originaux -- articles courts

This article offers initial reflections on the development of a formal grammar for sign languages, based on the AZee approach. We carried out a statistical study on a corpus of AZee expressions describing French Sign Language discourse. This lets us glimpse constraints on these expressions, which more generally reflect the constraints of French Sign Language. We present a few of these constraints and position our draft grammar theoretically among existing formal grammars.

Traduction à base d’exemples du texte vers une représentation hiérarchique de la langue des signes [Example-Based Translation from Text to a Hierarchical Representation of Sign Language]
Elise Bertin-Lemée | Annelies Braffort | Camille Challant | Claire Danet | Michael Filhol
Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 4 : articles déjà soumis ou acceptés en conférence internationale

This article presents an experiment in machine translation from text to Sign Language (SL). Since no large aligned corpus is available to us, we explored an example-based approach using AZee, an intermediate representation of SL discourse in the form of hierarchical expressions.

Éditeur logiciel pour une représentation graphique de la langue des signes française [Software Editor for a Graphical Representation of French Sign Language]
Michael Filhol | Thomas Von Ascheberg
Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles (TALN), volume 5 : démonstrations

Demonstration of editing software for AZVD, a graphical formalism for French Sign Language. Based on spontaneous productions by language users, AZVD is designed to maximise its adoptability by the signing community.

2022

A First Corpus of AZee Discourse Expressions
Camille Challant | Michael Filhol
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper presents a corpus of AZee discourse expressions, i.e. expressions which formally describe Sign Language utterances of any length using the AZee approach and language. The construction of this corpus had two main goals: a first reference corpus for AZee, and a test of its coverage on a significant sample of real-life utterances. We worked on productions from an existing corpus, namely the “40 breves”, containing an hour of French Sign Language. We wrote the corresponding AZee discourse expressions for the entire video content, i.e. expressions capturing the forms produced by the signers and their associated meaning by combining known production rules, a basic building block for these expressions. These are made available as a version 2 extension of the “40 breves”. We explain the way in which these expressions can be built, present the resulting corpus and set of production rules used, and perform first measurements on it. We also propose an evaluation of our corpus: for one hour of discourse, AZee can describe 94% of it, while ongoing studies are increasing this coverage. This corpus offers a lot of future prospects, for instance concerning synthesis with virtual signers, machine translation or formal grammars for Sign Language.

Rosetta-LSF: an Aligned Corpus of French Sign Language and French for Text-to-Sign Translation
Elise Bertin-Lemée | Annelies Braffort | Camille Challant | Claire Danet | Boris Dauriac | Michael Filhol | Emmanuella Martinod | Jérémie Segouat
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This article presents a new French Sign Language (LSF) corpus called “Rosetta-LSF”. It was created to support future studies on the automatic translation of written French into LSF, rendered through the animation of a virtual signer. An overview of the field highlights the importance of a quality representation of LSF. In order to obtain quality animations understandable by signers, it must surpass the simple “gloss transcription” of the LSF lexical units to use in the discourse. To achieve this, we designed a corpus composed of four types of aligned data, and evaluated its usability. These are: news headlines in French, translations of these headlines into LSF in the form of videos showing animations of a virtual signer, gloss annotations of the “traditional” type—although including additional information on the context in which each gestural unit is performed as well as their potential for adaptation to another context—and AZee representations of the videos, i.e. formal expressions capturing the necessary and sufficient linguistic information. This article describes this data, exhibiting an example from the corpus. It is available online for public research.

Representation and Synthesis of Geometric Relocations
Michael Filhol | John McDonald
Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources

One of the key features of signed discourse is the geometric placements of gestural units in signing space. Signers use the geometry of signing space to describe the placements and forms of objects and also use it to contrast participants or locales in a story. Depending on the specific functions of the placement in the discourse, features such as geometric precision, gaze redirection and timing will all differ. A signing avatar must capture these differences to sign such discourse naturally. This paper builds on prior work that animated geometric depictions to enable a signing avatar to more naturally use signing space for opposing participants and concepts in discourse. Building from a structured linguistic description of a signed newscast, the system automatically synthesizes animation that correctly utilizes signing space to lay out the opposing locales in the report. The efficacy of the approach is demonstrated through comparisons of the avatar’s motion with the source signing.

Two New AZee Production Rules Refining Multiplicity in French Sign Language
Emmanuella Martinod | Claire Danet | Michael Filhol
Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources

This paper is a contribution to sign language (SL) modeling. We focus on the hitherto imprecise notion of “Multiplicity”, assumed to express plurality in French Sign Language (LSF), using the AZee approach. AZee is a linguistic and formal approach to modeling LSF. It takes into account the linguistic properties and specificities of LSF while respecting constraints linked to a modeling process. We present the methodology to extract AZee production rules. Based on the analysis of strong form-meaning associations in SL data (elicited image descriptions and short news), we identified two production rules structuring the expression of multiplicity in LSF. We explain how these newly extracted production rules are different from existing ones. Our goal is to refine the AZee approach to allow the coverage of a growing part of LSF. This work could lead to an improvement in SL synthesis and SL automatic translation.

Multi-track Bottom-Up Synthesis from Non-Flattened AZee Scores
Paritosh Sharma | Michael Filhol
Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives

We present an algorithm to improve the pre-existing bottom-up animation system for AZee descriptions to synthesize sign language utterances. Our algorithm allows us to synthesize AZee descriptions by preserving the dynamics of underlying blocks. This bottom-up approach aims to deliver procedurally generated animations capable of generating any sign language utterance if an equivalent AZee description exists. The proposed algorithm is built upon the modules of an open-source animation toolkit and takes advantage of the integrated inverse kinematics solver and a non-linear editor.

2021

The Myth of Signing Avatars
John C. McDonald | Rosalee Wolfe | Eleni Efthimiou | Evita Fontinea | Frankie Picron | Davy Van Landuyt | Tina Sioen | Annelies Braffort | Michael Filhol | Sarah Ebling | Thomas Hanke | Verena Krausneker
Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL)

Development of automatic translation between signed and spoken languages has lagged behind the development of automatic translation between spoken languages, but it is a common misperception that extending machine translation techniques to include signed languages should be a straightforward process. A contributing factor is the lack of an acceptable method for displaying sign language apart from interpreters on video. This position paper examines the challenges of displaying a signed language as a target in automatic translation, analyses the underlying causes and suggests strategies to develop display technologies that are acceptable to sign language communities.

2020

Alignment Data base for a Sign Language Concordancer
Marion Kaczmarek | Michael Filhol
Proceedings of the Twelfth Language Resources and Evaluation Conference

This article deals with elaborating a data base of alignments of parallel French-LSF segments. This data base is meant to be searched using a concordancer which we are also designing. We wish to equip Sign Language translators with tools similar to those used in text-to-text translation. To do so, we need language resources to feed them. Already existing Sign Language corpora can be found, but do not match our needs: working around a Sign Language concordancer, the corpus must be a parallel one and provide various examples of vocabulary and grammatical construction. We started with a parallel corpus of 40 short news and 120 SL videos, which we aligned manually by segments of various lengths. We describe the methodology we used and how we define our segments and alignments. The last part concerns how we hope to allow the data base to keep growing in the near future.

Elicitation and Corpus of Spontaneous Sign Language Discourse Representation Diagrams
Michael Filhol
Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives

While Sign Languages have no standard written form, many signers do capture their language in some spontaneous graphical form. We list a few use cases (discourse preparation, deverbalising for translation, etc.) and give examples of diagrams. After hypothesising that they contain regular patterns of significant value, we propose to build a corpus of such productions. The main contribution of this paper is the specification of the elicitation protocol, explaining the variables that are likely to affect the diagrams collected. We conclude with a report on the current state of a collection following this protocol, and a few observations on the collected contents. A first prospect is the standardisation of a scheme to represent SL discourse in a way that would make such representations sharable. A subsequent longer-term prospect is for this scheme to be owned by users and with time be shaped into a script for their language.

The Synthesis of Complex Shape Deployments in Sign Language
Michael Filhol | John C. McDonald
Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives

Proform constructs such as classifier predicates and size and shape specifiers are essential elements of Sign Language communication, but have remained a challenge for synthesis due to their highly variable nature. In contrast to frozen signs, which may be pre-animated or recorded, their variability necessitates a new approach both to their linguistic description and to their synthesis in animation. Though the specification and animation of classifier predicates was covered in previous works, size and shape specifiers have to this date remained unaddressed. This paper presents an efficient method for linguistically describing such specifiers using a small number of rules that cover a large range of possible constructs. It continues to show that with a small number of services in a signing avatar, these descriptions can be synthesized in a natural way that captures the essential gestural actions while also including the subtleties of human motion that make the signing legible.

Use Cases for a Sign Language Concordancer
Marion Kaczmarek | Michael Filhol
Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives

This article discusses a Sign Language concordancer. In the past years, the need for content translated into Sign Language has been growing, and is still growing nowadays. Yet, unlike their text-to-text counterparts, Sign Language translators are not equipped with computer-assisted translation software. As we aim to provide them with such software, we explore the possibilities offered by a first tool: a Sign Language concordancer. It includes designing an alignments database as well as a search function to browse it. Testing sessions with professionals highlight relevant use cases for their professional practices. It can either reassure the translator when the results are identical, or show the importance of context when the results are different for the same expression. This concordancer is available online, and aims to be a collaborative tool. Though our current database is small, we hope that translators will get involved and help us keep it expanding.

2018

Elicitation protocol and material for a corpus of long prepared monologues in Sign Language
Michael Filhol | Mohamed Nassime Hadjadj
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Modeling French Sign Language: a proposal for a semantically compositional system
Mohamed Nassime Hadjadj | Michael Filhol | Annelies Braffort
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Description de la juxtaposition en Langue des Signes Française à partir d’une grammaire récursive (The present communication tackles formal grammar development of French Sign Language (LSF))
Mohamed Nassime Hadjadj | Michael Filhol
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 2 : TALN (Posters)

This paper is part of the development of a formal grammar for French Sign Language (LSF). Automatically generating LSF utterances requires defining production rules to synchronise the various body articulators, signs, movements, etc. The first part of this article presents our methodology for defining production rules from a corpus study. The second part presents our study of two production rules for juxtaposing certain types of structures in LSF. We close with a discussion of the nature and contribution of our approach compared with existing approaches.

2014

Non-linear recursive grammar for Sign languages (Grammaire récursive non linéaire pour les langues des signes) [in French]
Michael Filhol
Proceedings of TALN 2014 (Volume 2: Short Papers)

2012

Semi-Automatic Sign Language Corpora Annotation using Lexical Representations of Signs
Matilde Gonzalez | Michael Filhol | Christophe Collet
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Much current research focuses on the automatic recognition of sign language. High recognition rates are achieved using a lot of training data. This data is generally collected by manually annotating SL video corpora. However, this is time-consuming and the results depend on the annotators' knowledge. In this work we intend to assist annotation in terms of glosses, which consists of writing down the meaning sign for sign, by means of automatic video processing techniques. In this case using learning data is not suitable, since the corpus would first need to be annotated manually. Also, the context dependency of signs and the co-articulation effect in continuous SL make the collection of learning data very difficult. Here we present a novel approach which uses lexical representations of signs to overcome these problems and image processing techniques to match sign performances to sign representations. Signs are described using Zebedee (ZBD), a sign descriptor that accounts for the high variability of signs. A ZBD database is used to store signs and can be queried using several characteristics. From a video corpus sequence, features are extracted using a robust body part tracking approach and a semi-automatic sign segmentation algorithm. Evaluation has shown the performance and limitations of the proposed approach.

Méthodologie d’exploration de corpus et de formalisation de règles grammaticales pour les langues des signes (Methodology for corpus exploration and grammatical rule building in Sign Language) [in French]
Michael Filhol | Annelies Braffort
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 2: TALN

2010

Sign Language Corpora for Analysis, Processing and Evaluation
Annelies Braffort | Laurence Bolot | Emilie Chételat-Pelé | Annick Choisier | Maxime Delorme | Michael Filhol | Jérémie Segouat | Cyril Verrecchia | Flora Badin | Nadège Devos
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Sign Languages (SLs) are the visuo-gestural languages practised by the deaf communities. Research on SLs requires building, analysing and using corpora. The aim of this paper is to present various kinds of new uses of SL corpora. The way data are used takes advantage of the new capabilities of annotation software for visualisation, numerical annotation, and processing. The nature of the data can be video-based or motion capture-based. The aims of the studies include language analysis, animation processing, and evaluation. We describe here some of LIMSI’s studies, and some studies from other laboratories as examples.

Traitement automatique des langues des signes : le projet Dicta-Sign, des corpus aux applications [Sign Language Processing: the Dicta-Sign Project, from Corpora to Applications]
Annelies Braffort | Michael Filhol | Jérémie Segouat
Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Démonstrations

This article presents Dicta-Sign, a research project on Sign Language (SL) processing that addresses a wide range of research questions: corpus linguistics, linguistic modelling, automatic recognition and generation. The project aims to build three prototype applications for deaf users: an SL-to-SL term translator, a search-by-example tool and an SL Wiki. To this end, four comparable corpora of five hours of dialogue will be produced and analysed. Significant advances are also expected in annotation tools. Within the project, LIMSI is in charge of developing the linguistic models and contributes to the corpus and automatic generation work. We propose to illustrate the progress of Dicta-Sign with videos taken from the corpus and demonstrations of the tools for processing and for generating virtual signer animations.

2007

Description lexicale des signes — Intérêts linguistiques d’un modèle géométrique à dépendances [Lexical Description of Signs — Linguistic Benefits of a Geometric Dependency Model]
Michael Filhol | Annelies Braffort
Traitement Automatique des Langues, Volume 48, Numéro 3 : Modélisation et traitement des langues des signes [Sign Language Modelling and Processing]

2006

Une approche géométrique pour la modélisation des lexiques en langues signées [A Geometric Approach to Modelling Lexicons in Signed Languages]
Michael Filhol
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. REncontres jeunes Chercheurs en Informatique pour le Traitement Automatique des Langues (Posters)

The context is a platform for the automatic generation of signed language utterances, performed by a 3D avatar. A few such systems exist today, for example the VisiCast project (Hanke, 2002). We revisit here the description systems used for the gestural units involved in utterances, which rely on a language that is inflexible and hardly adaptive. We then propose a new approach, constructivist and geometric, aiming to make the description of signs in signed lexicons more adequate and thereby improve their integration into generated discourse.