Master TAL - MSc. NLP
Course Unit
Spoken Corpora
UE
702
EC
2
Hours
30h
Course Description
This course introduces the signals of oral communication that can be recorded: sound, images and geometry of the face and vocal tract, aerodynamic parameters, and gestures. It also introduces the technologies used to record them (microphone, MRI, ultrasound, electromagnetic articulography), along with the technical and ethical constraints that these collection methods impose. The second part of the course covers the design of annotated corpora, including the automatic tools used for annotation. The last part is devoted to annotation software and corpus management, which are essential for making proper use of a corpus once it has been annotated.
Learning Outcomes
- Knowledge of the specific features of spoken corpora
- Design of the content of spoken corpora
- Annotation of spoken corpora
Prerequisites
- The first-semester courses of the master's programme have no prerequisites other than those defined for the specialisation
Targeted Skills
- Ability to collect, structure, and represent data (sound, text, images, …)
- Ability to combine and use interdisciplinary skills and know-how to create innovative solutions
More Information
Bibliography
- To be completed
Course URL – Arche
- To be completed
Link with other courses
- 702-EC2, 803 and 902-EC2
Evaluation procedures
Number of Tests
- 2
Nature of the tests
- labs
- final exam
Group work
- N/A
Combined with other specialisations
- No