DEFT 2019

Défi Fouille de Textes@TALN 2019

Information Extraction and Retrieval in French clinical cases


The 2019 edition of the DEFT challenge is dedicated to the analysis of clinical cases in French. This edition comprises three tasks on information retrieval and extraction. It is the first challenge in which clinical texts in French are to be processed.

What are clinical cases?
Clinical cases describe clinical situations of patients, real or fictitious. The cases are published in various sources (scientific, didactic, associative, legal...). They are de-identified. Their purpose is to present situations that are typical (as in didactic sources) or rare (as in scientific sources).

Global information on the corpus
The corpus used in this challenge is part of a larger corpus of clinical cases, which has more complete annotations and associated information [1]. For DEFT 2019, the organizers focused on clinical cases associated with keywords and discussions. These clinical cases are related to various medical specialties (cardiology, urology, oncology, obstetrics, pulmonology, gastro-enterology...). They have been published in different French-speaking countries (France, Belgium, Switzerland, Canada, African countries, tropical countries...).
The reference data were obtained by consensus from two independent annotations.

[1] N Grabar, V Claveau, C Dalloux. CAS: French Corpus with Clinical Cases. LOUHI 2018, p. 1-7

Access to data
Access to data is possible only after the user agreement is signed by all the team members. Participants may take part in one or more tasks. By obtaining the data, participants commit to submitting results for at least one task.

Task Description

The proposed tasks are:

  1. Task 1: Indexing of clinical cases
    • Purpose: to identify, in the list of keywords, the keywords corresponding to a given clinical case/discussion pair
    • Input: clinical case/discussion pairs, the expected number of keywords for each pair, and the whole set of keywords
    • Output: pairing of keywords with clinical case/discussion pairs
    • Remarks: a given keyword may be associated with several clinical case/discussion pairs. Some keywords from the whole set (training and test) are not associated with any clinical case/discussion pair. The keywords are defined and chosen by the authors of the clinical cases
    • Evaluation: the main evaluation measure is Mean Average Precision (MAP). The second evaluation measure is Prec@N (precision at rank N), where N corresponds to the number of expected keywords. The organizers will normalize the keywords (inflection, affixation) to allow a better comparison and evaluation.

  2. Task 2: Semantic similarity between clinical cases and discussions
    • Purpose: to pair a given clinical case with the corresponding discussion
    • Input: a set of clinical cases, a set of discussions
    • Output: pairing between clinical cases and discussions
    • Remarks: one discussion may be associated with more than one clinical case
    • Evaluation: boolean

  3. Task 3: Information extraction
    • Purpose: to detect, in clinical cases, demographic and clinical information
      Four types of information are targeted:
      • the age of the patient concerned by the case, at the time of the last clinical event described, normalized to an integer (e.g., 0 for babies younger than 1 year, 1 for babies between 1 and 2 years, 20 for patients described as being in their twenties, etc.).
      • the gender of the patient concerned by the case. Two values are possible: female, male (there are no other possibilities).
      • the reason for the appointment or hospitalization, for the last clinical event. This category usually covers pathologies, signs and symptoms, and sometimes accidents. Clinical follow-up is a continuation of preceding clinical events and is not considered a reason in its own right.
      • the outcome, among five possible values: 1° recovery (the clinical problem described has been resolved and the patient has fully recovered), 2° improvement (the clinical condition has improved but full recovery cannot be concluded), 3° stable (either the condition remains stable, or it is impossible to determine whether there is an improvement or a worsening), 4° worsening (the clinical condition is getting worse), or 5° death (when the death is directly related to the clinical case).
    • Input: a set of clinical cases
    • Output: values extracted for the four types of information targeted
    • Remarks: When a document is related to several patients, the age and gender of each one must be identified (for instance, in the case of a graft from one donor given to two patients successively, the age and gender of both recipients must be identified). It is not necessary to link the age with the corresponding gender. When several ages are mentioned for a given patient (the current age and the ages in their medical history), only the age related to the clinical case described must be extracted. A few documents do not allow every category to be determined, in which case the default value NUL is to be used.
    • Evaluation: Values of age, gender and outcome will be evaluated through strict comparison (same value between the reference and extracted values). It is not required to indicate the text spans which provide these values. The reason is evaluated by comparing the extracted text span with the reference text span and computing the intersection rate between them.
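
The Task 3 evaluation described above can be illustrated with a small sketch. It is not the official scoring code: the function names are hypothetical, and since the page does not define the intersection rate precisely, the sketch assumes a Jaccard overlap between the sets of whitespace-separated tokens of the two spans.

```python
def strict_match(reference, predicted):
    """Strict comparison of two values (age, gender or outcome).

    Surrounding whitespace is ignored; this tolerance is an assumption.
    """
    return reference.strip() == predicted.strip()


def token_overlap(reference, predicted):
    """Assumed intersection rate: Jaccard overlap of lowercase token sets."""
    ref_tokens = set(reference.lower().split())
    pred_tokens = set(predicted.lower().split())
    if not ref_tokens and not pred_tokens:
        return 1.0  # two empty spans are treated as identical
    return len(ref_tokens & pred_tokens) / len(ref_tokens | pred_tokens)


# Illustrative values only; they are not taken from the corpus.
reference = {"age": "45", "gender": "female", "outcome": "recovery",
             "reason": "douleur abdominale aiguë"}
predicted = {"age": "45", "gender": "female", "outcome": "NUL",
             "reason": "douleur abdominale"}

for field in ("age", "gender", "outcome"):
    print(field, strict_match(reference[field], predicted[field]))
print("reason", round(token_overlap(reference["reason"], predicted["reason"]), 3))
```

In this example the age and gender match, the outcome does not (the system fell back to the default value NUL), and the reason shares two of three distinct tokens with the reference.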

Evaluation scripts

Archive: scripts-eval-deft2019.tar.gz
  • Task 1: python3 -i file-t1.csv -r TRAIN-T1/donnees-t1-ref.csv -b baseline.csv
  • Task 2: python3 -i file-t2.csv -r TRAIN-T2/donnees-t2-ref.csv -d TRAIN-T2/donnees-t2-disc.csv
  • Task 3: python3 -i file-t3.csv -r TRAIN-T3/donnees-t3-ref.csv
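
The scripts in the archive remain the reference implementation. For orientation only, the two Task 1 measures (MAP and Prec@N) can be sketched as follows. The function names are hypothetical, the keyword normalization performed by the organizers is not reproduced, and each ranked list is assumed to contain no duplicate keywords.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked keyword list against a reference set."""
    hits = 0
    total = 0.0
    for rank, keyword in enumerate(ranked, start=1):
        if keyword in relevant:
            hits += 1
            total += hits / rank  # precision at the rank of each correct keyword
    return total / len(relevant) if relevant else 0.0


def precision_at_n(ranked, relevant):
    """Precision over the first N keywords, N being the number of expected keywords."""
    n = len(relevant)
    if n == 0:
        return 0.0
    return sum(1 for keyword in ranked[:n] if keyword in relevant) / n


def mean_average_precision(runs):
    """Mean of the average precision over (ranked, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Illustrative keywords only; they are not taken from the corpus.
ranked = ["asthme", "toux", "fièvre"]
relevant = {"asthme", "fièvre"}
print(round(average_precision(ranked, relevant), 4))
print(precision_at_n(ranked, relevant))
```

Here the two expected keywords are found at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2, while only one of the first two ranked keywords is correct.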


  • Are the tasks constrained by the data provided, or can other data be used? Data other than those provided in the corpus may be used, with two caveats: (1) the article describing the approach and the presentation at the closing workshop must specify which external resources were used, and (2) if data come from the web, it must be ensured that the retrieved data do not correspond to the original data used to compile the corpus (see point no. 6 of the agreement: "do not search the internet for the originals of the data provided").
  • Unless a translation is specifically requested, please refer to the French FAQ

Workshop program

In a few months...

Workshop proceedings

In a few months...


Program Committee

  • Patrice BELLOT (LSIS, Aix-Marseille Université)
  • Leonardo CAMPILLOS LLANOS (LIMSI, CNRS, Université Paris-Saclay ; Madrid)
  • Natalia GRABAR (STL, CNRS, Université de Lille)
  • Cyril GROUIN (LIMSI, CNRS, Université Paris-Saclay)
  • Vincent GUIGUE (LIP6, Sorbonne Université)
  • Thierry HAMON (LIMSI, CNRS, Université Paris-Saclay ; Université Paris XIII)
  • Véronique MORICEAU (LIMSI, Université Paris-Sud, Université Paris-Saclay ; IRIT)
  • Fleur MOUGIN (Bordeaux Population Health, Université de Bordeaux)
  • Mathieu ROCHE (TETIS, CIRAD)
  • Patrick RUCH (HEG Geneva, BiTeM)
  • Frantz THIESSARD (Bordeaux Population Health, Université de Bordeaux, Inserm ; CHU de Bordeaux, SIM pôle santé publique, unité médicale Informatique et archivistique médicales)

Organization Committee

  • Natalia GRABAR (STL, CNRS, Université de Lille)
  • Cyril GROUIN (LIMSI, CNRS, Université Paris-Saclay)
  • Thierry HAMON (LIMSI, CNRS, Université Paris-Saclay ; Université Paris XIII)