CAD → MIP Priority Classifier

Qwen3-0.6B fine-tuned with GRPO to map DAC codes to MIP (Multiannual Indicative Programme) country priorities in Expertise France project documents.

Input: DAC code + project excerpt + country priority list
Output: Priority number (integer 1–N)
Reward: composite, 0.9 × correctness + 0.1 × clean termination (possible values: 0.0, 0.1, 0.9, 1.0)
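A minimal sketch of how this composite reward could be computed. The parser and the clean-termination check (`parse_priority`, the `eos_token` default) are illustrative assumptions, not the project's actual implementation:

```python
import re

def parse_priority(completion: str, n_priorities: int):
    """Extract a predicted priority number in 1..N from the model output.

    Returns None when no in-range integer is found (hypothetical parser).
    """
    match = re.search(r"\b(\d+)\b", completion)
    if match is None:
        return None
    value = int(match.group(1))
    return value if 1 <= value <= n_priorities else None

def reward(completion: str, label: int, n_priorities: int,
           eos_token: str = "<|im_end|>") -> float:
    """Composite reward: 0.9 * correctness + 0.1 * clean termination."""
    correct = parse_priority(completion, n_priorities) == label
    clean = completion.rstrip().endswith(eos_token)  # assumed termination check
    return 0.9 * correct + 0.1 * clean
```

This decomposition yields exactly the four possible values listed above: 0.0 (wrong and truncated), 0.1 (wrong but terminated cleanly), 0.9 (correct but truncated), 1.0 (correct and clean).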


Training Data

The model was trained on a synthetic dataset (JZSG/ef_training_datasets) generated with Gemini 3.0 Flash from 1,393 original labeled examples, using a two-phase pipeline:

  1. Variations on existing examples (paraphrasing, country/code permutations)
  2. New country × DAC code combinations not seen in the original data
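The permutation part of phase 1 can be sketched as shuffling the candidate priority list and re-indexing the gold label so it still points at the same priority. This is a hypothetical illustration of the idea, not the actual generation code:

```python
import random

def permute_priorities(priorities: list[str], label: int, seed: int = 0):
    """Create a training variation by shuffling the priority list and
    re-indexing the 1-based gold label to follow its priority
    (hypothetical sketch of the permutation step)."""
    rng = random.Random(seed)
    order = list(range(len(priorities)))
    rng.shuffle(order)
    shuffled = [priorities[i] for i in order]
    new_label = order.index(label - 1) + 1  # labels are 1-based
    return shuffled, new_label
```

The key invariant is that `shuffled[new_label - 1]` is the same priority the original label pointed at, so the relabeled example stays correct.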

The synthetic dataset was split into train / test sets. Training was done on the train split; all evaluation figures below are on the held-out test split.


Results

Model comparison

| Metric            | Old model | New model (+ synth) |
|-------------------|-----------|---------------------|
| Overall accuracy  | 72.1%     | 86.2%               |
| Valid rate        | 96.6%     | 96.8%               |
| Accuracy on valid | 74.6%     | 89.1%               |
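The three headline metrics are related: overall accuracy is roughly the product of valid rate and accuracy on valid (96.8% × 89.1% ≈ 86.2%). A sketch of how they could be computed, assuming unparseable outputs are represented as None:

```python
def classification_metrics(preds, labels):
    """Compute the three headline metrics from parsed predictions,
    where an invalid (unparseable) output is represented as None."""
    total = len(preds)
    valid = [(p, y) for p, y in zip(preds, labels) if p is not None]
    correct = sum(p == y for p, y in valid)
    return {
        "overall_accuracy": correct / total,
        "valid_rate": len(valid) / total,
        "accuracy_on_valid": correct / len(valid) if valid else 0.0,
    }
```

Note that an invalid prediction counts as wrong for overall accuracy but is excluded from accuracy on valid.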

By DAC code frequency:

| Frequency bucket | Old model | New model |
|------------------|-----------|-----------|
| Very frequent    | 80.2%     | 86.4%     |
| Frequent         | 74.4%     | 89.4%     |
| Medium           | 67.0%     | 82.9%     |
| Rare             | 65.4%     | 65.4%     |

Note on comparison: both models are evaluated on the same synthetic test split. The old model was trained on the original (non-synthetic) dataset; the new model was trained on the synthetic train split. Gains on frequent and medium codes are likely due to the additional training data for those codes, while the flat result on rare codes is expected, since little synthetic data was generated for them.


Training Setup

| Parameter      | Value            |
|----------------|------------------|
| Base model     | Qwen3-0.6B       |
| Method         | GRPO             |
| Infrastructure | Jean Zay (IDRIS) |
| Temperature    | 1.2              |
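As a rough illustration, a GRPO run with this setup might look like the following sketch using TRL's `GRPOTrainer`. Only the base model, dataset name, and temperature come from this card; every other hyperparameter and the inline reward are placeholder assumptions, and the real pipeline (Pleias/EF_training) is private:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Sketch only: hyperparameters other than the sampling temperature (1.2)
# are illustrative assumptions, not the values used for this model.
dataset = load_dataset("JZSG/ef_training_datasets", split="train")

def reward_fn(completions, label, **kwargs):
    # Placeholder for the composite reward described above
    # (0.9 * correctness + 0.1 * clean termination).
    return [0.9 * float(c.strip() == str(l)) + 0.1
            for c, l in zip(completions, label)]

config = GRPOConfig(
    output_dir="ef_qwen3_0.6B_cad",
    temperature=1.2,           # value from the table above
    num_generations=8,         # assumption
    max_completion_length=16,  # assumption: the answer is a single integer
)

trainer = GRPOTrainer(
    model="Qwen/Qwen3-0.6B",
    reward_funcs=reward_fn,
    args=config,
    train_dataset=dataset,
)
trainer.train()
```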

Training pipelines and evaluation scripts: Pleias/EF_training (private).
