# CAD → MIP Priority Classifier
Qwen3-0.6B fine-tuned with GRPO to map the DAC code of an Expertise France project document to one of the country's MIP (Multiannual Indicative Programme) priorities.
- Input: DAC code + project excerpt + country priority list
- Output: priority number (integer 1–N)
- Reward: composite, 0.9 × correctness + 0.1 × clean termination (possible values: 0.0, 0.1, 0.9, 1.0)
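The composite reward can be sketched as follows. The exact parsing and termination criteria are not documented, so the checks below (first integer in the output, a caller-supplied clean-termination flag) are assumptions:

```python
import re

def composite_reward(output: str, gold: int, ended_cleanly: bool) -> float:
    """Sketch of the composite reward: 0.9 * correctness + 0.1 * clean termination.

    Assumptions: correctness means the first integer in the output equals the
    gold priority number; clean termination is signalled by the caller (e.g.
    the model emitted EOS instead of hitting the token limit).
    """
    match = re.search(r"\d+", output)
    correct = match is not None and int(match.group()) == gold
    return 0.9 * float(correct) + 0.1 * float(ended_cleanly)
```

Under these assumptions the four reachable values line up with the list above: 0.0 (wrong, truncated), 0.1 (wrong, clean), 0.9 (correct, truncated), 1.0 (correct, clean).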
## Training Data
The model was trained on a synthetic dataset (JZSG/ef_training_datasets) generated with Gemini 3.0 Flash from 1 393 original labeled examples, using a two-phase pipeline:
- Variations on existing examples (paraphrasing, country/code permutations)
- New country × DAC code combinations not seen in the original data
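As an illustration of the first phase, a label-preserving priority-list permutation could look like this (field names and structure are hypothetical, not taken from the actual dataset):

```python
import random

def permute_priorities(example: dict, rng: random.Random) -> dict:
    """Phase-1 variation sketch: shuffle the country's priority list and
    relabel the gold answer so it still points at the same priority.

    The field names ("priorities", "label") are hypothetical.
    """
    order = list(range(len(example["priorities"])))
    rng.shuffle(order)
    return {
        **example,
        "priorities": [example["priorities"][i] for i in order],
        # labels are 1-based priority numbers
        "label": order.index(example["label"] - 1) + 1,
    }
```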
The synthetic dataset was split into train / test sets. Training was done on the train split; all evaluation figures below are on the held-out test split.
## Results
| Metric | Old model | New model (+ synth) |
|---|---|---|
| Overall accuracy | 72.1% | 86.2% |
| Valid rate | 96.6% | 96.8% |
| Accuracy on valid | 74.6% | 89.1% |
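The three metrics are mutually consistent if invalid outputs are scored as incorrect, in which case overall accuracy is simply the product of the valid rate and the accuracy on valid outputs. That scoring rule is an assumption, but it reproduces the reported numbers:

```python
def overall_accuracy(valid_rate: float, accuracy_on_valid: float) -> float:
    """If invalid outputs count as wrong: overall = valid_rate * accuracy_on_valid."""
    return valid_rate * accuracy_on_valid

# Old model: 0.966 * 0.746 ≈ 0.721; new model: 0.968 * 0.891 ≈ 0.862
```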
By DAC code frequency:
| Frequency bucket | Old model | New model |
|---|---|---|
| Very frequent | 80.2% | 86.4% |
| Frequent | 74.4% | 89.4% |
| Medium | 67.0% | 82.9% |
| Rare | 65.4% | 65.4% |
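The bucket boundaries are not documented; a count-based bucketing along the following lines is one plausible reading (the threshold values are hypothetical):

```python
from collections import Counter

def bucket_by_frequency(dac_codes, thresholds=(100, 30, 10)):
    """Assign each DAC code to a frequency bucket by its count in the
    training data. The thresholds here are hypothetical placeholders."""
    counts = Counter(dac_codes)
    names = ("very_frequent", "frequent", "medium")
    buckets = {}
    for code, n in counts.items():
        for name, t in zip(names, thresholds):
            if n >= t:
                buckets[code] = name
                break
        else:
            buckets[code] = "rare"
    return buckets
```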
Note on comparison: both models were evaluated on the same synthetic test split. The old model was trained on the original (non-synthetic) dataset; the new model on the synthetic train split. Gains on frequent and medium codes likely reflect the additional training data available for those codes, while the flat result on rare codes is expected: little synthetic data was generated for them.
## Training Setup
| Parameter | Value |
|---|---|
| Base model | Qwen3-0.6B |
| Method | GRPO |
| Infrastructure | Jean Zay (IDRIS) |
| Temperature | 1.2 |
Training pipelines and evaluation scripts: Pleias/EF_training (private).
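Since the training repo is private, the exact pipeline is not reproducible here. As a rough sketch only, a GRPO run with the parameters above might look like the following under TRL's `GRPOTrainer`; the reward placeholder and every hyperparameter not listed in the table are assumptions:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Synthetic training data referenced in the model card
dataset = load_dataset("JZSG/ef_training_datasets", split="train")

def reward_fn(completions, **kwargs):
    # Placeholder for the composite reward
    # (0.9 x correctness + 0.1 x clean termination).
    return [0.0 for _ in completions]

config = GRPOConfig(
    output_dir="cad-mip-grpo",
    temperature=1.2,           # from the table above
    num_generations=8,         # completions per prompt (assumed)
    max_completion_length=16,  # the answer is a single small integer (assumed)
)
trainer = GRPOTrainer(
    model="Qwen/Qwen3-0.6B",
    args=config,
    reward_funcs=reward_fn,
    train_dataset=dataset,
)
trainer.train()
```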