# OSMoSIS Cross-Encoder OLL Classifier

A five-class response sufficiency classifier using DeBERTa-v3 as a cross-encoder with Ordinal Log-Loss (OLL). It takes (objective, response) pairs as sentence-pair input packed into a single sequence, giving direct cross-attention between the two texts.
OLL replaces standard cross-entropy with a distance-weighted soft target distribution that penalizes predictions proportionally to their ordinal distance from the true label, specifically targeting the NOADDR subtype boundary confusion.
## Performance
| Evaluation | Accuracy | Macro Precision | Macro F1 |
|---|---|---|---|
| Yahoo (full test set) | 57.8% | 47.4% | 45.2% |
| Triage held-out | 99.7% | 99.7% | 99.7% |
### Per-Class F1 (Yahoo, full test set)
| Class | Precision | Recall | F1 |
|---|---|---|---|
| ADDR_DIRECT | 60.3% | 65.7% | 62.9% |
| ADDR_PARTIAL | 44.7% | 58.5% | 50.6% |
| NOADDR_ON | 65.5% | 57.7% | 61.3% |
| NOADDR_TANGENTIAL | 43.5% | 31.3% | 36.4% |
| NOADDR_OFF | 22.8% | 11.0% | 14.9% |
### Per-Class F1 (Triage held-out)
| Class | Precision | Recall | F1 |
|---|---|---|---|
| ADDR_DIRECT | 100.0% | 98.7% | 99.3% |
| ADDR_PARTIAL | 98.7% | 100.0% | 99.3% |
| NOADDR_ON | 100.0% | 100.0% | 100.0% |
| NOADDR_TANGENTIAL | 100.0% | 100.0% | 100.0% |
| NOADDR_OFF | 100.0% | 100.0% | 100.0% |
## Progression Across Approaches
| Model | Yahoo Acc | Yahoo Macro F1 | Triage Acc |
|---|---|---|---|
| RepProbe linear (L16) | 39.4% | ~31.0% | 23.7% |
| SetFit MiniLM joint | 44.2% | 35.7% | 86.9% |
| SetFit ModernBERT joint | 51.9% | 41.3% | 94.7% |
| SLM LoRA (Qwen2.5-1.5B) | 52.5% | 30.4% | 97.6% |
| Cross-encoder CE (Exp 4.2) | 57.9% | 42.7% | 99.2% |
| This model (OLL, Exp 5.1) | 57.8% | 45.2% | 99.7% |
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("KingTechnician/osmosis-crossencoder-oll")
tokenizer = AutoTokenizer.from_pretrained("KingTechnician/osmosis-crossencoder-oll")

labels = ["ADDR_DIRECT", "ADDR_PARTIAL", "NOADDR_ON", "NOADDR_TANGENTIAL", "NOADDR_OFF"]

objective = "What causes rain?"
response = "Rain forms when water vapor in the atmosphere condenses into droplets."

# The tokenizer packs both texts into one sequence for the cross-encoder
inputs = tokenizer(objective, response, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
prediction = logits.argmax(dim=-1).item()
print(f"Prediction: {labels[prediction]}")
# Output: Prediction: ADDR_DIRECT
```
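Beyond the top-1 label, the softmax over the five logits yields a full sufficiency distribution, which can be useful for thresholding or flagging borderline pairs. The logit values below are illustrative stand-ins for `model(**inputs).logits`, not real model output:

```python
import torch

labels = ["ADDR_DIRECT", "ADDR_PARTIAL", "NOADDR_ON", "NOADDR_TANGENTIAL", "NOADDR_OFF"]

# Illustrative logits standing in for model(**inputs).logits on one pair
logits = torch.tensor([[3.2, 1.1, -0.4, -1.0, -2.5]])
probs = torch.softmax(logits, dim=-1)[0]
for label, p in zip(labels, probs.tolist()):
    print(f"{label}: {p:.3f}")
```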
## Architecture

This is a cross-encoder, not a bi-encoder. The model processes `[CLS] objective [SEP] response [SEP]` as a single input, allowing full self-attention between objective and response tokens.
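As a minimal illustration (plain Python, not the actual DeBERTa tokenizer), the packed single-sequence layout looks like this; a bi-encoder would instead encode each text separately and only compare pooled vectors:

```python
def pack_cross_encoder_input(objective_tokens, response_tokens):
    """Sketch of the cross-encoder's single-sequence input layout.

    Because both texts live in one sequence, every self-attention layer
    can attend across the objective/response boundary.
    """
    return ["[CLS]"] + objective_tokens + ["[SEP]"] + response_tokens + ["[SEP]"]

print(pack_cross_encoder_input(["what", "causes", "rain", "?"],
                               ["water", "vapor", "condenses"]))
# → ['[CLS]', 'what', 'causes', 'rain', '?', '[SEP]', 'water', 'vapor', 'condenses', '[SEP]']
```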
## Training
- Base model: MoritzLaurer/deberta-v3-base-zeroshot-v2.0 (NLI-pretrained)
- Data: Yahoo Answers OSMoSIS (24,890) + Triage synthetic (1,492) = 26,382 pairs
- Epochs: 4 (best from 10-epoch sweep; overfitting observed after epoch 4)
- LR: 2e-05 with linear warmup (10%)
- Loss: Ordinal Log-Loss (OLL) with class weights — replaces CE from Exp 4.2
- Ordinal ordering: ADDR_DIRECT < ADDR_PARTIAL < NOADDR_ON < NOADDR_TANGENTIAL < NOADDR_OFF
- Final train accuracy: 75.5%
## Why OLL?
Standard cross-entropy treats all misclassifications equally. OLL encodes the ordinal structure of the label space: confusing NOADDR_TANGENTIAL with NOADDR_ON incurs less penalty than confusing it with ADDR_DIRECT. This is a single loss function swap with no architectural changes from the Exp 4.2 cross-encoder.
Reference: Castagnos et al., "An Ordinal Log-Loss for Multi-Class Classification with Ordinal Labels", COLING 2022.
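The distance-weighted objective from the paper can be sketched as below. This is an illustrative PyTorch implementation, not the exact training code; `alpha` and the numerical-stability clamp are assumed defaults, and the optional `class_weights` mirrors the class weighting mentioned above:

```python
import torch
import torch.nn.functional as F

def ordinal_log_loss(logits, targets, alpha=1.0, class_weights=None):
    """Ordinal Log-Loss (OLL), after Castagnos et al. (COLING 2022).

    Penalizes the probability mass on each wrong class by its ordinal
    distance from the true label: L = -sum_i |i - y|^alpha * log(1 - p_i).
    The i = y term vanishes because its distance is zero.
    """
    num_classes = logits.size(-1)
    probs = F.softmax(logits, dim=-1)                                    # (B, C)
    classes = torch.arange(num_classes, device=logits.device)
    dist = (classes.unsqueeze(0) - targets.unsqueeze(1)).abs().float()   # (B, C)
    log_one_minus_p = torch.log((1.0 - probs).clamp_min(1e-8))           # stability clamp
    loss = -(dist.pow(alpha) * log_one_minus_p).sum(dim=-1)              # (B,)
    if class_weights is not None:
        loss = loss * class_weights[targets]
    return loss.mean()
```

Under this loss, confidently predicting NOADDR_OFF (index 4) for a true ADDR_DIRECT (index 0) costs roughly four times as much as confidently predicting the adjacent ADDR_PARTIAL (index 1), which is exactly the pressure that reduces far-boundary confusion.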
## Labels
| Label | Description |
|---|---|
| ADDR_DIRECT | Response directly and completely addresses the objective |
| ADDR_PARTIAL | Response partially addresses the objective |
| NOADDR_ON | Response is on-topic but does not address the objective |
| NOADDR_TANGENTIAL | Response is tangentially related to the objective |
| NOADDR_OFF | Response is completely off-topic |