# KovaMind Emotion v2

A 10-class emotion classifier fine-tuned for conversational AI memory systems, using an Inside Out 2 inspired taxonomy. v2 extends `Kova-Mind/emotion-v1` with targeted training on pipeline-style bracket-tag fragments (`X [positive]`, `X [negative]`, `X [neutral]`) and entity fragments with embedded sentiment.
## What v2 fixes

v1 shipped with excellent accuracy on natural conversational sentences, but two real-world failure modes appeared in production:
- Bracket-tag blindness. Fragments like `hiking [positive] → Weekends activity.` were consistently labeled `neutral` instead of `joy`. The `[positive]`/`[negative]` bracket is a Kova pipeline metadata tag the model had never seen in training.
- Entity fragments with embedded sentiment. Fragments like `lake (place) → Location where Rex loves swimming.` were labeled `neutral` because v1 read only the nominal text and missed the embedded positive signal.
v2 is trained on 706 additional Opus-labeled Kova-style examples that target these gaps.
## Head-to-head vs v1

Evaluated against Claude Opus as a fragment-only oracle: Opus reads each fragment in isolation and labels what it expresses.
| Benchmark | n | v1 | v2 |
|---|---|---|---|
| 21 production bracket-tag patterns | 21 | 17/21 = 81.0% | 21/21 = 100.0% |
| 17 adversarial sanity cases | 17 | 14/17 = 82.4% | 15/17 = 88.2% |
| 50 recent production patterns | 50 | 42/50 = 84.0% | 40/50 = 80.0% |
| Combined | 88 | 73/88 = 83.0% | 76/88 = 86.4% |
- +3.4 pts combined accuracy
- +19 pts on bracket-tag patterns (the target gap)
- Zero regressions on the 21 bracket patterns
- The v2 dip on the 50-pattern set comes from 2 cases where v2 over-called `joy` on entity fragments with incidental third-party verbs; both v1 and v2 stayed within ±5 pts noise of the Opus oracle scoring.
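The combined row is a straight pooled count over the three benchmarks; a quick sanity check of the arithmetic:

```python
# Per-benchmark correct counts from the table above: (v1 correct, v2 correct, n).
results = {
    "bracket-tag": (17, 21, 21),
    "adversarial": (14, 15, 17),
    "production": (42, 40, 50),
}

v1_correct = sum(v1 for v1, _, _ in results.values())  # 73
v2_correct = sum(v2 for _, v2, _ in results.values())  # 76
total = sum(n for _, _, n in results.values())         # 88

print(f"v1: {v1_correct}/{total} = {100 * v1_correct / total:.1f}%")  # 83.0%
print(f"v2: {v2_correct}/{total} = {100 * v2_correct / total:.1f}%")  # 86.4%
```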
## Latency

Inference latency is unchanged from v1: ~5 ms per call on an A40 GPU (after warmup), well under the 10 ms production ceiling.
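The ~5 ms figure can be reproduced with a simple warmup-then-average harness. This sketch times a stand-in workload; a real measurement would pass the model's forward call as `fn`:

```python
import time

def mean_latency_ms(fn, warmup=10, iters=100):
    """Average wall-clock latency per call, after a warmup phase."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) * 1000 / iters

# Stand-in workload; replace with a model(**inputs) forward pass under torch.no_grad().
print(f"{mean_latency_ms(lambda: sum(range(10_000))):.3f} ms/call")
```

The warmup phase matters on GPU: the first few calls pay one-time costs (CUDA context, kernel compilation) that would otherwise skew the average.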
## Classes

The 10-class Inside Out 2 taxonomy:
| ID | Label |
|---|---|
| 0 | joy |
| 1 | sadness |
| 2 | anger |
| 3 | fear |
| 4 | disgust |
| 5 | anxiety |
| 6 | embarrassment |
| 7 | envy |
| 8 | ennui |
| 9 | neutral |
Same label IDs as v1, so v2 is a drop-in replacement for any code loading `Kova-Mind/emotion-v1`.
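For convenience, the table above as a plain-Python mapping (matching what `model.config.id2label` exposes at inference time):

```python
# ID <-> label mapping from the class table; identical to v1, so stored
# predictions from either version remain directly comparable.
ID2LABEL = {
    0: "joy", 1: "sadness", 2: "anger", 3: "fear", 4: "disgust",
    5: "anxiety", 6: "embarrassment", 7: "envy", 8: "ennui", 9: "neutral",
}
LABEL2ID = {label: i for i, label in ID2LABEL.items()}

print(LABEL2ID["neutral"])  # 9
```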
## Training Methodology
Same recipe as v1, with expanded domain data:
- Base model: `SamLowe/roberta-base-go_emotions`
- Head: fresh 10-class classification head (random init, 28→10 reinit from base)
- Domain data (706 Opus-labeled examples, 3× weight):
  - 21 real production bracket-tag patterns labeled by Opus
  - 500 synthetic bracket-tag examples across all 10 emotions
  - 185 natural conversational sentences (regression protection for general text)
- Background data: Google GoEmotions (`simplified` config), 28 labels remapped to the 10 Inside Out classes (rarest-label-wins tie-break)
- Balancing: max 3000 samples/class, min 200 samples/class
- Hyperparameters: 4 epochs, batch 32, learning rate 2e-5, AdamW, FP16, seed 42, warmup ratio 0.1, weight decay 0.01
- Best checkpoint: epoch 2 (auto-selected via `load_best_model_at_end` on macro F1)
- Hardware: 1× NVIDIA A40 (48 GB)
- Wall time: ~3 min 18 s of pure training
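The rarest-label-wins tie-break can be sketched as follows. The full 28→10 remap table is not reproduced in this card, so the mapping entries below are illustrative assumptions; what the sketch shows is the tie-break itself:

```python
from collections import Counter

# Illustrative subset of a GoEmotions -> Inside Out 2 remap (assumed entries).
GO_TO_KOVA = {
    "joy": "joy", "amusement": "joy",
    "sadness": "sadness", "grief": "sadness",
    "anger": "anger", "annoyance": "anger",
    "fear": "fear", "nervousness": "anxiety",
    "disgust": "disgust", "embarrassment": "embarrassment",
    "neutral": "neutral",
}

def remap_single_label(go_labels, target_counts):
    """Collapse a (possibly multi-label) GoEmotions row to one of the 10
    classes, breaking ties by keeping the rarest target label in the corpus."""
    targets = {GO_TO_KOVA[l] for l in go_labels if l in GO_TO_KOVA}
    if not targets:
        return "neutral"
    return min(targets, key=lambda t: target_counts[t])

counts = Counter({"joy": 5000, "sadness": 800, "anger": 1200, "neutral": 9000})
print(remap_single_label(["joy", "grief"], counts))  # sadness (rarer than joy)
```

Preferring the rarest label keeps low-frequency classes from being swallowed by `joy` and `neutral` during the single-label collapse.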
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("Kova-Mind/emotion-v2")
model = AutoModelForSequenceClassification.from_pretrained("Kova-Mind/emotion-v2")
model.eval()

text = "croissants [positive] → User A loves them exclusively."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=256)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)
pred_id = torch.argmax(probs, dim=-1).item()
print(f"{model.config.id2label[pred_id]} ({probs[0, pred_id]:.3f})")
# → joy (0.97)
```
Or with the pipeline API:
```python
from transformers import pipeline

clf = pipeline("text-classification", model="Kova-Mind/emotion-v2", top_k=None)
print(clf("hiking [positive] → Weekends activity."))
# → joy (top class)
```
## Limitations

- Primarily English. A small number of non-English bracket patterns (Arabic, Chinese) were included as training data, but broader multilingual coverage is out of scope.
- Conversational AI context. Optimized for agent-memory ingestion of natural user dialog. May underperform on social-media, legal, or heavily-stylized text.
- 10-class granularity. Does not capture finer emotional shades (e.g. admiration vs pride, grief vs sadness, amusement vs joy). Those get absorbed into the closest bucket.
- Single-label output. Trained with single-label resolution (rarest-wins tie-break). For multi-label emotion, a different head is needed.
- Entity-fragment sentiment. v2 occasionally over-calls `joy` on factual third-party fragments with incidental positive verbs (e.g. "Alex (person) → Promoted to VP at Google"). v3 will target this with a curated third-party-factual training set.
- Weakly supervised `envy` class. GoEmotions does not contain an envy label; v2 sees only ~37 domain envy examples, so production coverage is limited.
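On the single-label point above: v2's head pairs softmax with cross-entropy, so exactly one class wins. A multi-label variant would instead apply a per-class sigmoid with binary cross-entropy, letting several emotions fire at once. A minimal contrast on toy logits (not the model's actual training code):

```python
import torch
import torch.nn as nn

logits = torch.tensor([[2.0, -1.0, 0.5]])  # toy 3-class logits

# Single-label (v2's setup): cross-entropy over softmax, one winning class.
ce_loss = nn.CrossEntropyLoss()(logits, torch.tensor([0]))

# Multi-label alternative: independent per-class sigmoid + BCE targets.
bce_loss = nn.BCEWithLogitsLoss()(logits, torch.tensor([[1.0, 0.0, 1.0]]))

multi_preds = (torch.sigmoid(logits) > 0.5).int()
print(multi_preds.tolist())  # [[1, 0, 1]]: two classes fire at once
```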
## Changelog from v1
- +500 synthetic bracket-tag training examples (all 10 emotions)
- +21 real production bracket-tag patterns (Opus-labeled)
- +185 natural-sentence regression examples (Kova-style conversational text)
- Same base model, same 10-class taxonomy, same hyperparameters, same label IDs.
## Citation

```bibtex
@misc{capo2026kovamindemotionv2,
  author       = {Capo, Alejandro},
  title        = {KovaMind Emotion v2: Bracket-Tag-Aware Emotion Classification for Conversational Memory Pipelines},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Kova-Mind/emotion-v2}}
}
```
## Built By

kovamind.io: private, customer-owned AI memory infrastructure.