nspam — on-device Nostr reply-spam classifier

Scores a bundle of Nostr kind:1 reply notes from a single author and estimates whether that author is a genuine user or a reply-spam bot. Designed for mobile clients (Kotlin / Swift / JS).

Model type: LightGBM gradient-boosted trees over hashed character and word n-grams plus structural features.
Input: 1–10 recent reply notes (events with e tags) from one pubkey.
Output: calibrated probability ∈ [0, 1] that the author is a reply-spammer.
Runtime: LightGBM4j (JNI) or ONNX Runtime on device.
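
A minimal sketch of how a client might assemble the input bundle described above. The event shape follows NIP-01. The helper name `buildReplyBundle`, the choice to treat any kind:1 note with at least one `e` tag as a reply, and the "newest first" ordering are assumptions for illustration, not part of the model's published interface.

```typescript
// Minimal shape of a Nostr event (fields as defined in NIP-01).
interface NostrEvent {
  id: string;
  pubkey: string;
  created_at: number; // Unix timestamp in seconds
  kind: number;
  tags: string[][];
  content: string;
}

// Upper bound taken from the "Input" line above.
const MAX_NOTES = 10;

// Hypothetical helper: select the classifier input bundle for one author.
function buildReplyBundle(events: NostrEvent[], pubkey: string): NostrEvent[] {
  return events
    // Keep only kind:1 notes written by the requested author.
    .filter((ev) => ev.pubkey === pubkey && ev.kind === 1)
    // Assumption: a reply is any note carrying at least one "e" tag.
    .filter((ev) => ev.tags.some((tag) => tag[0] === "e"))
    // Newest first, then keep at most MAX_NOTES.
    .sort((a, b) => b.created_at - a.created_at)
    .slice(0, MAX_NOTES);
}
```

The function returns an empty array when the author has no replies in the supplied events, so callers can skip scoring in that case.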

Intended use

Client-side filtering and ranking of replies in Nostr apps. Use the score to deprioritize or hide likely spam replies. Combine it with user mutes and the follow graph rather than hard-blocking on a single score.
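
One way to combine the score with mutes and the follow graph is sketched below. The type names, the function `decideReplyAction`, and both thresholds are hypothetical; the card does not recommend specific cutoffs, so treat the numbers as placeholders to tune against your own traffic.

```typescript
type ReplyAction = "show" | "deprioritize" | "hide";

interface AuthorContext {
  spamScore: number;       // classifier output in [0, 1]
  mutedByUser: boolean;    // the user has muted this author
  followedByUser: boolean; // the author is in the user's follow list
}

// Hypothetical cutoffs, not values published with the model.
const HIDE_THRESHOLD = 0.9;
const DEPRIORITIZE_THRESHOLD = 0.6;

function decideReplyAction(ctx: AuthorContext): ReplyAction {
  // An explicit mute always wins, regardless of score.
  if (ctx.mutedByUser) return "hide";
  // Authors the user follows are never hidden on score alone.
  if (ctx.followedByUser) return "show";
  if (ctx.spamScore >= HIDE_THRESHOLD) return "hide";
  if (ctx.spamScore >= DEPRIORITIZE_THRESHOLD) return "deprioritize";
  return "show";
}
```

Checking the mute and follow signals before the score keeps the classifier advisory: it only decides the outcome for authors the user has expressed no opinion about.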

Holdout metrics (v2.2)

metric	value
average precision	0.9666
ROC AUC	0.9800
precision @ recall 0.9	0.9270
per-author accuracy	0.8780
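
For readers unfamiliar with "precision @ recall 0.9": it is the best precision achievable at any score threshold that still catches at least 90% of spammers. The sketch below shows one common way to compute it. The function name and the toy data are made up for illustration; they are not drawn from the holdout set, and the card does not state exactly how its figure was computed.

```typescript
interface ScoredAuthor {
  score: number;   // classifier output in [0, 1]
  isSpam: boolean; // ground-truth label
}

// Best precision over all thresholds whose recall is at least targetRecall.
function precisionAtRecall(samples: ScoredAuthor[], targetRecall: number): number {
  const totalPositives = samples.filter((s) => s.isSpam).length;
  if (totalPositives === 0) return NaN; // recall is undefined without positives

  // Sort a copy by descending score; each prefix corresponds to one threshold.
  const sorted = [...samples].sort((a, b) => b.score - a.score);
  let truePositives = 0;
  let best = 0;

  sorted.forEach((s, i) => {
    if (s.isSpam) truePositives++;
    // Only evaluate at real threshold boundaries, not inside runs of tied scores.
    const next = sorted[i + 1];
    if (next !== undefined && next.score === s.score) return;
    const recall = truePositives / totalPositives;
    const precision = truePositives / (i + 1);
    if (recall >= targetRecall) best = Math.max(best, precision);
  });
  return best;
}

// Toy data (hypothetical): three spammers, two genuine authors.
const toy: ScoredAuthor[] = [
  { score: 0.9, isSpam: true },
  { score: 0.8, isSpam: true },
  { score: 0.7, isSpam: false },
  { score: 0.6, isSpam: true },
  { score: 0.2, isSpam: false },
];
console.log(precisionAtRecall(toy, 0.9)); // 0.75
```

In the toy data, reaching recall of at least 0.9 requires flagging all three spammers, which also flags one genuine author, so precision is 3/4.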

Limitations

Reply-focused. Trained on reply notes only — not designed for scoring feed posts or other event kinds.
English-heavy training data. Non-Latin scripts underrepresented.
Adversarial drift. Spammers evolve. Retrain periodically.
Cold start. Accounts with fewer than 3 replies provide limited signal.

License

MIT.
