nspam: on-device Nostr reply-spam classifier
Scores bundles of Nostr kind:1 reply notes from a single author as
real or bot. Designed for mobile clients (Kotlin / Swift / JS).
- Model type: LightGBM gradient boosted trees over hashed character + word n-grams and structural features.
- Input: 1–10 recent reply notes (events with `e` tags) from one pubkey.
- Output: calibrated probability ∈ [0, 1] that the author is a reply-spammer.
- Runtime: LightGBM4j (JNI) or ONNX Runtime on device.
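
As a rough illustration of the input contract above, the sketch below assembles a bundle from raw events. It is written in Python for brevity; the same logic should port directly to Kotlin, Swift, or JS. The field names (`pubkey`, `kind`, `created_at`, `tags`) follow the standard Nostr event shape. The helper names `is_reply` and `build_bundle` are hypothetical, and choosing the most recent notes by `created_at` is an assumption about what "recent" means here.

```python
from typing import Any

MAX_NOTES = 10  # upper bound on bundle size, taken from the Input line above


def is_reply(event: dict[str, Any]) -> bool:
    """Treat a kind:1 note as a reply if it carries at least one `e` tag."""
    if event.get("kind") != 1:
        return False
    return any(tag and tag[0] == "e" for tag in event.get("tags", []))


def build_bundle(events: list[dict[str, Any]], pubkey: str) -> list[dict[str, Any]]:
    """Return up to MAX_NOTES of the author's replies, newest first."""
    replies = [e for e in events if e.get("pubkey") == pubkey and is_reply(e)]
    # Assumption: "recent" means highest created_at (Unix seconds).
    replies.sort(key=lambda e: e.get("created_at", 0), reverse=True)
    return replies[:MAX_NOTES]
```

An empty bundle means the author has no replies to score, so the caller should skip classification rather than pass an empty input to the model.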
Intended use
Client-side filtering/ranking of replies in Nostr apps. Use the score to deprioritize or hide likely spam replies. Combine with user mutes and follow graph rather than hard-blocking on a single score.
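
One possible way to combine the score with mutes and the follow graph is sketched below. The `Action` enum, the `decide` function, and both thresholds are hypothetical and not part of nspam; the precedence order (mute first, then follows, then score) is one reasonable policy, not a prescribed one.

```python
from enum import Enum


class Action(Enum):
    SHOW = "show"
    DEPRIORITIZE = "deprioritize"
    HIDE = "hide"


# Hypothetical thresholds; tune them against your own traffic.
DEPRIORITIZE_AT = 0.5
HIDE_AT = 0.9


def decide(spam_score: float, author: str, muted: set[str], follows: set[str]) -> Action:
    """Pick a display action from the model score plus user-level signals."""
    if author in muted:
        return Action.HIDE  # an explicit user mute always wins
    if author in follows:
        return Action.SHOW  # followed authors are never filtered by score
    if spam_score >= HIDE_AT:
        return Action.HIDE
    if spam_score >= DEPRIORITIZE_AT:
        return Action.DEPRIORITIZE
    return Action.SHOW
```

Keeping the user's own signals ahead of the model means a misclassified friend is still shown, and a muted account stays hidden even when its score is low.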
Holdout metrics (v2.2)
| metric | value |
|---|---|
| average precision | 0.9666 |
| ROC AUC | 0.9800 |
| precision @ recall 0.9 | 0.9270 |
| per-author accuracy | 0.8780 |
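
For readers who want to reproduce a figure like "precision @ recall 0.9" on their own labeled data, here is a minimal sketch. It assumes the common definition: the best precision over all score thresholds whose recall is at least the target. The evaluation script behind the table may define it differently, and `precision_at_recall` is a hypothetical helper name.

```python
def precision_at_recall(labels: list[int], scores: list[float], target_recall: float) -> float:
    """Best precision over all thresholds whose recall is >= target_recall.

    labels: 1 for spammer, 0 for real author. scores: model probabilities.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Walk authors from highest to lowest score; each position is a candidate threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best = 0.0
    true_pos = 0
    for i, (score, label) in enumerate(ranked, start=1):
        true_pos += label
        # A threshold cannot split tied scores, so evaluate only at the end of a tie run.
        if i < len(ranked) and ranked[i][0] == score:
            continue
        if true_pos / total_pos >= target_recall:
            best = max(best, true_pos / i)
    return best
```
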
Limitations
- Reply-focused. Trained on reply notes only; not designed for scoring feed posts or other event kinds.
- English-heavy training data. Non-Latin scripts underrepresented.
- Adversarial drift. Spammers evolve. Retrain periodically.
- Cold start. Accounts with <3 replies have limited signal.
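
One way a client might handle the cold-start case is to abstain rather than act on a weak score. The sketch below assumes the model is exposed as a callable that takes a bundle and returns a probability. `score_or_abstain` and `MIN_REPLIES` are hypothetical names, and abstaining below 3 replies is a client-side policy choice, not a behavior of the model itself.

```python
from typing import Any, Callable, Optional, Sequence

MIN_REPLIES = 3  # below this, the cold-start limitation above applies

Bundle = Sequence[dict[str, Any]]


def score_or_abstain(bundle: Bundle, model: Callable[[Bundle], float]) -> Optional[float]:
    """Return the model score, or None when the bundle is too small to trust."""
    if len(bundle) < MIN_REPLIES:
        return None  # caller should fall back to mutes / follow graph only
    return model(bundle)
```
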
License
MIT.