Outlier
Ternary MoE + Apple Silicon quantization for on-device AI.
Desktop app for Mac. No token caps. Free forever.
| Link | Details |
|---|---|
| Download for Mac | outlier.host — 9.2 MB DMG, v1.4 shipping |
| Join Discord | discord.gg/Hapennmdn9 |
| Support the mission | Founders lifetime $200 · 500-seat cap |
What is Outlier?
A Mac-native AI platform with curated open-weights models for Apple Silicon. Built solo in 20 days on a Mac Studio for under $1,200 in compute spend. Three U.S. provisional patents filed on the underlying ternary MoE architecture.
Two tracks on HuggingFace:
- Research MoE — ternary Mixture-of-Experts overlays on frozen Qwen2.5 bases. 10B, 40B, 70B, 150B scales, MMLU-verified.
- Apple Silicon conversions — MLX 4-bit builds of strong open-weights models, tuned for the Outlier desktop app and usable standalone via mlx_lm.
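The MLX builds can be used outside the desktop app with the mlx-lm Python package. A minimal sketch, assuming Apple Silicon, `pip install mlx-lm`, and the Hub path shown (the exact repo path is inferred from the tier table and may differ):

```python
# Requires Apple Silicon and `pip install mlx-lm`.
from mlx_lm import load, generate

# Load a 4-bit MLX build from the Hugging Face Hub (downloads on first run).
# Repo path is an assumption based on the tier names in this README.
model, tokenizer = load("Outlier-Ai/Outlier-Lite-7B-MLX-4bit")

# Generate a short completion.
text = generate(
    model,
    tokenizer,
    prompt="Explain ternary quantization in one sentence.",
    max_tokens=64,
)
print(text)
```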
Research line (MMLU verified)
All values at n = 14,042 · lm-evaluation-harness v0.4.9.1 · bf16, 5-shot · source weights unchanged since Day 13 (2026-04-13).
| Scale | MMLU | Stderr | Base | Repo |
|---|---|---|---|---|
| Outlier-10B V3.3 | 70.87% | ±0.37% | Qwen2.5-7B-Instruct | Outlier-Ai/Outlier-10B |
| Outlier-40B V3.3 | 77.80% | ±0.33% | Qwen2.5-14B-Instruct | Outlier-Ai/Outlier-40B |
| Outlier-70B V3.3 (alpha-fixed) | 83.10% | ±0.30% | Qwen2.5-32B-Instruct | Outlier-Ai/Outlier-70B-V3.3 |
| Outlier-150B V3.2 | 84.46% | ±0.29% | Qwen2.5-72B-Instruct | Outlier-Ai/Outlier-150B-V3.2 |
Architecture: a shared full-precision FFN plus a gated ternary expert FFN per layer. Overlay checkpoints load on top of frozen Qwen2.5 bases. The 70B V3.3 alpha-fix overlay is 15 KB, trained in 18 minutes on one B200, and adds +1.61 pp MMLU over V3.2.
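The overlay idea above can be sketched as a toy forward pass: the frozen full-precision FFN output is summed with a softmax-gated mixture of expert FFNs whose weights are ternary values in {-1, 0, +1} scaled by a learned alpha. All dimensions, the gating scheme, and the alpha value here are illustrative placeholders, not the patented implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n_experts = 8, 16, 4  # toy dimensions (placeholders)

# Frozen base FFN weights (full precision).
W1, W2 = rng.normal(size=(d, h)), rng.normal(size=(h, d))

# Ternary expert weights in {-1, 0, +1}, shared learned scale alpha.
E1 = rng.integers(-1, 2, size=(n_experts, d, h)).astype(np.float64)
E2 = rng.integers(-1, 2, size=(n_experts, h, d)).astype(np.float64)
alpha = 0.05  # illustrative overlay scale (the "alpha" in the alpha-fix)

# Router producing a softmax gate over experts.
Wg = rng.normal(size=(d, n_experts))

def relu(x):
    return np.maximum(x, 0.0)

def layer(x):
    base = relu(x @ W1) @ W2               # frozen full-precision FFN
    g = np.exp(x @ Wg)
    g = g / g.sum()                        # softmax gate over experts
    expert = sum(g[i] * (relu(x @ E1[i]) @ E2[i]) for i in range(n_experts))
    return base + alpha * expert           # overlay added on top of the base

x = rng.normal(size=d)
y = layer(x)
print(y.shape)  # (8,)
```

Because the expert weights are ternary, each expert matmul reduces to additions, subtractions, and skips, which is what makes the overlay checkpoints so small.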
Shipping tier for Apple Silicon
MLX 4-bit builds, verified on a Mac Studio M1 Ultra (64 GB). Bundled in the Outlier desktop app's tier library.
| Tier | Base | Peak RAM | Speed | Repo |
|---|---|---|---|---|
| Nano | Qwen3-1.7B | ~2 GB | bench pending | Outlier-Nano-1.7B-MLX-4bit |
| Lite | Qwen2.5-7B | 4.47 GB | 71.30 tok/s | Outlier-Lite-7B-MLX-4bit |
| Compact | Qwen2.5-14B | 8.24 GB | 37.26 tok/s | Outlier-Compact-14B-MLX-4bit |
Plus cross-platform GGUF builds for llama.cpp / Ollama / LM Studio / Jan: Lite 7B, Compact 14B, Max 32B.
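The GGUF builds can also be loaded from Python via llama-cpp-python; a sketch, assuming `pip install llama-cpp-python` — the repo id and filename glob below are illustrative assumptions, not confirmed Hub names:

```python
# Cross-platform GGUF via llama-cpp-python (`pip install llama-cpp-python`).
from llama_cpp import Llama

# Downloads the matching GGUF file from the Hub on first run.
llm = Llama.from_pretrained(
    repo_id="Outlier-Ai/Outlier-Lite-7B-GGUF",  # assumed Hub path
    filename="*Q4_K_M.gguf",                    # glob for a 4-bit quant file
)
out = llm("Explain ternary quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```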
Apple Silicon conversions
MLX 4-bit conversions of strong open-weights models, Mac-tuned, upstream-named for HF search discovery:
- DeepSeek R1 Distill — 32B · 14B · Llama-8B · Qwen-7B
- Qwen3 — 32B · 14B · 8B · 4B
- Qwen Coder — Qwen3-Coder-30B-A3B · Qwen2.5-Coder-32B · 14B · 7B
- Other — QwQ-32B · Yi-Coder-9B · Phi-4-mini · gpt-oss-20b · SmolLM3-3B · Gemma 3 27B · Gemma 3 4B
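Conversions of this kind can be reproduced with mlx-lm's convert utility; a sketch, assuming Apple Silicon and `pip install mlx-lm` (the upstream path is one example from the list above, the output path is arbitrary):

```python
# Requires Apple Silicon and `pip install mlx-lm`.
from mlx_lm import convert

# Convert an upstream model to a 4-bit MLX build
# (downloads the source weights on first run).
convert(
    hf_path="Qwen/Qwen3-4B",          # an upstream model from the list above
    mlx_path="./qwen3-4b-mlx-4bit",   # local output directory
    quantize=True,
    q_bits=4,
)
```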
Every conversion is a faithful port of upstream weights — capability credit belongs to the upstream authors. We add the MLX 4-bit packaging and desktop-app integration.
Patents + citation
Architecture, training pipeline, and inference engine covered by US provisional patents 64/026,886, 64/030,368, and 64/034,028 (Kerr & Company LLC, 2026).
```bibtex
@misc{kerr2026outlier,
  title  = {Outlier: Ternary Mixture-of-Experts for On-Device AI},
  author = {Kerr, Matthew},
  year   = {2026},
  url    = {https://huggingface.co/Outlier-Ai}
}
```
Contact
Matt Kerr · outlier.host · @mattkerr09 · matt@outlier.host
Built solo in Grand Rapids, Michigan.