Outlier
Ternary MoE + Apple Silicon quantization for on-device AI.
Desktop app for Mac. No token caps. Free forever.
| Link | Details |
|---|---|
| Download for Mac | outlier.host — 9.2 MB DMG, v1.4 shipping |
| Join Discord | discord.gg/Hapennmdn9 |
| Support the mission | Founders lifetime $200 · 500-seat cap |
What is Outlier?
A Mac-native AI platform with curated open-weights models for Apple Silicon. Built solo in 20 days on a Mac Studio for under $1,200 in compute spend. Three U.S. provisional patents filed on the underlying ternary MoE architecture.
Two tracks on HuggingFace:
- Research MoE — ternary Mixture-of-Experts overlays on frozen Qwen2.5 bases. 10B, 40B, 70B, 150B scales, MMLU-verified.
- Apple Silicon conversions — MLX 4-bit builds of strong open-weights models, tuned for the Outlier desktop app and usable standalone via mlx_lm.
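The MLX builds can be used outside the desktop app with the mlx-lm Python package. A minimal sketch, assuming Apple Silicon, `pip install mlx-lm`, and the Hub path shown (the exact repo path is inferred from the tier table and may differ):

```python
# Requires Apple Silicon and `pip install mlx-lm`.
from mlx_lm import load, generate

# Load a 4-bit MLX build from the Hugging Face Hub (downloads on first run).
# Repo path is an assumption based on the tier names in this README.
model, tokenizer = load("Outlier-Ai/Outlier-Lite-7B-MLX-4bit")

# Generate a short completion.
text = generate(
    model,
    tokenizer,
    prompt="Explain ternary quantization in one sentence.",
    max_tokens=64,
)
print(text)
```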
Research line (MMLU verified)
All values at n = 14,042 · lm-evaluation-harness v0.4.9.1 · bf16, 5-shot · source weights unchanged since Day 13 (2026-04-13).
| Scale | MMLU | Stderr | Base | Repo |
|---|---|---|---|---|
| Outlier-10B V3.3 | 70.87% | ±0.37% | Qwen2.5-7B-Instruct | Outlier-Ai/Outlier-10B |
| Outlier-40B V3.3 | 77.80% | ±0.33% | Qwen2.5-14B-Instruct | Outlier-Ai/Outlier-40B |
| Outlier-70B V3.3 (alpha-fixed) | 83.10% | ±0.30% | Qwen2.5-32B-Instruct | Outlier-Ai/Outlier-70B-V3.3 |
| Outlier-150B V3.2 | 84.46% | ±0.29% | Qwen2.5-72B-Instruct | Outlier-Ai/Outlier-150B-V3.2 |
Architecture: a shared full-precision FFN plus a gated ternary expert FFN per layer. Overlay checkpoints load on top of frozen Qwen2.5 bases. The 70B V3.3 alpha-fix overlay is 15 KB, trained in 18 minutes on one B200, and adds +1.61 pp MMLU over V3.2.
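The overlay idea above can be sketched as a toy forward pass: the frozen full-precision FFN output is summed with a softmax-gated mixture of expert FFNs whose weights are ternary values in {-1, 0, +1} scaled by a learned alpha. All dimensions, the gating scheme, and the alpha value here are illustrative placeholders, not the patented implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, n_experts = 8, 16, 4  # toy dimensions (placeholders)

# Frozen base FFN weights (full precision).
W1, W2 = rng.normal(size=(d, h)), rng.normal(size=(h, d))

# Ternary expert weights in {-1, 0, +1}, shared learned scale alpha.
E1 = rng.integers(-1, 2, size=(n_experts, d, h)).astype(np.float64)
E2 = rng.integers(-1, 2, size=(n_experts, h, d)).astype(np.float64)
alpha = 0.05  # illustrative overlay scale (the "alpha" in the alpha-fix)

# Router producing a softmax gate over experts.
Wg = rng.normal(size=(d, n_experts))

def relu(x):
    return np.maximum(x, 0.0)

def layer(x):
    base = relu(x @ W1) @ W2               # frozen full-precision FFN
    g = np.exp(x @ Wg)
    g = g / g.sum()                        # softmax gate over experts
    expert = sum(g[i] * (relu(x @ E1[i]) @ E2[i]) for i in range(n_experts))
    return base + alpha * expert           # overlay added on top of the base

x = rng.normal(size=d)
y = layer(x)
print(y.shape)  # (8,)
```

Because the expert weights are ternary, each expert matmul reduces to additions, subtractions, and skips, which is what makes the overlay checkpoints so small.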
Shipping tier for Apple Silicon
MLX 4-bit builds, verified on a Mac Studio M1 Ultra (64 GB). Bundled in the Outlier desktop app's tier library.
| Tier | Base | Peak RAM | Speed | Repo |
|---|---|---|---|---|
| Nano | Qwen3-1.7B | ~2 GB | bench pending | Outlier-Nano-1.7B-MLX-4bit |
| Lite | Qwen2.5-7B | 4.47 GB | 71.30 tok/s | Outlier-Lite-7B-MLX-4bit |
| Compact | Qwen2.5-14B | 8.24 GB | 37.26 tok/s | Outlier-Compact-14B-MLX-4bit |
Plus cross-platform GGUF builds for llama.cpp / Ollama / LM Studio / Jan: Lite 7B, Compact 14B, Max 32B.
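The GGUF builds can also be loaded from Python via llama-cpp-python; a sketch, assuming `pip install llama-cpp-python` — the repo id and filename glob below are illustrative assumptions, not confirmed Hub names:

```python
# Cross-platform GGUF via llama-cpp-python (`pip install llama-cpp-python`).
from llama_cpp import Llama

# Downloads the matching GGUF file from the Hub on first run.
llm = Llama.from_pretrained(
    repo_id="Outlier-Ai/Outlier-Lite-7B-GGUF",  # assumed Hub path
    filename="*Q4_K_M.gguf",                    # glob for a 4-bit quant file
)
out = llm("Explain ternary quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```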
Apple Silicon conversions
MLX 4-bit conversions of strong open-weights models, Mac-tuned, upstream-named for HF search discovery:
- DeepSeek R1 Distill — 32B · 14B · Llama-8B · Qwen-7B
- Qwen3 — 32B · 14B · 8B · 4B
- Qwen Coder — Qwen3-Coder-30B-A3B · Qwen2.5-Coder-32B · 14B · 7B
- Other — QwQ-32B · Yi-Coder-9B · Phi-4-mini · gpt-oss-20b · SmolLM3-3B · Gemma 3 27B · Gemma 3 4B
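Conversions of this kind can be reproduced with mlx-lm's convert utility; a sketch, assuming Apple Silicon and `pip install mlx-lm` (the upstream path is one example from the list above, the output path is arbitrary):

```python
# Requires Apple Silicon and `pip install mlx-lm`.
from mlx_lm import convert

# Convert an upstream model to a 4-bit MLX build
# (downloads the source weights on first run).
convert(
    hf_path="Qwen/Qwen3-4B",          # an upstream model from the list above
    mlx_path="./qwen3-4b-mlx-4bit",   # local output directory
    quantize=True,
    q_bits=4,
)
```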
Every conversion is a faithful port of upstream weights — capability credit belongs to the upstream authors. We add the MLX 4-bit packaging and desktop-app integration.
Patents + citation
Architecture, training pipeline, and inference engine covered by US provisional patents 64/026,886, 64/030,368, and 64/034,028 (Kerr & Company LLC, 2026).
```bibtex
@misc{kerr2026outlier,
  title  = {Outlier: Ternary Mixture-of-Experts for On-Device AI},
  author = {Kerr, Matthew},
  year   = {2026},
  url    = {https://huggingface.co/Outlier-Ai}
}
```
Contact
Matt Kerr · outlier.host · @mattkerr09 · matt@outlier.host
Built solo in Grand Rapids, Michigan.