1.58-bit FLUX
Paper: arXiv:2412.18653
Reproduction of 1.58-bit FLUX (Yang et al., 2024). Ternary ({-1, 0, +1}) quantization of FLUX.1-dev transformer with LoRA compensation, trained via offline flow-matching distillation.
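The ternary scheme follows the BitNet b1.58 style: each weight tensor is scaled by its mean absolute value, then rounded and clamped to {-1, 0, +1}. A minimal sketch of that quantizer (the repo's `quantize_to_ternary` may differ in detail, e.g. per-channel scales):

```python
import torch

def absmean_ternary(w: torch.Tensor):
    """Quantize a weight tensor to {-1, 0, +1} with an absmean scale.

    BitNet b1.58-style scheme (assumed here; the repo's implementation
    may differ): divide by the mean absolute value, round, clamp.
    """
    scale = w.abs().mean().clamp(min=1e-8)
    q = (w / scale).round().clamp(-1, 1)
    return q, scale

w = torch.randn(64, 64)
q, scale = absmean_ternary(w)
# q now holds only values from {-1, 0, +1}; q * scale approximates w.
```

The LoRA adapters added alongside each quantized layer then absorb the rounding error during distillation, which is what the checkpoints below store.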
| Model | LoRA Rank | OOD CLIP (% of BF16) | Aesthetic | LPIPS (vs. BF16) |
|---|---|---|---|---|
| BF16 (baseline) | – | 100% | 5.842 | – |
| V9b (best r64) | 64 | 90.0% | 5.939 | 0.664 |
| V10b (best r128) | 128 | 90.4% | 5.686 | 0.719 |
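The OOD CLIP column is the quantized model's CLIP score on held-out prompts as a fraction of the BF16 baseline's score. CLIP score itself is the scaled, non-negative cosine similarity between image and text embeddings; a sketch of the metric given precomputed embeddings (running an actual CLIP encoder is omitted, so the tensors below are dummies):

```python
import torch
import torch.nn.functional as F

def clip_score(image_emb: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
    # CLIP score: 100 * max(0, cos_sim(image, prompt)), averaged over prompts.
    sim = F.cosine_similarity(image_emb, text_emb, dim=-1)
    return (100.0 * sim).clamp(min=0).mean()

# Dummy embeddings stand in for real CLIP features of generated images.
text_emb = torch.randn(8, 512)
student_emb = torch.randn(8, 512)   # images from the ternary model
teacher_emb = torch.randn(8, 512)   # images from the BF16 baseline
ood_clip_pct = clip_score(student_emb, text_emb) / clip_score(teacher_emb, text_emb)
```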
| File | Version | Rank | Steps | OOD CLIP |
|---|---|---|---|---|
| `ternary_distilled_r64_res1024_s4000_fm_lpips1e-01.pt` | V7 | 64 | 4,000 | 88.9% |
| `ternary_distilled_r64_res1024_s6000_fm_lpips1e-01.pt` | V9b | 64 | 6,000 | 90.0% |
| `ternary_distilled_r64_res1024_s8000_fm_lpips1e-01.pt` | V9c | 64 | 8,000 | 88.8% |
| `ternary_distilled_r128_res1024_s6000_fm_lpips1e-01.pt` | V10 | 128 | 6,000 | 87.7% |
| `ternary_distilled_r128_res1024_s12000_fm_lpips1e-01.pt` | V10b | 128 | 12,000 | 90.4% |
| File | Prompts | Images | Description |
|---|---|---|---|
| `teacher_dataset_v7.pt` | 1,002 | 1,374 | V7 teacher latents |
| `teacher_dataset_v9b_combined.pt` | 2,132 | 2,504 | V9b combined (V7 + 1,130 new) |
| `teacher_dataset_v9c_combined.pt` | 4,007 | 4,379 | V9c combined (V9b + 1,875 new) |
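These cached teacher outputs drive the offline distillation loop: the ternary student is shown the same noised latents and regresses its velocity prediction onto the stored BF16 teacher prediction (the `lpips1e-01` checkpoints add an LPIPS perceptual term with weight 0.1, not shown). A schematic training step, assuming the dataset stores `(x_t, t, teacher_v)` triples; all names here are illustrative:

```python
import torch
import torch.nn.functional as F

def distill_step(student, x_t, t, teacher_v):
    # Offline flow-matching distillation: the cached dataset provides noised
    # latents x_t (rectified-flow interpolation x_t = (1 - t) * x0 + t * eps),
    # timesteps t, and the BF16 teacher's velocity prediction teacher_v.
    # The ternary student is trained to match the teacher's output rather
    # than the analytic flow-matching target eps - x0.
    return F.mse_loss(student(x_t, t), teacher_v)

# Toy student standing in for the quantized FLUX transformer.
student = lambda x, t: torch.zeros_like(x)
x_t = torch.randn(2, 16, 64, 64)
t = torch.rand(2, 1, 1, 1)
teacher_v = torch.randn_like(x_t)
loss = distill_step(student, x_t, t, teacher_v)
```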
```python
import torch
from diffusers import FluxPipeline
from models.ternary import quantize_to_ternary

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Replace transformer weights with ternary values + LoRA adapters
quantize_to_ternary(pipe.transformer, lora_rank=128, svd_init=False)

# Load the distilled checkpoint into the quantized transformer
ckpt = torch.load(
    "ternary_distilled_r128_res1024_s12000_fm_lpips1e-01.pt",
    map_location="cuda",
    weights_only=True,
)
params = dict(pipe.transformer.named_parameters())
for name, tensor in ckpt.items():
    if name in params:
        params[name].data.copy_(tensor.to(torch.bfloat16))

# Generate
image = pipe(
    "A majestic lion resting on a savanna at golden hour",
    height=1024, width=1024, num_inference_steps=30, guidance_scale=3.5,
).images[0]
```
Empirical scaling law fit across dataset sizes:

OOD CLIP % = 0.0115 × log₂(prompts) + 0.7744

```bibtex
@misc{ugonfor2026ternaryflux,
  title={Reproducing 1.58-Bit FLUX: Ternary Quantization with LoRA Compensation},
  author={Ugon For},
  year={2026},
  url={https://github.com/ugonfor/1.58bit-flux}
}
```
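The scaling fit can be sanity-checked against the dataset table: V7's 1,002 prompts predict 0.0115 × log₂(1002) + 0.7744 ≈ 0.889, matching the measured 88.9% for the V7 checkpoint. Evaluating the fit at the three dataset sizes:

```python
import math

def predicted_ood_clip(prompts: int) -> float:
    # Log-linear fit from this repo: OOD CLIP % = 0.0115 * log2(prompts) + 0.7744
    return 0.0115 * math.log2(prompts) + 0.7744

for n in (1_002, 2_132, 4_007):
    print(f"{n:>5} prompts -> {predicted_ood_clip(n):.1%}")
# 1002 -> 88.9%, 2132 -> 90.2%, 4007 -> 91.2%
```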
Based on 1.58-bit FLUX by Yang et al. Base model: FLUX.1-dev by Black Forest Labs.