Gemma 4 31B Opus Reasoning Adapter v1
This is a private QLoRA adapter for google/gemma-4-31B-it, fine-tuned on a cleaned subset of Crownelius/Opus-4.6-Reasoning-2100x-formatted.
The goal of this run was simple: produce a Gemma 4 31B reasoning adapter trained only on Opus-style reasoning data, without mixing in unrelated instruction corpora or agent traces.
Base Model
- Base model:
google/gemma-4-31B-it - Adapter type: LoRA / QLoRA (
peft) - Quantization: 4-bit NF4
- Precision: BF16 compute
Dataset
Source dataset:
Local filtering applied before training:
- Removed duplicate user prompts
- Removed obviously bad prompt families and formatting noise
- Kept reasoning-style rows only
Final local dataset stats:
- Rows in source:
2160 - Rows kept:
2025 - Train rows:
1924 - Validation rows:
101 - Category mix:
1899math,126code
Training Setup
- Max sequence length:
4096 - Epochs:
2 - Learning rate:
1e-4 - Per-device batch size:
1 - Gradient accumulation:
8 - Hardware: NVIDIA GH200
LoRA target modules were adapted for Gemma 4 wrapped linear layers:
q_proj.lineark_proj.linearv_proj.linearo_proj.lineargate_proj.linearup_proj.lineardown_proj.linear
Validation Metrics
Final metrics from the completed run:
- Eval loss:
3.6018 - Eval perplexity:
36.66 - Train runtime:
3723s - Epochs completed:
2.0
Published Base-Model Reference Benchmarks
The table below is included for context and comes from Google's official Gemma 4 31B Instruct model card. These are published base-model reference scores for google/gemma-4-31B-it, not adapter-specific evaluation results for this repository.
| Benchmark | Gemma 4 31B | Gemma 3 27B (no think) |
|---|---|---|
| MMLU-Pro | 85.2% | 67.6% |
| AIME 2026 no tools | 89.2% | 20.8% |
| LiveCodeBench v6 | 80.0% | 29.1% |
| Codeforces Elo | 2150 | 110 |
| GPQA Diamond | 84.3% | 42.4% |
| Tau2 (average over 3) | 76.9% | 16.2% |
| HLE no tools | 19.5% | - |
| HLE with search | 26.5% | - |
| BigBench Extra Hard | 74.4% | 19.3% |
| MMMLU | 88.4% | 70.7% |
| MMMU Pro | 76.9% | 49.7% |
| OmniDocBench 1.5 (lower is better) | 0.131 | 0.365 |
| MATH-Vision | 85.6% | 46.0% |
| MRCR v2 8 needle 128k (average) | 66.4% | 13.5% |
Source:
Usage
This repository contains a PEFT adapter, not a fully merged standalone model.
Load it with:
- Base model:
google/gemma-4-31B-it - Adapter: this repository
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch
base_id = "google/gemma-4-31B-it"
adapter_id = "kai-os/gemma4-opus-reasoning-adapter-v1"
bnb = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
base_id,
device_map="auto",
quantization_config=bnb,
torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base, adapter_id)
Notes
- This is a reasoning-focused adapter, not a benchmark-optimized release.
- The benchmark table above is for the published base model, not this adapter.
- It is best treated as an experimental distilled reasoning adapter.
Acknowledgements
- Google for Gemma 4
- The Opus reasoning dataset authors and maintainers
- Hugging Face
transformers,peft, anddatasets
- Downloads last month
- 256
Model tree for kai-os/gemma4-31b-Opus-4.6-reasoning
Base model
google/gemma-4-31B-it