Gemma 4 31B Opus Reasoning Adapter v1

This is a private QLoRA adapter for google/gemma-4-31B-it, fine-tuned on a cleaned subset of Crownelius/Opus-4.6-Reasoning-2100x-formatted.

The goal of this run was simple: produce a Gemma 4 31B reasoning adapter trained only on Opus-style reasoning data, without mixing in unrelated instruction corpora or agent traces.

Base Model

  • Base model: google/gemma-4-31B-it
  • Adapter type: LoRA / QLoRA (peft)
  • Quantization: 4-bit NF4
  • Precision: BF16 compute

Dataset

Source dataset: Crownelius/Opus-4.6-Reasoning-2100x-formatted

Local filtering applied before training:

  • Removed duplicate user prompts
  • Removed obviously bad prompt families and formatting noise
  • Kept reasoning-style rows only
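
The duplicate-prompt step above can be sketched as follows. This is a minimal illustration, not the script used for this run: the row layout, the `prompt` field name, and the case/whitespace normalization are all assumptions, since the card does not describe how duplicates were detected.

```python
def normalize(prompt: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(prompt.lower().split())


def drop_duplicate_prompts(rows, key="prompt"):
    """Keep the first row for each normalized prompt, preserving order."""
    seen = set()
    kept = []
    for row in rows:
        fingerprint = normalize(row[key])
        if fingerprint in seen:
            continue  # a row with the same user prompt was already kept
        seen.add(fingerprint)
        kept.append(row)
    return kept


# Toy rows with a hypothetical schema; the real dataset columns may differ.
rows = [
    {"prompt": "Prove that 2 + 2 = 4.", "response": "..."},
    {"prompt": "prove that  2 + 2 = 4.", "response": "..."},
    {"prompt": "Write a function that reverses a list.", "response": "..."},
]
unique_rows = drop_duplicate_prompts(rows)
print(len(unique_rows))  # 2
```

Keeping the first occurrence makes the result deterministic for a fixed row order.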

Final local dataset stats:

  • Rows in source: 2160
  • Rows kept: 2025
  • Train rows: 1924
  • Validation rows: 101
  • Category mix: 1899 math, 126 code
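
The counts above are consistent with a 5% validation hold-out (1924 + 101 = 2025 and 1899 + 126 = 2025). The sketch below reproduces those sizes under that assumption. The 5% fraction, the seed, and the shuffle-then-slice approach are guesses for illustration; the card reports only the resulting row counts.

```python
import random


def split_rows(rows, val_fraction=0.05, seed=0):
    """Shuffle deterministically, then hold out the first slice for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)  # floor of 5% of the kept rows
    return shuffled[n_val:], shuffled[:n_val]


rows = list(range(2025))  # stand-ins for the 2025 kept rows
train, val = split_rows(rows)
print(len(train), len(val))  # 1924 101

# The category mix should add up to the kept rows as well.
assert 1899 + 126 == len(rows)
```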

Training Setup

  • Max sequence length: 4096
  • Epochs: 2
  • Learning rate: 1e-4
  • Per-device batch size: 1
  • Gradient accumulation: 8
  • Hardware: NVIDIA GH200
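
For orientation, the settings above imply the following effective batch size and approximate step counts. This is a back-of-the-envelope sketch that assumes a single GH200 device, which the card does not state explicitly. The exact number of optimizer updates also depends on how the trainer handles the final partial accumulation window.

```python
import math

train_rows = 1924          # from the dataset stats in this card
per_device_batch_size = 1
grad_accum_steps = 8
num_devices = 1            # assumption: one GH200
epochs = 2
max_seq_len = 4096

# Sequences contributing to each optimizer update.
effective_batch = per_device_batch_size * grad_accum_steps * num_devices

# Upper bound on updates per epoch (the last window may be partial).
steps_per_epoch = math.ceil(train_rows / effective_batch)
total_steps = steps_per_epoch * epochs

# Upper bound on tokens per update if every sequence hits the length cap.
max_tokens_per_step = effective_batch * max_seq_len

print(effective_batch, steps_per_epoch, total_steps, max_tokens_per_step)
# 8 241 482 32768
```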

Because Gemma 4 wraps its projection layers, the LoRA target modules name the inner linear submodule of each wrapper:

  • q_proj.linear
  • k_proj.linear
  • v_proj.linear
  • o_proj.linear
  • gate_proj.linear
  • up_proj.linear
  • down_proj.linear
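
A configuration along these lines would express the target list above. Only the module names come from this card. The rank, alpha, and dropout values are placeholders, because the card does not report them, and the dictionary is kept separate from `peft` so the snippet runs without it installed.

```python
# Projection names from the list above; the ".linear" suffix targets the
# inner layer of each wrapper module, as described in this card.
PROJECTIONS = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]
target_modules = [f"{name}.linear" for name in PROJECTIONS]

lora_kwargs = {
    "r": 16,               # placeholder: rank is not stated in this card
    "lora_alpha": 32,      # placeholder
    "lora_dropout": 0.05,  # placeholder
    "bias": "none",
    "target_modules": target_modules,
    "task_type": "CAUSAL_LM",
}

# With peft installed, the config would be built as:
#   from peft import LoraConfig
#   lora_config = LoraConfig(**lora_kwargs)
print(target_modules)
```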

Validation Metrics

Final metrics from the completed run:

  • Eval loss: 3.6018
  • Eval perplexity: 36.66
  • Train runtime: 3723 s (about 62 minutes)
  • Epochs completed: 2.0
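
The two evaluation numbers above are related: perplexity is the exponential of the loss. The check below assumes the reported eval loss is a mean per-token cross-entropy in nats.

```python
import math

eval_loss = 3.6018                 # reported above
perplexity = math.exp(eval_loss)   # perplexity = e ** loss

print(round(perplexity, 2))  # 36.66
```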

Published Base-Model Reference Benchmarks

The table below is included for context and comes from Google's official Gemma 4 31B Instruct model card. These are published base-model reference scores for google/gemma-4-31B-it, not adapter-specific evaluation results for this repository.

| Benchmark                           | Gemma 4 31B | Gemma 3 27B (no think) |
|-------------------------------------|-------------|------------------------|
| MMLU-Pro                            | 85.2%       | 67.6%                  |
| AIME 2026 (no tools)                | 89.2%       | 20.8%                  |
| LiveCodeBench v6                    | 80.0%       | 29.1%                  |
| Codeforces Elo                      | 2150        | 110                    |
| GPQA Diamond                        | 84.3%       | 42.4%                  |
| Tau2 (average over 3)               | 76.9%       | 16.2%                  |
| HLE (no tools)                      | 19.5%       | -                      |
| HLE (with search)                   | 26.5%       | -                      |
| BigBench Extra Hard                 | 74.4%       | 19.3%                  |
| MMMLU                               | 88.4%       | 70.7%                  |
| MMMU Pro                            | 76.9%       | 49.7%                  |
| OmniDocBench 1.5 (lower is better)  | 0.131       | 0.365                  |
| MATH-Vision                         | 85.6%       | 46.0%                  |
| MRCR v2 8 needle 128k (average)     | 66.4%       | 13.5%                  |

Source: Google's official Gemma 4 31B Instruct model card.

Usage

This repository contains a PEFT adapter, not a fully merged standalone model.

Load it with:

  • Base model: google/gemma-4-31B-it
  • Adapter: this repository

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Base checkpoint and adapter repository ids as given in this card;
# substitute the actual adapter repo id if it differs.
base_id = "google/gemma-4-31B-it"
adapter_id = "kai-os/gemma4-opus-reasoning-adapter-v1"

# 4-bit NF4 quantization with BF16 compute, matching the training setup.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the quantized base model first, then attach the adapter weights.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    device_map="auto",
    quantization_config=bnb,
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(base, adapter_id)

Notes

  • This is a reasoning-focused adapter, not a benchmark-optimized release.
  • The benchmark table above is for the published base model, not this adapter.
  • It is best treated as an experimental distilled reasoning adapter.

Acknowledgements

  • Google for Gemma 4
  • The Opus reasoning dataset authors and maintainers
  • Hugging Face transformers, peft, and datasets