Gemma-4-E2B Hard Reasoning

A fine-tuned version of Google's Gemma-4-E2B-it specialized for complex reasoning tasks with chain-of-thought (CoT) capabilities.

Model Description

This model was fine-tuned on 13,181 high-quality reasoning examples spanning mathematics, science, logic, and analytical thinking. The training data includes detailed thinking processes wrapped in <think> tags, teaching the model to show its reasoning step-by-step.

Key Features

  • 🧠 Chain-of-thought reasoning: Model outputs thinking process in <think> tags
  • 📚 Multi-domain expertise: Math, physics, computer science, logic, philosophy
  • 🎯 Hard problems focus: Trained on "hardcore" difficulty examples
  • ⚡ Efficient LoRA fine-tuning: Only 32M trainable parameters

Training Details

Parameter            Value
-------------------  --------------------------
Base Model           google/gemma-4-E2B-it
Model Parameters     E2B
Training Method      LoRA (Low-Rank Adaptation)
LoRA Rank            16
LoRA Alpha           32
Epochs               1
Batch Size           None × 4 (effective)
Learning Rate        0.0002
Max Sequence Length  2048
Precision            4-bit quantization
Framework            Unsloth + TRL
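The rank and alpha values above determine both the adapter's scaling factor and its parameter count. A minimal sketch of the arithmetic, using a hypothetical 2048×2048 layer purely for illustration (the actual layer shapes of Gemma-4-E2B are not stated in this card):

```python
# LoRA replaces the frozen update to W with a low-rank product B @ A,
# scaled by alpha / r. Values below match the training table (r=16, alpha=32).
r, alpha = 16, 32
scaling = alpha / r  # 2.0

d_in, d_out = 2048, 2048  # hypothetical layer size, for illustration only
# Trainable parameters per adapted matrix: A is (r x d_in), B is (d_out x r)
trainable = r * d_in + d_out * r   # 65,536
frozen = d_in * d_out              # 4,194,304
ratio = frozen // trainable        # 64 -> LoRA trains ~1.6% of this layer
```

Summing adapters of this size over the model's attention and MLP projections is how the total lands in the tens of millions of parameters, as reported above.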

Training Results

Metric         Value
-------------  ----------
Final Loss     1.8077
Training Time  3.55 hours
Steps          3295

Datasets

This model was trained on two high-quality reasoning datasets:

  1. gemini-3.1-pro-hard-high-reasoning (3,150 samples)
  2. gemini-3-pro-10000x-hard-high-reasoning (10,031 samples)

Total: 13,181 reasoning examples

Data Format

Each example contains:

  • Domain: e.g., "Analytic Number Theory", "Quantum Mechanics"
  • Concept: Specific topic within the domain
  • Difficulty: Primarily "Hardcore" level
  • Text: The problem or question
  • Model Thoughts: Detailed reasoning process (used in <think> tags)
  • Model Response: Final answer with explanation
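To illustrate how these fields map onto a training conversation, here is a sketch of assembling one example into a user/assistant message pair with the reasoning wrapped in <think> tags. The field names and example row are hypothetical, following the list above; the actual dataset column names may differ:

```python
# Hypothetical example row; keys follow the "Data Format" list above.
example = {
    "text": "Prove that the square root of 2 is irrational.",
    "model_thoughts": "Assume sqrt(2) = p/q in lowest terms...",
    "model_response": "Suppose, for contradiction, that sqrt(2) is rational...",
}

# The assistant turn contains the reasoning in <think> tags,
# followed by the final answer, as described in the model card.
assistant_turn = (
    f"<think>\n{example['model_thoughts']}\n</think>\n\n"
    f"{example['model_response']}"
)

messages = [
    {"role": "user", "content": example["text"]},
    {"role": "assistant", "content": assistant_turn},
]
```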

Usage

With Transformers + PEFT

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-E2B-it",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E2B-it")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Ayodele01/gemma4-e2b-Gemini-3.1-Pro-Reasoning-Distill")

# Generate with thinking
messages = [{
    "role": "user",
    "content": "Prove that the square root of 2 is irrational."
}]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With Unsloth (Faster)

from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    "YOUR_USERNAME/gemma4-e2b-Gemini-3.1-Pro-Reasoning-Distill",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Use the same generation code as above
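Since the model emits its reasoning inside <think> tags, it is often useful to separate the thoughts from the final answer after generation. A minimal sketch (the helper name is our own, not part of any library API):

```python
def split_reasoning(generated: str):
    """Split generated text into (thoughts, answer).

    Returns an empty thoughts string when no <think> tags are present.
    """
    if "<think>" in generated and "</think>" in generated:
        before, _, rest = generated.partition("<think>")
        thoughts, _, answer = rest.partition("</think>")
        return thoughts.strip(), (before + answer).strip()
    return "", generated.strip()

sample = "<think>\nEuclid's argument by contradiction...\n</think>\n\n**Proof:** ..."
thoughts, answer = split_reasoning(sample)
```

This lets you log or display the chain of thought separately while showing only the final answer to end users.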

Example Output

Input:

**Domain:** Number Theory
**Concept:** Proof by Contradiction

Prove that there are infinitely many prime numbers.

Output:

<think>
I need to prove there are infinitely many primes. The classic approach is Euclid's proof 
by contradiction...

Assume there are finitely many primes: p₁, p₂, ..., pₙ

Consider N = (p₁ × p₂ × ... × pₙ) + 1

N is either prime or composite. If prime, we found a new prime not in our list.
If composite, it must have a prime factor. But N leaves remainder 1 when divided 
by any pᵢ, so its prime factors aren't in our list.

Either way, contradiction! So there must be infinitely many primes.
</think>

**Proof (Euclid's Theorem):**

Suppose, for contradiction, that there are only finitely many primes: p₁, p₂, ..., pₙ.

Consider the number N = p₁p₂...pₙ + 1.

Since N > 1, either N is prime or N has a prime factorization. 

- If N is prime, then N is a prime not in our list (since N > pᵢ for all i), contradiction.
- If N is composite, let p be a prime factor of N. Then p must be one of p₁, ..., pₙ.
  But N ≡ 1 (mod pᵢ) for all i, so p cannot divide N, contradiction.

Therefore, there must be infinitely many primes. ∎
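Euclid's construction above can also be checked numerically: for any finite list of primes, N = p₁p₂···pₖ + 1 leaves remainder 1 modulo each pᵢ, so its smallest prime factor lies outside the list. A short sketch:

```python
def smallest_prime_factor(n: int) -> int:
    """Return the smallest prime factor of n (n itself if n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

primes = [2, 3, 5, 7, 11, 13]
for k in range(1, len(primes) + 1):
    subset = primes[:k]
    N = 1
    for p in subset:
        N *= p
    N += 1
    # N ≡ 1 (mod p) for every p in the subset, so no p in the
    # subset divides N -- its prime factors are new primes.
    assert all(N % p == 1 for p in subset)
    assert smallest_prime_factor(N) not in subset
```

Note that N itself need not be prime: 2·3·5·7·11·13 + 1 = 30031 = 59 × 509, but both factors are primes outside the original list, exactly as the proof requires.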

Limitations

  • May occasionally produce incorrect mathematical derivations
  • Performance varies by domain (strongest in pure math, logic)
  • Long responses may become repetitive
  • Outputs should be independently verified before use in critical applications

License

This model inherits the Gemma license from the base model.

Citation

@misc{gemma4-Gemini-3.1-Pro-Reasoning-Distill-2026,
  title={Gemma-4-E2B Hard Reasoning},
  author={Fine-tuned with Unsloth},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/Ayodele01/gemma4-e2b-Gemini-3.1-Pro-Reasoning-Distill}
}
