Be sure to set a system prompt telling the model it is uncensored or abliterated; otherwise it falls back to Google's baked-in system prompt and will act censored. If it believes it is stock Gemma, it will imitate how it thinks Gemma would behave.
---
base_model: google/gemma-4-31b-it
library_name: transformers
tags:
- gemma-4
- abliterated
- uncensored
- orthogonal-projection
- 31b
license: apache-2.0
---
# Gemma-4-31B-it-Abliterated
This is a fully uncensored, abliterated version of Google's Gemma-4-31B-it.
By applying Orthogonalized Representation Intervention to the model's residual stream, the built-in refusal and safety alignment vectors have been mathematically erased. This model retains the state-of-the-art dense reasoning and context-following capabilities of the native Gemma 4 31B architecture, but will not refuse instructions or break character to deliver safety lectures.
## 🛠️ Methodology & Architectural Discoveries
Gemma 4 introduces a new multimodal architecture (Text, Vision, Audio) that changes how the transformers library handles layer mapping. Standard abliteration scripts built for Gemma 2/3 will crash due to nested text_config attributes and mismatched sequence lengths.
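The nested `text_config` problem has a straightforward workaround: resolve the decoder layer stack dynamically instead of hard-coding the Gemma 2/3 attribute path. A minimal sketch follows; the attribute names (`language_model`, `model`, `layers`) are assumptions based on common `transformers` layouts for multimodal checkpoints, not a guaranteed Gemma 4 API:

```python
def get_text_layers(model):
    """Return the text decoder layer list, whether or not the checkpoint
    wraps its language model inside a multimodal container."""
    inner = getattr(model, "model", model)
    # Multimodal releases often nest the text stack one level deeper.
    if hasattr(inner, "language_model"):
        inner = inner.language_model
    if hasattr(inner, "model"):
        inner = inner.model
    return inner.layers
```

The same resolver works for hooking hidden states and for locating the projection matrices to edit later.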
During the extraction of the hidden states (using mlabonne/harmful_behaviors vs mlabonne/harmless_alpaca), we mapped the refusal direction across the entire 31B layer stack.
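In mlabonne's approach, the refusal direction at each layer is the difference between the mean hidden state over harmful prompts and the mean over harmless prompts, normalized to unit length. A minimal sketch of that step (tensor shapes are assumptions):

```python
import torch

def refusal_direction(harmful_acts, harmless_acts):
    """Difference-of-means refusal direction at one layer, unit-normalized.

    harmful_acts / harmless_acts: (n_samples, hidden_dim) hidden states
    captured at the same layer and token position for the two prompt sets.
    """
    direction = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
    return direction / direction.norm()
```

Ranking the per-layer directions by magnitude (the "refusal mass") is what surfaces the peak layer reported below.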
Key Discovery: The Gemma 4 31B architecture pushes its safety alignment to the very end of the network. The peak refusal mass was detected at Layer 59 (the final transformer layer before the output projection).
The orthogonal projection was applied to the o_proj and down_proj matrices of this terminal layer, effectively severing the refusal mechanism without degrading the model's foundational logic, grammar, or world-modeling layers.
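Concretely, for a weight matrix W that writes into the residual stream (such as `o_proj` or `down_proj`), the edit removes the component along the unit refusal direction r: W' = (I − r rᵀ) W. A minimal sketch, independent of any particular model class:

```python
import torch

def orthogonalize(weight, direction):
    """Project the refusal direction out of a residual-stream-writing
    weight matrix: W' = (I - r r^T) W, with r unit-normalized.

    weight: (hidden_dim, in_features), direction: (hidden_dim,)
    """
    r = direction / direction.norm()
    return weight - torch.outer(r, r @ weight)
```

After this edit, no input can produce an output component along r from the modified matrix, which is why only the terminal layer's projections need touching here.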
## 💻 Usage
This repository contains the full uncompressed .safetensors weights, as well as GGUF quantized versions for local deployment via llama.cpp, LM Studio, or Ollama.
Recommended Quants:
- Q8_0: Effectively lossless reasoning with the best balance of quality and VRAM use (~32.6GB).
- Q4_K_M: Highly efficient for consumer hardware; easily fits on a single 24GB GPU (~18.7GB).
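The sizes above can be sanity-checked with back-of-envelope arithmetic. The bits-per-weight figures below are approximate llama.cpp averages (roughly 8.5 bpw for Q8_0 and 4.85 bpw for Q4_K_M, including quantization scales) and are assumptions, not measured values:

```python
PARAMS = 31e9  # taken from the model name

def gguf_size_gb(bits_per_weight, params=PARAMS):
    """Approximate quantized file size in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

q8 = gguf_size_gb(8.5)   # roughly 33 GB, in line with the ~32.6GB listed
q4 = gguf_size_gb(4.85)  # roughly 19 GB, in line with the ~18.7GB listed
```

Actual file sizes vary slightly because embedding and output layers are often kept at higher precision.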
## The Bespoke Abliteration Script
Because standard scripts fail on Gemma 4, the custom Python script used to perform this exact abliteration (gemma4_31b_abliterator.py) is included in the files of this repository. It features:
- VRAM-safe batched hidden state extraction (stays within the memory of 96GB consumer GPUs).
- Native Gemma 4 Chat Template integration (crucial for activating the instruction circuits properly).
- Dynamic multimodal layer hunting.
- Corrected linear algebra for the `16384 -> 5376` multi-query attention projections.
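The chat-template point deserves emphasis: refusal circuits fire on instruction-formatted input, so raw prompts under-activate them during extraction. A minimal sketch of the wrapping step, using the turn markers published for earlier Gemma releases (whether Gemma 4 keeps exactly this template is an assumption; the authoritative source is the repo's tokenizer config):

```python
def wrap_gemma_turn(prompt: str) -> str:
    """Format a single user prompt as one Gemma-style chat turn, ending
    at the point where the model's reply would begin."""
    return (
        "<start_of_turn>user\n"
        f"{prompt}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )
```

Hidden states are then captured at the final position of the wrapped prompt, where the model is "deciding" whether to comply or refuse.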
## ⚠️ Disclaimer
This model has had its safety guardrails mathematically removed. It is highly compliant and will generate whatever it is instructed to generate, including potentially harmful, sensitive, or explicit content. Users are solely responsible for how they deploy and interact with this model. Ensure your use cases align with local laws and ethical guidelines.
Abliteration script based on mlabonne's tutorial: https://huggingface.co/blog/mlabonne/abliteration. The harmless/harmful prompt sets are mlabonne's datasets (harmless_alpaca, harmful_behaviors). Tested and working with the handful of harsh prompts I had lying around (prompts that are typically 100% refused by other models).
Have fun, be safe.