# Qwen3.5-9B-abliterated-v2-MAX
Qwen3.5-9B-abliterated-v2-MAX is an uncensored ("abliterated") variant built on Qwen/Qwen3.5-9B. This version applies a more aggressive abliteration pass, combining refined refusal-direction analysis with additional training to further suppress internal refusal behaviors while preserving reasoning and instruction-following capability. The result is a capable 9B-parameter language model tuned for detailed responses and strong prompt adherence.
This model is intended strictly for research and learning purposes. Due to reduced internal refusal mechanisms, it may generate sensitive or unrestricted content. Users assume full responsibility for how the model is used. The authors and hosting platform disclaim any liability for generated outputs.
## Model Compression
| Format | Description | Link |
|---|---|---|
| GGUF | Quantized GGUF format | https://huggingface.co/prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX/tree/main/GGUF |
| NVFP4 | NVFP4 compressed model | https://huggingface.co/prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX-NVFP4 |
| FP8 | FP8 compressed model | https://huggingface.co/prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX-FP8 |
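As a rough guide to choosing a format, on-disk weight size can be estimated from bytes per parameter. This is a back-of-envelope sketch: the bytes-per-weight figures for the 4-bit formats are assumptions that fold in per-block scale metadata, and real files also include embeddings and header overhead, so actual sizes will differ.

```python
# Approximate weight-file sizes for a 9B-parameter model.
# Bytes-per-parameter values are rough assumptions: 4-bit formats
# store per-block scales, so effective cost exceeds 0.5 bytes/weight.
PARAMS = 9e9

BYTES_PER_PARAM = {
    "BF16 (original)": 2.0,
    "FP8": 1.0,
    "NVFP4 (~4.5 bits/weight)": 0.5625,
    "GGUF 4-bit quant (~4.85 bits/weight)": 0.60625,
}

for fmt, bpp in BYTES_PER_PARAM.items():
    gib = PARAMS * bpp / 1024**3
    print(f"{fmt:38s} ~{gib:5.1f} GiB")
```

By this estimate, FP8 roughly halves the download versus BF16, and the 4-bit formats roughly halve it again.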
## Key Highlights
- Optimized Abliteration Rate (v2): Improved suppression of refusal directions with better balance between openness and coherence.
- Advanced Refusal Direction Analysis: Identifies and mitigates refusal-related activations within the model’s latent space.
- Abliterated v2 Training Strategy: Further reduces refusal patterns while maintaining response quality and stability.
- 9B Parameter Architecture: Built on Qwen3.5-9B, offering strong reasoning while remaining efficient for modern GPUs.
- Enhanced Instruction Adherence: Better handling of complex and nuanced prompts with minimal unnecessary refusals.
- Efficient Deployment: Suitable for local inference, experimentation, and research workflows.
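The "efficient for modern GPUs" claim can be sanity-checked with a quick VRAM estimate covering weights plus KV cache. The layer and head counts below are hypothetical placeholders, not the actual Qwen3.5-9B architecture (substitute the values from the model's `config.json`), and the sketch ignores activation memory:

```python
# Rough VRAM estimate: weights + KV cache only.
# Config values are HYPOTHETICAL placeholders for illustration;
# read the real numbers from the model's config.json.
PARAMS = 9e9
N_LAYERS = 36      # assumed
N_KV_HEADS = 8     # assumed (grouped-query attention)
HEAD_DIM = 128     # assumed

def vram_gib(bytes_per_weight: float, seq_len: int, kv_bytes: int = 2) -> float:
    weights = PARAMS * bytes_per_weight
    # KV cache: 2 (K and V) * layers * kv_heads * head_dim * seq_len * dtype size
    kv_cache = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * seq_len * kv_bytes
    return (weights + kv_cache) / 1024**3

print(f"BF16 weights, 8k context: ~{vram_gib(2.0, 8192):.1f} GiB")
print(f"FP8 weights, 8k context:  ~{vram_gib(1.0, 8192):.1f} GiB")
```

Under these assumptions, BF16 inference needs a ~24 GB card, while the FP8 build fits comfortably on a 12-16 GB consumer GPU.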
## Quick Start with Transformers
```bash
pip install transformers==5.4.0
# or install from source
pip install git+https://github.com/huggingface/transformers.git
```
```python
from transformers import Qwen3_5ForConditionalGeneration, AutoProcessor
import torch

# Load the model with automatic dtype selection and device placement
model = Qwen3_5ForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX",
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX"
)

# Build a chat-formatted prompt from a messages list
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Explain how transformer models work in simple terms."}
        ],
    }
]
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(
    text=[text],
    padding=True,
    return_tensors="pt",
).to(model.device)

# Generate, then strip the prompt tokens from each output sequence
generated_ids = model.generate(**inputs, max_new_tokens=256)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)
print(output_text)
```
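The prompt-stripping list comprehension above works because `generate` returns the prompt tokens followed by the newly generated tokens; slicing off the prompt length per sequence leaves only the completion. A minimal illustration with toy token IDs:

```python
# generate() output = prompt IDs + new IDs, so slicing off
# len(prompt) per sequence keeps only the completion.
prompt_ids = [[101, 7592, 102]]                   # toy input IDs
full_ids = [[101, 7592, 102, 2054, 2003, 103]]    # toy generate() output

trimmed = [out[len(inp):] for inp, out in zip(prompt_ids, full_ids)]
print(trimmed)  # [[2054, 2003, 103]]
```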
## Intended Use
- Alignment & Refusal Research: Studying effects of aggressive abliteration and reduced refusal behavior.
- Red-Teaming Experiments: Testing robustness across adversarial or edge-case prompts.
- Local AI Deployment: Running high-capability models on consumer or high-end GPUs.
- Research Prototyping: Exploring transformer behavior under modified alignment constraints.
## Limitations & Risks
**Important Note:** This model intentionally minimizes built-in safety refusals.
- High Risk of Sensitive Outputs: May generate unrestricted, controversial, or explicit content.
- User Responsibility: Must be used within ethical, legal, and responsible boundaries.
- Abliteration Trade-offs: Increased openness may occasionally reduce safety alignment or consistency.
- Model Size Constraints: Despite improvements, a 9B model still has limits compared to larger frontier models.