# Qwen3.5-9B-abliterated-v2-MAX
Qwen3.5-9B-abliterated-v2-MAX is an uncensored ("abliterated") variant built on Qwen/Qwen3.5-9B. This version applies a more aggressive abliteration pass, combining refined refusal-direction analysis with additional training to further suppress internal refusal behaviors while preserving reasoning and instruction-following capability. The result is a capable 9B-parameter language model tuned for detailed responses and strong prompt adherence.
This model is intended strictly for research and learning purposes. Due to reduced internal refusal mechanisms, it may generate sensitive or unrestricted content. Users assume full responsibility for how the model is used. The authors and hosting platform disclaim any liability for generated outputs.
## Model Compression
| Format | Description | Link |
|---|---|---|
| GGUF | Quantized GGUF format | https://huggingface.co/prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX/tree/main/GGUF |
| NVFP4 | NVFP4 compressed model | https://huggingface.co/prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX-NVFP4 |
| FP8 | FP8 compressed model | https://huggingface.co/prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX-FP8 |
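As a rough guide to choosing a format, on-disk weight size can be estimated from bytes per parameter. This is a back-of-envelope sketch: the bytes-per-weight figures for the 4-bit formats are assumptions that fold in per-block scale metadata, and real files also include embeddings and header overhead, so actual sizes will differ.

```python
# Approximate weight-file sizes for a 9B-parameter model.
# Bytes-per-parameter values are rough assumptions: 4-bit formats
# store per-block scales, so effective cost exceeds 0.5 bytes/weight.
PARAMS = 9e9

BYTES_PER_PARAM = {
    "BF16 (original)": 2.0,
    "FP8": 1.0,
    "NVFP4 (~4.5 bits/weight)": 0.5625,
    "GGUF 4-bit quant (~4.85 bits/weight)": 0.60625,
}

for fmt, bpp in BYTES_PER_PARAM.items():
    gib = PARAMS * bpp / 1024**3
    print(f"{fmt:38s} ~{gib:5.1f} GiB")
```

By this estimate, FP8 roughly halves the download versus BF16, and the 4-bit formats roughly halve it again.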
## Key Highlights
- Optimized Abliteration Rate (v2): Improved suppression of refusal directions with better balance between openness and coherence.
- Advanced Refusal Direction Analysis: Identifies and mitigates refusal-related activations within the model’s latent space.
- Abliterated v2 Training Strategy: Further reduces refusal patterns while maintaining response quality and stability.
- 9B Parameter Architecture: Built on Qwen3.5-9B, offering strong reasoning while remaining efficient for modern GPUs.
- Enhanced Instruction Adherence: Better handling of complex and nuanced prompts with minimal unnecessary refusals.
- Efficient Deployment: Suitable for local inference, experimentation, and research workflows.
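The "efficient for modern GPUs" claim can be sanity-checked with a quick VRAM estimate covering weights plus KV cache. The layer and head counts below are hypothetical placeholders, not the actual Qwen3.5-9B architecture (substitute the values from the model's `config.json`), and the sketch ignores activation memory:

```python
# Rough VRAM estimate: weights + KV cache only.
# Config values are HYPOTHETICAL placeholders for illustration;
# read the real numbers from the model's config.json.
PARAMS = 9e9
N_LAYERS = 36      # assumed
N_KV_HEADS = 8     # assumed (grouped-query attention)
HEAD_DIM = 128     # assumed

def vram_gib(bytes_per_weight: float, seq_len: int, kv_bytes: int = 2) -> float:
    weights = PARAMS * bytes_per_weight
    # KV cache: 2 (K and V) * layers * kv_heads * head_dim * seq_len * dtype size
    kv_cache = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * seq_len * kv_bytes
    return (weights + kv_cache) / 1024**3

print(f"BF16 weights, 8k context: ~{vram_gib(2.0, 8192):.1f} GiB")
print(f"FP8 weights, 8k context:  ~{vram_gib(1.0, 8192):.1f} GiB")
```

Under these assumptions, BF16 inference needs a ~24 GB card, while the FP8 build fits comfortably on a 12-16 GB consumer GPU.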
## Quick Start with Transformers
```bash
pip install transformers==5.4.0
# or install from source
pip install git+https://github.com/huggingface/transformers.git
```
```python
from transformers import Qwen3_5ForConditionalGeneration, AutoProcessor
import torch

# Load the model with automatic dtype selection and device placement
model = Qwen3_5ForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX",
    torch_dtype="auto",
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "prithivMLmods/Qwen3.5-9B-abliterated-v2-MAX"
)

# Build a chat-formatted prompt from a messages list
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Explain how transformer models work in simple terms."}
        ],
    }
]
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(
    text=[text],
    padding=True,
    return_tensors="pt",
).to(model.device)

# Generate, then strip the prompt tokens from each output sequence
generated_ids = model.generate(**inputs, max_new_tokens=256)
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False,
)
print(output_text)
```
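The prompt-stripping list comprehension above works because `generate` returns the prompt tokens followed by the newly generated tokens; slicing off the prompt length per sequence leaves only the completion. A minimal illustration with toy token IDs:

```python
# generate() output = prompt IDs + new IDs, so slicing off
# len(prompt) per sequence keeps only the completion.
prompt_ids = [[101, 7592, 102]]                   # toy input IDs
full_ids = [[101, 7592, 102, 2054, 2003, 103]]    # toy generate() output

trimmed = [out[len(inp):] for inp, out in zip(prompt_ids, full_ids)]
print(trimmed)  # [[2054, 2003, 103]]
```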
## Intended Use
- Alignment & Refusal Research: Studying effects of aggressive abliteration and reduced refusal behavior.
- Red-Teaming Experiments: Testing robustness across adversarial or edge-case prompts.
- Local AI Deployment: Running high-capability models on consumer or high-end GPUs.
- Research Prototyping: Exploring transformer behavior under modified alignment constraints.
## Limitations & Risks
**Important Note:** This model intentionally minimizes built-in safety refusals.
- High Risk of Sensitive Outputs: May generate unrestricted, controversial, or explicit content.
- User Responsibility: Must be used within ethical, legal, and responsible boundaries.
- Abliteration Trade-offs: Increased openness may occasionally reduce safety alignment or consistency.
- Model Size Constraints: Despite improvements, a 9B model still has limits compared to larger frontier models.