Qwen3.5 4B GGUF (Quantized)

This repository provides a GGUF quantized version of the original Qwen3.5 4B model, optimized for efficient local inference using tools like llama.cpp, LM Studio, and similar runtimes.


πŸ”— Base Model

This model is derived from the official base model:

πŸ‘‰ https://huggingface.co/Qwen/Qwen3.5-4B (by Alibaba / Qwen Team)

Please refer to the original model for full details, training methodology, benchmarks, and licensing terms.


βš™οΈ Quantization Details

  • Format: GGUF
  • Quantization: Q4_K_XL
  • Size: ~2.9 GB
  • Architecture: Qwen3.5

This version is designed to balance performance and memory efficiency, making it suitable for local deployments.


πŸ“¦ Quantization Source

This GGUF file is sourced from:

πŸ‘‰ https://huggingface.co/unsloth/Qwen3.5-4B-GGUF

Specifically:

  • Qwen3.5-4B-UD-Q4_K_XL.gguf

All credit for quantization goes to the original uploader (Unsloth).


πŸš€ Usage

You can run this model locally using:

llama.cpp

./llama-cli -m qwen3.5-4b-q4_k_xl.gguf -p "Explain SQL injection"

(Older llama.cpp builds ship the equivalent binary as ./main.)

Other tools

  • LM Studio
  • KoboldCpp
  • Ollama
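Ollama can also load a local GGUF file through a Modelfile. A minimal sketch, assuming the GGUF file sits in the current directory and using a hypothetical local tag name:

```
# Modelfile — point Ollama at the local GGUF (file name is an assumption)
FROM ./qwen3.5-4b-q4_k_xl.gguf
```

Then register and run it:

```
ollama create qwen3.5-4b-local -f Modelfile
ollama run qwen3.5-4b-local "Explain SQL injection"
```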

πŸ’‘ Example Use Cases

  • General-purpose chat
  • Coding assistance
  • Technical explanations
  • Integration into custom AI systems (e.g., agents, tools)
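When embedding the model in a custom pipeline, prompts usually need to follow the model's chat template. Qwen-family models typically use the ChatML format; the sketch below assumes that template (verify against the base model's tokenizer config before relying on it), and the helper function name is illustrative:

```python
# Minimal sketch: render a conversation in ChatML, the template
# typically used by Qwen-family models (an assumption here --
# confirm against the base model's tokenizer_config).
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # Leave the assistant turn open so the model generates the reply
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain SQL injection"},
])
print(prompt)
```

The resulting string can be passed as the raw prompt to any GGUF runtime that does not apply a chat template itself.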

πŸ§ͺ Tested With

  • Local inference (CPU/GPU hybrid)
  • Integration with external tools (web search, reasoning pipelines)

⚠️ Disclaimer

  • This is not an original model.
  • Behavior and capabilities are inherited from the base Qwen3.5 model.

πŸ“œ License

  • Please follow the license of the original Qwen model.

πŸ™Œ Acknowledgements

  • Qwen Team (Alibaba) β€” Base model
  • Unsloth β€” GGUF quantization
  • llama.cpp β€” GGUF runtime support

🌐 Related Project

This model is used in:

πŸ‘‰ CyberGuard AI (Cybersecurity assistant system)

  • Hosted on Hugging Face Spaces
