continuum-ai/qwen2.5-coder-7b-compacted

+---
+tags:
+- 7b
+- Chinese
+- English
+- android
+- apple-silicon
+- code
+- compensation-lora
+- continuum
+- distillation
+- edge-inference
+- efficient
+- embedded
+- experiential-plasticity
+- forge-alloy
+- forged
+- general
+- general-purpose
+- head-pruning
+- iphone
+- llama-cpp
+- lm-studio
+- local-inference
+- lora
+- macbook
+- mobile
+- neural-plasticity
+- ollama
+- on-device
+- optimized
+- pruned
+- qwen
+- qwen2.5
+- raspberry-pi
+- sentinel-ai
+- text-generation
+- validation-artifact
+- versatile
+base_model: Qwen/Qwen2.5-Coder-7B
+pipeline_tag: text-generation
+license: apache-2.0
+---
+# 12% Pruned, 61.0 HUMANEVAL (base 62.2)
+**Qwen2.5-Coder-7B** forged through Experiential Plasticity and recovered to within calibration tolerance of the unmodified base via KL-distillation compensation LoRA.
+- **HUMANEVAL**: 61.0 (base 62.2, Δ -1.2)
+- **HUMANEVAL+**: 53.0 (base 53.7, Δ -0.7)
+<p align="center">
+<a href="https://cambriantech.github.io/forge-alloy/verify/#011df8798a9e0b1d">
+<img src="alloy-qr.png" alt="Verify Chain of Custody" width="160"/>
+</a>
+</p>
+<p align="center">
+<a href="https://cambriantech.github.io/forge-alloy/verify/#011df8798a9e0b1d"><b>Every claim on this card is verified</b></a><br>
+<b>Trust: self-attested</b> · 2 benchmarks · 1 device tested<br>
+<a href="https://github.com/CambrianTech/forge-alloy">ForgeAlloy</a> chain of custody · <a href="v2-7b-coder-compensated.alloy.json">Download alloy</a> · Merkle-chained
+</p>
+---
+## Benchmarks
+| Benchmark | Score | Base | Δ | Verified |
+|---|---|---|---|---|
+| **humaneval** | **61.0** | 62.2 | -1.2 | Self-reported |
+| **humaneval_plus** | **53.0** | 53.7 | -0.7 | Self-reported |
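The deltas in the table are plain differences between the forged and base scores. A quick check of that arithmetic follows; it also derives the share of the base score retained, a figure that is not stated on this card and is computed here only for illustration:

```python
# Scores copied from the Benchmarks table: (forged, base).
scores = {
    "humaneval": (61.0, 62.2),
    "humaneval_plus": (53.0, 53.7),
}

def summarize(forged: float, base: float) -> tuple[float, float]:
    """Return (absolute delta in points, percent of the base score retained)."""
    delta = round(forged - base, 1)
    retained = round(100.0 * forged / base, 1)
    return delta, retained

summary = {name: summarize(forged, base) for name, (forged, base) in scores.items()}
for name, (delta, retained) in summary.items():
    print(f"{name}: delta {delta:+.1f} points, {retained:.1f}% of base retained")
```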
+## What Changed (Base → Forged)
+| | Base | Forged | Delta |
+|---|---|---|---|
+| **Pruning** | None | 12% heads (activation-magnitude) | **-12%** attention heads ✅ |
+| **LoRA** | None | 2 passes (rank not reported) | |
+| **Pipeline** | | prune → lora → lora → eval | 1 cycle |
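This card describes the recovery step as a KL-distillation compensation LoRA but does not give the loss. As a minimal sketch, assuming the common formulation in which the pruned model (student) is trained to match the next-token distribution of the unmodified base (teacher), the per-token objective is KL(teacher || student). The function names and toy logits below are illustrative and are not taken from the forge tool:

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]

def kl_teacher_student(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) for one token position, in nats."""
    t_logp = log_softmax(teacher_logits, temperature)
    s_logp = log_softmax(student_logits, temperature)
    return sum(math.exp(tl) * (tl - sl) for tl, sl in zip(t_logp, s_logp))

teacher = [2.0, 0.5, -1.0]  # toy logits standing in for the unmodified base
student = [1.5, 0.7, -0.8]  # toy logits standing in for the pruned model
loss = kl_teacher_student(teacher, student)
print(f"KL(teacher || student) = {loss:.4f} nats")
```

In a training loop, a loss of this kind would typically be averaged over token positions, with gradients flowing only into the LoRA adapter weights.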
+## Runs On
+| Device | Format | Size | Status |
+|--------|--------|------|-------|
+| **NVIDIA GeForce RTX 5090** | fp16 | — | Verified |
+| MacBook Pro 32GB | fp16 | 8.0GB | Expected |
+| MacBook Air 16GB | Q8_0 | ~4.0GB | Expected |
+| MacBook Air 8GB | Q4_K_M | ~2.5GB | Expected |
+| iPhone / Android | Q4_K_M | ~2.5GB | Expected |
+## Quick Start
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Repo id as published on this card.
+model_id = "continuum-ai/v2-7b-coder-compensated"
+# torch_dtype="auto" uses the checkpoint's dtype; device_map="auto" needs the accelerate package.
+model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+# Prompt with a code prefix and let the model complete it.
+inputs = tokenizer("def merge_sort(arr):", return_tensors="pt").to(model.device)
+output = model.generate(**inputs, max_new_tokens=200)
+print(tokenizer.decode(output[0], skip_special_tokens=True))
+```
+## How It Was Made
+```
+prune → lora → lora → eval (1 cycle)
+```
+- **Pruning**: 12% heads via activation-magnitude
+- **LoRA** (pass 1): rank not reported
+- **LoRA** (pass 2): rank not reported
+- **Hardware**: NVIDIA GeForce RTX 5090
+- **Forge tool**: [Continuum](https://github.com/CambrianTech/continuum) Factory + [sentinel-ai](https://github.com/CambrianTech/sentinel-ai)
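How the forge tool scores heads is not documented on this card. The following is a minimal sketch of one way to pick 12% of heads by activation magnitude, assuming each head already has a single importance score (for example, the mean absolute value of its output over a calibration set). The layer count, head count, and scores are made up for illustration:

```python
import random

def select_heads_to_prune(scores, fraction):
    """Return the (layer, head) pairs with the lowest scores.

    scores: dict mapping (layer, head) -> activation-magnitude score.
    fraction: share of all heads to prune, e.g. 0.12 for 12%.
    """
    n_prune = int(len(scores) * fraction)
    ranked = sorted(scores, key=lambda key: scores[key])  # ascending by score
    return set(ranked[:n_prune])

# Hypothetical toy model: 4 layers x 8 heads with random scores.
rng = random.Random(0)
toy_scores = {(layer, head): rng.random() for layer in range(4) for head in range(8)}

pruned = select_heads_to_prune(toy_scores, 0.12)
print(f"pruning {len(pruned)} of {len(toy_scores)} heads: {sorted(pruned)}")
```

A global ranking like this can remove heads unevenly across layers; ranking within each layer is an alternative when a uniform per-layer budget is wanted.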
+## Chain of Custody
+Scan the QR or [verify online](https://cambriantech.github.io/forge-alloy/verify/#011df8798a9e0b1d). Download the [alloy file](v2-7b-coder-compensated.alloy.json) to verify independently.
+| What | Proof |
+|------|-------|
+| Forged on | NVIDIA GeForce RTX 5090 (other details not recorded) |
+| Published | [huggingface](https://huggingface.co/continuum-ai/v2-7b-coder-compensated) — 2026-04-08T04:41:28.366728+00:00 |
+| Trust level | [`self-attested`](https://github.com/CambrianTech/forge-alloy/blob/main/docs/ATTESTATION.md) |
+| Spec | [ForgeAlloy](https://github.com/CambrianTech/forge-alloy) — Rust/Python/TypeScript |
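The alloy schema is defined by the ForgeAlloy spec and is not reproduced on this card. To illustrate the general idea behind a hash-chained record of pipeline stages, here is a minimal sketch in which each stage's hash covers its own payload plus the previous stage's hash. The field names and stage records are hypothetical and do not reflect the actual alloy format:

```python
import hashlib
import json

def stage_hash(payload: dict, prev_hash: str) -> str:
    """SHA-256 over the previous hash and a canonical JSON dump of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def build_chain(payloads):
    """Link stage payloads into a list of records, each carrying its hash."""
    chain, prev = [], ""
    for payload in payloads:
        prev = stage_hash(payload, prev)
        chain.append({"payload": payload, "hash": prev})
    return chain

def verify_chain(chain) -> bool:
    """Recompute every hash in order; any edited payload breaks the chain."""
    prev = ""
    for record in chain:
        if stage_hash(record["payload"], prev) != record["hash"]:
            return False
        prev = record["hash"]
    return True

# Hypothetical stage records mirroring the pipeline on this card.
stages = [
    {"stage": "prune", "fraction": 0.12},
    {"stage": "lora"},
    {"stage": "lora"},
    {"stage": "eval", "humaneval": 61.0},
]
chain = build_chain(stages)
print("chain valid:", verify_chain(chain))
```

Because every hash depends on the one before it, changing an early stage invalidates all later records, which is what makes the final hash usable as a single fingerprint for the whole pipeline.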
+## Make Your Own
+Forged with [Continuum](https://github.com/CambrianTech/continuum) — a distributed AI world that runs on your hardware.
+<p align="center">
+<a href="https://github.com/CambrianTech/continuum"><img src="https://raw.githubusercontent.com/CambrianTech/continuum/main/docs/images/factory.png" alt="Continuum Model Factory" width="400"/></a>
+</p>
+The Factory configurator lets you design and forge custom models visually: context extension, pruning, LoRA, quantization, and vision/audio modalities. Pick your target devices, and the system figures out what fits.
+[GitHub](https://github.com/CambrianTech/continuum) · [All Models](https://huggingface.co/continuum-ai) · [Forge-Alloy](https://github.com/CambrianTech/forge-alloy)
+## License
+apache-2.0