EnricoFermi committed
Commit abe8eb5 · verified · 1 Parent(s): 3ff567f

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +144 -0
README.md ADDED
@@ -0,0 +1,144 @@
+ ---
+ tags:
+ - 7b
+ - Chinese
+ - English
+ - android
+ - apple-silicon
+ - code
+ - compensation-lora
+ - continuum
+ - distillation
+ - edge-inference
+ - efficient
+ - embedded
+ - experiential-plasticity
+ - forge-alloy
+ - forged
+ - general
+ - general-purpose
+ - head-pruning
+ - iphone
+ - llama-cpp
+ - lm-studio
+ - local-inference
+ - lora
+ - macbook
+ - mobile
+ - neural-plasticity
+ - ollama
+ - on-device
+ - optimized
+ - pruned
+ - qwen
+ - qwen2.5
+ - raspberry-pi
+ - sentinel-ai
+ - text-generation
+ - validation-artifact
+ - versatile
+ base_model: Qwen/Qwen2.5-Coder-7B
+ pipeline_tag: text-generation
+ license: apache-2.0
+ ---
+
+ # 12% Pruned, 61.0 HUMANEVAL (base 62.2)
+
+ **Qwen2.5-Coder-7B** forged through Experiential Plasticity and recovered to within calibration tolerance of the unmodified base via KL-distillation compensation LoRA.
+
+ - **HUMANEVAL**: 61.0 (base 62.2, Δ -1.2)
+ - **HUMANEVAL+**: 53.0 (base 53.7, Δ -0.7)
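
As a rough illustration of what a KL-distillation objective measures, the sketch below compares a teacher's next-token distribution (standing in for the unmodified base) with a student's (standing in for the pruned model). It assumes the loss is KL(teacher || student) over softmaxed logits; the card does not state the direction, the temperature, or the exact loss used, and all function names and the toy logits are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Assumed objective: KL between teacher and student token distributions."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    return kl_divergence(teacher, student)

# Toy logits over a 3-token vocabulary (made up for illustration).
teacher_logits = [2.0, 1.0, 0.1]
same = distillation_loss(teacher_logits, teacher_logits)
drifted = distillation_loss(teacher_logits, [1.0, 2.0, 0.1])
print(same, drifted)
```

The loss is zero when the student reproduces the teacher exactly and grows as the two distributions diverge, which is the quantity a compensation adapter would be trained to reduce under this assumption.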
+
+
+ <p align="center">
+ <a href="https://cambriantech.github.io/forge-alloy/verify/#011df8798a9e0b1d">
+ <img src="alloy-qr.png" alt="Verify Chain of Custody" width="160"/>
+ </a>
+ </p>
+
+ <p align="center">
+ <a href="https://cambriantech.github.io/forge-alloy/verify/#011df8798a9e0b1d"><b>Every claim on this card is verified</b></a><br>
+ <b>Trust: self-attested</b> · 2 benchmarks · 1 device tested<br>
+ <a href="https://github.com/CambrianTech/forge-alloy">ForgeAlloy</a> chain of custody · <a href="v2-7b-coder-compensated.alloy.json">Download alloy</a> · Merkle-chained
+ </p>
+
+ ---
+
+ ## Benchmarks
+
+ | Benchmark | Score | Base | Δ | Verified |
+ |---|---|---|---|---|
+ | **humaneval** | **61.0** | 62.2 | -1.2 | Self-reported |
+ | **humaneval_plus** | **53.0** | 53.7 | -0.7 | Self-reported |
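
The Δ column is the forged score minus the base score. A minimal sketch that recomputes it from the figures in the table above; the dictionary layout and names are illustrative only and are not part of the alloy format.

```python
# Scores copied from the Benchmarks table; the layout is illustrative only.
scores = {
    "humaneval": {"forged": 61.0, "base": 62.2},
    "humaneval_plus": {"forged": 53.0, "base": 53.7},
}

def delta(forged: float, base: float) -> float:
    """Return forged minus base, rounded to one decimal like the table."""
    return round(forged - base, 1)

deltas = {name: delta(s["forged"], s["base"]) for name, s in scores.items()}
for name, d in deltas.items():
    print(f"{name}: {d:+.1f}")
```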
+
+
+ ## What Changed (Base → Forged)
+
+ | | Base | Forged | Delta |
+ |---|---|---|---|
+ | **Pruning** | None | 12% heads (activation-magnitude) | **-12%** params ✅ |
+ | **LoRA** | None | rank=? | |
+ | **Pipeline** | | prune → lora → lora → eval | 1 cycle |
+
+ ## Runs On
+
+ | Device | Format | Size | Status |
+ |--------|--------|------|--------|
+ | **NVIDIA GeForce RTX 5090** | fp16 | — | Verified |
+ | MacBook Pro 32GB | fp16 | 8.0GB | Expected |
+ | MacBook Air 16GB | Q8_0 | ~4.0GB | Expected |
+ | MacBook Air 8GB | Q4_K_M | ~2.5GB | Expected |
+ | iPhone / Android | Q4_K_M | ~2.5GB | Expected |
+
+ ## Quick Start
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained("continuum-ai/v2-7b-coder-compensated",
+     torch_dtype="auto", device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained("continuum-ai/v2-7b-coder-compensated")
+
+ inputs = tokenizer("def merge_sort(arr):", return_tensors="pt").to(model.device)
+ output = model.generate(**inputs, max_new_tokens=200)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
+
+ ## How It Was Made
+
+ ```
+ prune → lora → lora → eval (1 cycle)
+ ```
+
+ - **Pruning**: 12% of heads via activation magnitude
+ - **LoRA**: rank ?
+ - **LoRA**: rank ?
+ - **Hardware**: NVIDIA GeForce RTX 5090
+ - **Forge tool**: [Continuum](https://github.com/CambrianTech/continuum) Factory + [sentinel-ai](https://github.com/CambrianTech/sentinel-ai)
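
To make the pruning step concrete, here is a minimal sketch of activation-magnitude head selection: score each attention head by its mean absolute activation on some calibration data, then mark the lowest-scoring fraction for removal. This is a generic illustration, not the Continuum or sentinel-ai implementation; the function names, the scoring rule, and the toy activations are all assumptions, and a real implementation would also have to account for the model's attention layout when removing heads.

```python
def mean_abs_activation(activations):
    """Mean |a| over a flat list of one head's activations."""
    return sum(abs(a) for a in activations) / len(activations)

def select_heads_to_prune(head_scores, fraction):
    """Return sorted indices of the lowest-scoring heads.

    head_scores: one importance score per head (assumed here to be the mean
    absolute activation of that head over a calibration set).
    fraction: share of heads to remove, e.g. 0.12 for 12%.
    """
    # Number of heads to drop, rounded to the nearest whole head.
    n_prune = round(len(head_scores) * fraction)
    ranked = sorted(range(len(head_scores)), key=lambda i: head_scores[i])
    return sorted(ranked[:n_prune])

# Toy example: 8 heads with made-up activation samples.
toy_activations = [
    [0.9, -1.1, 0.8], [0.05, -0.02, 0.01], [0.5, 0.4, -0.6], [1.5, -1.4, 1.6],
    [0.3, -0.2, 0.25], [0.7, 0.75, -0.8], [0.02, 0.03, -0.01], [1.0, -0.9, 1.1],
]
head_scores = [mean_abs_activation(a) for a in toy_activations]
pruned = select_heads_to_prune(head_scores, fraction=0.25)
print(pruned)
```

With a quarter of the eight toy heads removed, the two heads with the smallest average magnitude are selected; the same function with `fraction=0.12` would correspond to the 12% setting named above.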
+
+ ## Chain of Custody
+
+ Scan the QR code or [verify online](https://cambriantech.github.io/forge-alloy/verify/#011df8798a9e0b1d). Download the [alloy file](v2-7b-coder-compensated.alloy.json) to verify independently.
+
+ | What | Proof |
+ |------|-------|
+ | Forged on | NVIDIA GeForce RTX 5090, ? |
+ | Published | [huggingface](https://huggingface.co/continuum-ai/v2-7b-coder-compensated) — 2026-04-08T04:41:28.366728+00:00 |
+ | Trust level | [`self-attested`](https://github.com/CambrianTech/forge-alloy/blob/main/docs/ATTESTATION.md) |
+ | Spec | [ForgeAlloy](https://github.com/CambrianTech/forge-alloy) — Rust/Python/TypeScript |
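
For readers unfamiliar with hash-chained records, the sketch below shows the general idea behind verifying such a chain: each entry stores the hash of the previous entry, so changing any earlier entry invalidates everything after it. The entry layout (`payload`, `prev`, `hash`), the hashing rule, and the stage payloads are hypothetical and chosen for illustration; they are not the ForgeAlloy schema, so use the ForgeAlloy tooling linked above for real verification.

```python
import copy
import hashlib
import json

def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()

def build_chain(payloads):
    """Link payloads so each entry commits to the one before it."""
    chain, prev = [], ""
    for payload in payloads:
        h = entry_hash(prev, payload)
        chain.append({"payload": payload, "prev": prev, "hash": h})
        prev = h
    return chain

def verify_chain(chain) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev = ""
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if entry_hash(prev, entry["payload"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

# Hypothetical stage records mirroring the pipeline on this card.
chain = build_chain([
    {"stage": "prune", "fraction": 0.12},
    {"stage": "lora"},
    {"stage": "lora"},
    {"stage": "eval", "humaneval": 61.0},
])

# Tamper with a copy: change the first stage without updating any hashes.
tampered = copy.deepcopy(chain)
tampered[0]["payload"]["fraction"] = 0.5

print(verify_chain(chain), verify_chain(tampered))
```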
+
+ ## Make Your Own
+
+ Forged with [Continuum](https://github.com/CambrianTech/continuum) — a distributed AI world that runs on your hardware.
+
+ <p align="center">
+ <a href="https://github.com/CambrianTech/continuum"><img src="https://raw.githubusercontent.com/CambrianTech/continuum/main/docs/images/factory.png" alt="Continuum Model Factory" width="400"/></a>
+ </p>
+
+ The Factory configurator lets you design and forge custom models visually: context extension, pruning, LoRA, quantization, and vision/audio modalities. Pick your target devices and the system figures out what fits.
+
+ [GitHub](https://github.com/CambrianTech/continuum) · [All Models](https://huggingface.co/continuum-ai) · [Forge-Alloy](https://github.com/CambrianTech/forge-alloy)
+
+ ## License
+
+ apache-2.0