Model weights

#1
by dog3-l0ver - opened

Hello! Will the model weights be provided?

Also, on a side note: what hardware was used for training? The model card says "full parameter fine-tuning", which would be crazy given timteh673 did only 66.8M / 122.1B (0.05%) across 8× NVIDIA H200 SXM5.

Thanks for your question!

We do plan to release the model weights; they will be uploaded shortly.

Regarding training, the model was trained on 8× NVIDIA H100 80GB (HBM3) GPUs.

Thank you for your reply! I do hope this one proves good. Can't wait to run it through my cursed pipeline and squeeze it into my 64GB VRAM haha.

Also, sorry if I came off as pushy in my initial message. Sometimes HF repos are a mess tbh, and I wanted to make sure I have something fun to look forward to!

Any fun info about this model you would be willing to share right now? Qwen3.5's biggest problem imho is its reasoning, so I'm kinda hyped for this distill!

dog3-l0ver changed discussion status to closed
