AIPlans's Collections

Post Training Versions - Qwen 0.6B

updated 22 days ago

Different versions of Qwen3-0.6B that differ only in the post-training method used. The post-training dataset is HelpSteer2.

  • AIPlans/Qwen3-0.6B-ORPO

    Text Generation • Updated Nov 28, 2025 • 9

  • AIPlans/Qwen3-0.6B-DPO_NOTLORA

    Text Generation • 0.6B • Updated Nov 25, 2025 • 4

  • AIPlans/Qwen3-0.6B-GRPO_Epoch2

    Text Generation • 0.6B • Updated Dec 18, 2025 • 5

  • AIPlans/Qwen3-0.6B-ReMax

    Reinforcement Learning • 0.6B • Updated Dec 22, 2025 • 9 • 2

  • AIPlans/Qwen3-0.6B-IPO

    Reinforcement Learning • 0.6B • Updated Dec 12, 2025 • 18 • 1

  • AIPlans/Qwen3-0.6B-KTO

    Text Generation • Updated Nov 22, 2025 • 8 • 1

  • AIPlans/Qwen3-0.6B-PPO

    Text Generation • 0.6B • Updated 22 days ago • 188 • 1
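
Since the models above are meant to differ only in post-training method, a natural first step is to run the same prompt through each one. The following is a minimal sketch, not the collection authors' workflow. It assumes each repository holds full weights loadable with the `transformers` auto classes, which the listing does not confirm (some entries may be adapters or need different loading code). `method_of` and `generate` are hypothetical helper names introduced here for illustration.

```python
# Repo IDs copied from the collection listing above.
MODEL_IDS = [
    "AIPlans/Qwen3-0.6B-ORPO",
    "AIPlans/Qwen3-0.6B-DPO_NOTLORA",
    "AIPlans/Qwen3-0.6B-GRPO_Epoch2",
    "AIPlans/Qwen3-0.6B-ReMax",
    "AIPlans/Qwen3-0.6B-IPO",
    "AIPlans/Qwen3-0.6B-KTO",
    "AIPlans/Qwen3-0.6B-PPO",
]


def method_of(repo_id: str) -> str:
    """Return the post-training suffix of a repo ID (hypothetical helper)."""
    return repo_id.split("Qwen3-0.6B-", 1)[1]


def generate(repo_id: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Load one model and return its completion for `prompt`.

    Assumption: the repo contains a full causal-LM checkpoint. If it does
    not, from_pretrained will raise and different loading code is needed.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

To compare variants, one could loop over `MODEL_IDS`, call `generate(repo_id, prompt)` with a fixed prompt, and label each output with `method_of(repo_id)`. Each call downloads the checkpoint on first use.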