xiang huang

xianghuang

AI & ML interests

None yet

Recent Activity

liked a Space 1 day ago

HuggingFaceTB/smol-training-playbook

upvoted a paper 9 days ago

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

upvoted a paper 9 days ago

Experiential Reinforcement Learning

Organizations

None yet

liked a Space 1 day ago

The Smol Training Playbook

📚

3.11k

The secrets to building world-class LLMs

upvoted 3 papers 9 days ago

upvoted a paper about 1 month ago

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 151

liked a model 2 months ago

zai-org/GLM-5

Text Generation • 754B • Updated 13 days ago • 491k • 2.07k

liked 4 datasets 2 months ago

HuggingFaceFW/fineweb-2

Viewer • Updated Oct 27, 2025 • 4.48B • 93.6k • 784

allenai/dolma

Updated Apr 17, 2024 • 2.89k • 1.02k

HuggingFaceFW/fineweb-edu

Viewer • Updated Jul 11, 2025 • 3.5B • 367k • 1.03k

HuggingFaceFW/fineweb

Viewer • Updated Jul 11, 2025 • 52.5B • 631k • 2.76k

liked a dataset 3 months ago

svjack/Genshin-Impact-Plot-Summary

Viewer • Updated Mar 14, 2025 • 1.11k • 12 • 2

liked a dataset 4 months ago

MichiganNLP/Chumor

Viewer • Updated Dec 24, 2024 • 3.34k • 30 • 8

upvoted a paper 4 months ago

RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents

Paper • 2507.03112 • Published Jul 3, 2025 • 34

published a dataset 5 months ago

xianghuang/Sysbank

Updated Nov 17, 2025 • 5

upvoted 6 papers 7 months ago

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

Paper • 2505.23923 • Published May 29, 2025 • 8

Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning

Paper • 2505.17813 • Published May 23, 2025 • 58

AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

Paper • 2505.11896 • Published May 17, 2025 • 58

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Paper • 2504.21776 • Published Apr 30, 2025 • 59

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20, 2025 • 62

ZeroSearch: Incentivize the Search Capability of LLMs without Searching

Paper • 2505.04588 • Published May 7, 2025 • 65
