Lucas Chen

leocnj

AI & ML interests

NLP, speech, multimodal, deep learning

Recent Activity

upvoted an article 22 days ago

SyGra: The One-Stop Framework for Building Data for LLMs and SLMs

upvoted an article 22 days ago

Mixture of Experts (MoEs) in Transformers

upvoted an article 22 days ago

TRL v1.0: Post-Training Library Built to Move with the Field

Organizations

None yet

upvoted 4 articles 22 days ago

Article

SyGra: The One-Stop Framework for Building Data for LLMs and SLMs

Sep 22, 2025

Article

Mixture of Experts (MoEs) in Transformers

Feb 26 • 156

Article

TRL v1.0: Post-Training Library Built to Move with the Field

Mar 31

Article

Multimodal Embedding & Reranker Models with Sentence Transformers

23 days ago

upvoted an article about 2 months ago

Article

Efficient LLM Pretraining: Packed Sequences and Masked Attention

Oct 7, 2024

liked a model 2 months ago

google/functiongemma-270m-it

Text Generation • Updated Jan 14 • 39.4k • 978

upvoted an article 6 months ago

Article

You could have designed state of the art positional encoding

Nov 25, 2024 • 475

liked a Space 6 months ago

The Smol Training Playbook

📚

3.14k

The secrets to building world-class LLMs

liked a Space 7 months ago

The Ultra-Scale Playbook

🌌

3.82k

The ultimate guide to training LLM on large GPU Clusters

liked a model 8 months ago

deepseek-ai/DeepSeek-V3-0324

Text Generation • 685B • Updated Mar 27, 2025 • 672k • 3.11k

liked 2 datasets 8 months ago

NousResearch/XLAM-Atropos

Viewer • Updated Apr 29, 2025 • 60k • 79 • 7

alibaba-pai/DistilQwen_100k

Viewer • Updated May 24, 2025 • 100k • 36 • 5

liked 3 datasets 9 months ago

liked a Space 9 months ago

FLUX.1 Krea Dev

📚

370

Generate images from text prompts

liked 4 datasets 9 months ago

interstellarninja/hermes_reasoning_tool_use

Viewer • Updated Dec 26, 2025 • 51k • 1.05k • 164

nvidia/Llama-Nemotron-Post-Training-Dataset

Viewer • Updated May 8, 2025 • 3.91M • 3.02k • 659

MathAndMagic/function-calling

Viewer • Updated Feb 2, 2024 • 86.9k • 79 • 9

BluebrainAI/Function_calling_SFT

Viewer • Updated Mar 24, 2025 • 348k • 52 • 3
