- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models (Paper • 2404.02258 • Published • 107 upvotes)
- Jamba: A Hybrid Transformer-Mamba Language Model (Paper • 2403.19887 • Published • 112 upvotes)
- EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba (Paper • 2403.09977 • Published • 10 upvotes)
- SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series (Paper • 2403.15360 • Published • 13 upvotes)
Ceshine Lee (ceshine)
AI & ML interests: None yet
Recent Activity
- Upvoted an article about 9 hours ago: MAD GRPO: Treating Dr. GRPO that tried to fix GRPO but brought instability and verbosity bias
- Liked a Space 27 days ago: victor/dlss-5-anything
- Liked a model 5 months ago: Photoroom/prx-1024-t2i-beta