🚀 Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference

Paper: Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference • 2604.07394

Models:
QQTang1223/full_streaming_Llama-3.1-8B-Instruct • Text Generation • 8B
QQTang1223/full_xattn_Qwen3-8B • Text Generation • 8B
QQTang1223/full_xattn_Qwen3-4B • Text Generation • 4B