BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding Paper • 2503.21483 • Published Mar 27, 2025 • 1
CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents Paper • 2407.01511 • Published Jul 1, 2024 • 1