20250903 - a ShiqiangWoo Collection

updated Sep 4, 2025

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 238
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2, 2025 • 84
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion

Paper • 2509.01215 • Published Sep 1, 2025 • 51
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31, 2025 • 85
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2, 2025 • 127
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1, 2025 • 80
Baichuan-M2: Scaling Medical Capability with Large Verifier System

Paper • 2509.02208 • Published Sep 2, 2025 • 43
Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR

Paper • 2509.02522 • Published Sep 2, 2025 • 25
Kwai Keye-VL 1.5 Technical Report

Paper • 2509.01563 • Published Sep 1, 2025 • 38
Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

Paper • 2509.01363 • Published Sep 1, 2025 • 61
Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2, 2025 • 25
GenCompositor: Generative Video Compositing with Diffusion Transformer

Paper • 2509.02460 • Published Sep 2, 2025 • 26
OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning

Paper • 2509.01644 • Published Sep 1, 2025 • 34
Attributes as Textual Genes: Leveraging LLMs as Genetic Algorithm Simulators for Conditional Synthetic Data Generation

Paper • 2509.02040 • Published Sep 2, 2025 • 15
M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision

Paper • 2509.01360 • Published Sep 1, 2025 • 12
FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games

Paper • 2509.01052 • Published Sep 1, 2025 • 22
Universal Deep Research: Bring Your Own Model and Strategy

Paper • 2509.00244 • Published Aug 29, 2025 • 14
Discrete Noise Inversion for Next-scale Autoregressive Text-based Image Editing

Paper • 2509.01984 • Published Sep 2, 2025 • 7
Fantastic Pretraining Optimizers and Where to Find Them

Paper • 2509.02046 • Published Sep 2, 2025 • 14
MedDINOv3: How to adapt vision foundation models for medical image segmentation?

Paper • 2509.02379 • Published Sep 2, 2025 • 2
Improving Large Vision and Language Models by Learning from a Panel of Peers

Paper • 2509.01610 • Published Sep 1, 2025 • 3
Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

Paper • 2509.01250 • Published Sep 1, 2025 • 2
SQL-of-Thought: Multi-agentic Text-to-SQL with Guided Error Correction

Paper • 2509.00581 • Published Aug 30, 2025 • 11
C-DiffDet+: Fusing Global Scene Context with Generative Denoising for High-Fidelity Object Detection

Paper • 2509.00578 • Published Aug 30, 2025 • 2
Metis: Training Large Language Models with Advanced Low-Bit Quantization

Paper • 2509.00404 • Published Aug 30, 2025 • 7
FastFit: Accelerating Multi-Reference Virtual Try-On via Cacheable Diffusion Models

Paper • 2508.20586 • Published Aug 28, 2025 • 4