-
Attention Is All You Need
Paper • 1706.03762 • Published • 120 -
Scaling Laws for Neural Language Models
Paper • 2001.08361 • Published • 10 -
Training Compute-Optimal Large Language Models
Paper • 2203.15556 • Published • 11 -
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
Paper • 2210.04186 • Published
Collections
Discover the best community collections!
Collections including paper arxiv:2505.03335
-
Dr. Zero: Self-Evolving Search Agents without Training Data
Paper • 2601.07055 • Published • 22 -
Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models
Paper • 2503.04813 • Published • 2 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 191
-
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 191 -
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
Paper • 2502.06781 • Published • 58 -
internlm/OREAL-7B
Text Generation • 8B • Updated • 19 • • 20 -
internlm/OREAL-32B
Text Generation • Updated • 53 • 24
-
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 191 -
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
Paper • 2408.06195 • Published • 73 -
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
Paper • 2508.14029 • Published • 119 -
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play
Paper • 2509.25541 • Published • 141
-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 24 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 153 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
-
Neural Machine Translation by Jointly Learning to Align and Translate
Paper • 1409.0473 • Published • 7 -
Attention Is All You Need
Paper • 1706.03762 • Published • 120 -
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 26 -
Hierarchical Reasoning Model
Paper • 2506.21734 • Published • 50
-
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper • 2403.10704 • Published • 60 -
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Paper • 2309.00267 • Published • 53 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 191 -
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Paper • 2506.01939 • Published • 190
-
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 513 -
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Paper • 2510.07499 • Published • 49 -
Improving Context Fidelity via Native Retrieval-Augmented Reasoning
Paper • 2509.13683 • Published • 8 -
Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering
Paper • 2509.00798 • Published • 1
-
Attention Is All You Need
Paper • 1706.03762 • Published • 120 -
Scaling Laws for Neural Language Models
Paper • 2001.08361 • Published • 10 -
Training Compute-Optimal Large Language Models
Paper • 2203.15556 • Published • 11 -
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
Paper • 2210.04186 • Published
-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 24 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 153 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
-
Dr. Zero: Self-Evolving Search Agents without Training Data
Paper • 2601.07055 • Published • 22 -
Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models
Paper • 2503.04813 • Published • 2 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 191
-
Neural Machine Translation by Jointly Learning to Align and Translate
Paper • 1409.0473 • Published • 7 -
Attention Is All You Need
Paper • 1706.03762 • Published • 120 -
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 26 -
Hierarchical Reasoning Model
Paper • 2506.21734 • Published • 50
-
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 191 -
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
Paper • 2502.06781 • Published • 58 -
internlm/OREAL-7B
Text Generation • 8B • Updated • 19 • • 20 -
internlm/OREAL-32B
Text Generation • Updated • 53 • 24
-
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper • 2403.10704 • Published • 60 -
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Paper • 2309.00267 • Published • 53 -
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 191 -
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
Paper • 2506.01939 • Published • 190
-
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Paper • 2505.03335 • Published • 191 -
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers
Paper • 2408.06195 • Published • 73 -
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR
Paper • 2508.14029 • Published • 119 -
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play
Paper • 2509.25541 • Published • 141
-
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 513 -
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
Paper • 2510.07499 • Published • 49 -
Improving Context Fidelity via Native Retrieval-Augmented Reasoning
Paper • 2509.13683 • Published • 8 -
Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering
Paper • 2509.00798 • Published • 1