Abstract
Large Language Models (LLMs) have demonstrated remarkable abilities in reasoning and planning. Despite their success in various domains, such as mathematical problem-solving and coding, LLMs face challenges in ensuring reliable and optimal planning due to the inherent myopic nature of autoregressive decoding. This paper revisits LLM reasoning from an optimal control perspective, proposing a novel method, Predictive-Decoding, that leverages Model Predictive Control to enhance planning accuracy. By reweighting LLM distributions based on foresight trajectories, Predictive-Decoding aims to mitigate early errors and promote non-myopic planning. Our experiments show significant improvements across a wide range of tasks in math, coding, and agent-based scenarios. Furthermore, Predictive-Decoding demonstrates computational efficiency, outperforming search baselines while utilizing inference compute more effectively. This study provides insights into optimizing LLM planning capabilities. Code is available at this repo.
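The reweighting idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the bigram table `TOY_LM`, the function names, the `horizon` and `temperature` parameters, and the use of trajectory log-probability as the foresight score are all assumptions made for illustration. Because the toy vocabulary is tiny, exhaustive enumeration of continuations stands in for sampled foresight trajectories.

```python
import math

# Hypothetical toy "language model": next-token probabilities given the last token.
# "A" looks best one step ahead, but every continuation of "A" is low-probability.
TOY_LM = {
    "<s>": {"A": 0.6, "B": 0.4},
    "A": {"x": 0.34, "y": 0.33, "z": 0.33},
    "B": {"w": 0.95, "x": 0.05},
    "w": {"</s>": 1.0},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
    "z": {"</s>": 1.0},
}


def best_foresight_logprob(token, horizon):
    """Best cumulative log-probability over continuations of `token`, up to `horizon` steps.

    Assumption: exhaustive enumeration replaces sampled rollouts in this toy setting.
    """
    if horizon == 0 or token not in TOY_LM:
        return 0.0
    return max(
        math.log(p) + best_foresight_logprob(nxt, horizon - 1)
        for nxt, p in TOY_LM[token].items()
    )


def reweighted_distribution(prev, horizon, temperature=1.0):
    """Reweight the next-token distribution by the score of its best foresight trajectory."""
    scores = {
        cand: (math.log(p) + best_foresight_logprob(cand, horizon)) / temperature
        for cand, p in TOY_LM[prev].items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    unnorm = {cand: math.exp(s - top) for cand, s in scores.items()}
    total = sum(unnorm.values())
    return {cand: u / total for cand, u in unnorm.items()}


def greedy_choice(prev):
    """Myopic baseline: pick the most probable next token, ignoring the future."""
    return max(TOY_LM[prev], key=TOY_LM[prev].get)


def foresight_choice(prev, horizon):
    """Pick the most probable token under the foresight-reweighted distribution."""
    dist = reweighted_distribution(prev, horizon)
    return max(dist, key=dist.get)
```

In this toy table, `greedy_choice("<s>")` selects `"A"`, while `foresight_choice("<s>", 1)` selects `"B"`, because the best one-step continuation of `"B"` is far more probable than any continuation of `"A"`. With `horizon=0` the reweighted distribution reduces to the original next-token distribution.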
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | 13th International Conference on Learning Representations, ICLR 2025 |
| Publisher | International Conference on Learning Representations, ICLR |
| Pages | 65312-65339 |
| Number of pages | 28 |
| ISBN (Electronic) | 9798331320850 |
| Publication status | Published - 2025 |
| Event | 13th International Conference on Learning Representations, ICLR 2025, Singapore, Singapore (24 Apr 2025 → 28 Apr 2025) |
Publication series
| Field | Value |
|---|---|
| Name | 13th International Conference on Learning Representations, ICLR 2025 |
Conference
| Field | Value |
|---|---|
| Conference | 13th International Conference on Learning Representations, ICLR 2025 |
| Country/Territory | Singapore |
| City | Singapore |
| Period | 24 Apr 2025 → 28 Apr 2025 |
Bibliographical note
Publisher Copyright: © 2025 13th International Conference on Learning Representations, ICLR 2025. All rights reserved.
Fingerprint
Dive into the research topics of 'NON-MYOPIC GENERATION OF LANGUAGE MODELS FOR REASONING AND PLANNING'. Together they form a unique fingerprint.