NON-MYOPIC GENERATION OF LANGUAGE MODELS FOR REASONING AND PLANNING

Chang Ma, Haiteng Zhao, Junlei Zhang, Junxian He, Lingpeng Kong

Research output: Chapter in Book/Conference Proceeding/ReportConference Paper published in a bookpeer-review

Abstract

Large Language Models (LLMs) have demonstrated remarkable abilities in reasoning and planning. Despite their success in various domains, such as mathematical problem-solving and coding, LLMs face challenges in ensuring reliable and optimal planning due to the inherent myopic nature of autoregressive decoding. This paper revisits LLM reasoning from an optimal control perspective, proposing a novel method, Predictive-Decoding, that leverages Model Predictive Control to enhance planning accuracy. By reweighting LLM distributions based on foresight trajectories, Predictive-Decoding aims to mitigate early errors and promote non-myopic planning. Our experiments show significant improvements across a wide range of tasks in math, coding, and agent-based scenarios. Furthermore, Predictive-Decoding demonstrates computational efficiency, outperforming search baselines while utilizing inference compute more effectively. This study provides insights into optimizing LLM planning capabilities. Code is available at this repo.

Original languageEnglish
Title of host publication13th International Conference on Learning Representations, ICLR 2025
PublisherInternational Conference on Learning Representations, ICLR
Pages65312-65339
Number of pages28
ISBN (Electronic)9798331320850
Publication statusPublished - 2025
Event13th International Conference on Learning Representations, ICLR 2025 - Singapore, Singapore
Duration: 24 Apr 202528 Apr 2025

Publication series

Name13th International Conference on Learning Representations, ICLR 2025

Conference

Conference13th International Conference on Learning Representations, ICLR 2025
Country/TerritorySingapore
CitySingapore
Period24/04/2528/04/25

Bibliographical note

Publisher Copyright:
© 2025 13th International Conference on Learning Representations, ICLR 2025. All rights reserved.

Fingerprint

Dive into the research topics of 'NON-MYOPIC GENERATION OF LANGUAGE MODELS FOR REASONING AND PLANNING'. Together they form a unique fingerprint.

Cite this