Data-driven discovery of formulas by symbolic regression

Sheng Sun, Runhai Ouyang, Bochao Zhang, Tong Yi Zhang

Research output: Contribution to journal › Journal Article › peer-review

Abstract

Discovering knowledge from data is a quantum jump from quantity to quality, one that embodies the character and spirit of scientific development. Symbolic regression (SR) is playing an increasing role in the discovery of knowledge from data, especially in this era of exponential data growth, because SR can discover mathematical formulas from data. These formulas may provide scientifically meaningful models, especially when combined with domain knowledge. This article provides an overview of SR applications in the field of materials science and engineering. Integrating domain knowledge with SR is a crucial approach that allows knowledge to be gained from data quickly, accurately, and scientifically. In the data-driven paradigm, SR helps uncover the underlying mechanisms of materials behavior, properties, and functions across a wide range of areas, from basic academic research to industrial applications, in both experiments and computations. It does so by providing explicit, interpretable models from data, in contrast to black-box machine-learning models. SR will be a powerful tool for rational and automatic materials development.

Original language: English
Pages (from-to): 559-564
Number of pages: 6
Journal: MRS Bulletin
Volume: 44
Issue number: 7
Publication status: Published - 1 Jul 2019
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2019 Materials Research Society.

Keywords

  • Genetic programming
  • Grammar-guided genetic programming
  • Grammatical evolution
  • Mathematical formulas
  • Symbolic regression
