Develop an automated machine learning platform for residential property valuation

  • Lin DENG

Student thesis: Doctoral thesis

Abstract

Residential property valuation aims to predict the market value of residential property accurately, and it is necessary for stakeholders in the real estate industry. Traditional property valuation models mainly use multiple linear regression, which cannot capture the nonlinear relationships between housing price determinants and property values. Machine learning (ML) models have therefore been proposed for residential property valuation. However, there exist several limitations in using ML for residential property valuation: (1) a comprehensive ensemble learning framework involving bagging, boosting, stacking, and voting is still lacking; (2) the effects of multi-source images on residential property values are not well explored; (3) a widely applicable ML framework for residential property valuation and a comparison of housing determinants in multiple international financial centers still require further investigation; (4) a comprehensive automated machine learning (AutoML) framework integrating domain-specific and domain-agnostic function modules has not been proposed for residential property valuation. To address these limitations, this thesis aims to develop an AutoML platform for residential property valuation by integrating comprehensive ensemble learning frameworks, enabling multi-source unstructured data integration, and supporting property valuation in multiple regions. The AutoML platform for property valuation (AutoML4PV) incorporates domain-specific (i.e., data management, data preparation, and feature engineering) and domain-agnostic (i.e., model generation, model interpretation, and model deployment) functions. A user-friendly web-based Python Shiny application is developed, which allows different types of users to easily build their own ML models from scratch. The system architecture of the AutoML4PV framework is proposed, comprising an infrastructure layer, a database layer, a service layer, and a presentation layer.
Future improvements to the platform could include the integration of spatial analytic algorithms, additional machine learning paradigms, generative artificial intelligence, and large language models.
Date of Award: 2024
Original language: English
Awarding Institution
  • The Hong Kong University of Science and Technology
Supervisor: Xueqing ZHANG (Supervisor) & Zhe WANG (Supervisor)
