Scalable path-sensitive data-dependence analysis in the presence of aliasing

  • Peisen YAO

Student thesis: Master's thesis

Abstract

This thesis addresses the challenge of precise and scalable data-dependence analysis, which tracks the definition-usage information in a program. Such information can be used for detecting a broad category of software properties, such as memory safety (e.g., null dereference), resource usage (e.g., memory leak), and security properties (e.g., the use of tainted data). However, owing to the innate undecidability of program analysis, there is still little progress on path-sensitive data-dependence analysis that can scale to millions of lines of code. We present a path- and context-sensitive data-dependence analysis that significantly improves upon the state-of-the-art. Our approach decomposes a path-sensitive data-dependence analysis into 1) a lightweight but precision-preserving quasi-path-sensitive and context-sensitive alias analysis that constructs guarded data-dependence graphs, and 2) a demand-driven phase where we resolve full path- and context-sensitive data-dependence relations via graph reachability queries. We have implemented the proposed technique as a tool called Falcon. Using a suite of sixteen open-source C/C++ programs, which range in size from 13 KLoC to 7.99 MLoC, we compare our techniques against the state-of-the-art approaches for data-dependence graph construction and path-sensitive data-dependence analysis. The experimental results show that Falcon gracefully scales to large codebase and is useful for bug hunting. Our work has made path-sensitive data-dependence analysis a reasonable choice for applications with millions of lines of code.
Date of Award2019
Original languageEnglish
Awarding Institution
  • The Hong Kong University of Science and Technology

Cite this

'