
Causality-Driven Interpretation and Inspection of Deep Learning Systems

  • Zhenlan JI

Student thesis: Doctoral thesis

Abstract

In recent years, deep learning models have achieved outstanding performance across diverse domains, but their complexity, black-box logic, and non-determinism pose serious challenges for trust and interpretability. On one hand, the enormous number of parameters and intricate architectures make formal correctness guarantees impractical; on the other, conventional testing and interpretation techniques often rely on correlational heuristics that can be misled by spurious relationships inherent in the data. Causality offers a promising paradigm for addressing these issues, providing systematic tools to disentangle confounding relationships and to analyze how deliberate interventions on a system's inputs or models affect its outcomes.

In response, this thesis proposes a series of causality-driven frameworks for interpreting and inspecting deep learning systems across different scenarios and granularity levels. First, a novel causality-based test coverage criterion is introduced: by modeling a neural network as a structural causal graph, the criterion quantifies how well a test suite exercises the inferred causal relations among neurons, capturing interactions that standard coverage metrics overlook. Second, a causal trade-off analysis framework is developed for fairness, accuracy, and robustness: these properties are treated as variables in a causal graph whose relationships are learned via causal discovery; users can then pose counterfactual queries, which are automatically translated into interventions using standard causal inference, to understand how fairness, accuracy, and robustness interact. Third, a causal interpretation pipeline is designed for code generation with large language models: by extracting interpretable features from prompts and generated code, and by systematically rephrasing prompts to introduce controlled variations, the pipeline learns how prompt characteristics causally affect properties of the generated code. These methods are applied in diverse contexts, including DNN testing, fairness evaluation, and prompt engineering, demonstrating that causality yields systematic, explainable insights into deep learning systems. Overall, these contributions advance the foundations of trustworthy AI and software engineering by providing principled, scalable causal tools for interpreting and inspecting deep learning systems.
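To give a rough flavor of the first idea, the sketch below treats each weighted edge of a toy feedforward network as a candidate causal relation and counts an edge as covered when some test input makes its effect observable under a small activation-level intervention. Everything here (the toy network, the perturbation-based surrogate for do(), the sizes and thresholds) is an illustrative assumption, not the criterion actually defined in the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 2-layer network whose weighted edges double as the causal graph:
# each weight W[l][j, i] is a candidate causal relation "neuron i in layer l
# affects neuron j in layer l+1". Sizes and thresholds are illustrative.
W = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]

def relu(x):
    return np.maximum(x, 0.0)

def activations(x):
    """Forward pass, returning the activation vector of every layer."""
    acts = [x]
    for Wl in W:
        acts.append(relu(Wl @ acts[-1]))
    return acts

def exercised_relations(x, delta=0.5, eps=1e-6):
    """Edges (l, i, j) whose causal effect the test input x exposes:
    nudge neuron (l, i)'s activation (a crude stand-in for a do()
    intervention) and keep the edge if neuron (l+1, j) responds."""
    acts = activations(x)
    covered = set()
    for l, Wl in enumerate(W):
        for i in range(Wl.shape[1]):
            intervened = acts[l].copy()
            intervened[i] += delta                  # do(neuron_i += delta)
            out = relu(Wl @ intervened)
            for j in range(Wl.shape[0]):
                if abs(out[j] - acts[l + 1][j]) > eps:
                    covered.add((l, i, j))
    return covered

# Causal coverage of a test suite = fraction of edges whose effect
# at least one test makes observable.
suite = [rng.standard_normal(3) for _ in range(20)]
covered = set().union(*(exercised_relations(x) for x in suite))
total = sum(Wl.size for Wl in W)
print(f"causal coverage: {len(covered)}/{total} = {len(covered)/total:.0%}")
```

A suite that leaves many edges unexercised (for example, because some neurons are never active) would score low under such a criterion even if it saturates plain neuron coverage, which is the kind of gap the abstract's "interactions that standard coverage metrics overlook" points at.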
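The second framework can be pictured with a hand-written structural causal model standing in for the graph that causal discovery would learn from measurements. In the minimal sketch below, the linear equations, coefficients, and variable names (R, F, A for robustness, fairness, accuracy) are all assumed for illustration; a hard intervention do(R = r) is implemented by overriding R's structural equation, which is what "translating a counterfactual query into an intervention" amounts to in the simplest case:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear SCM over three model properties, a stand-in for
# the graph the framework would *discover* from measurements:
#   robustness  R = 0.5 + u_R
#   fairness    F = 0.5 * R + u_F
#   accuracy    A = 0.8 - 0.3 * R - 0.2 * F + u_A
# Topology and coefficients are illustrative assumptions only.

def sample(n, do=None):
    """Draw n samples from the SCM; `do` maps a variable name to a fixed
    value, implementing a hard intervention that cuts incoming edges."""
    do = do or {}
    u_R, u_F, u_A = (0.05 * rng.standard_normal(n) for _ in range(3))
    R = np.full(n, do["R"]) if "R" in do else 0.5 + u_R
    F = np.full(n, do["F"]) if "F" in do else 0.5 * R + u_F
    A = np.full(n, do["A"]) if "A" in do else 0.8 - 0.3 * R - 0.2 * F + u_A
    return {"R": R, "F": F, "A": A}

# Interventional query: how does accuracy respond to enforcing a higher
# robustness level, i.e. E[A | do(R=0.8)] versus E[A | do(R=0.5)]?
base = sample(10_000, do={"R": 0.5})
high = sample(10_000, do={"R": 0.8})
print(f"E[A | do(R=0.5)] = {base['A'].mean():.3f}")
print(f"E[A | do(R=0.8)] = {high['A'].mean():.3f}")
print(f"estimated trade-off dA/dR = "
      f"{(high['A'].mean() - base['A'].mean()) / 0.3:.3f}")
```

Note that the intervention captures both the direct path R → A and the mediated path R → F → A (here totaling about -0.4 per unit of R), which a purely correlational comparison of observed runs could easily misattribute.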

Date of Award: 2025
Original language: English
Awarding Institution:
  • The Hong Kong University of Science and Technology
Supervisor: Shuai WANG
