HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-Modal Context Interaction

Zhengrui Guo, Jiabo Ma, Yingxue Xu, Yihui Wang, Liansheng Wang, Hao Chen*

*Corresponding author for this work

Research output: Chapter in Book/Conference Proceeding/ReportConference Paper published in a bookpeer-review

14 Citations (Scopus)

Abstract

Histopathology serves as the gold standard in cancer diagnosis, with clinical reports being vital in interpreting and understanding this process, guiding cancer treatment and patient care. The automation of histopathology report generation with deep learning stands to significantly enhance clinical efficiency and lessen the labor-intensive, time-consuming burden on pathologists in report writing. In pursuit of this advancement, we introduce HistGen, a multiple instance learning-empowered framework for histopathology report generation together with the first benchmark dataset for evaluation. Inspired by diagnostic and report-writing workflows, HistGen features two delicately designed modules, aiming to boost report generation by aligning whole slide images (WSIs) and diagnostic reports at both local and global granularities. To achieve this, a local-global hierarchical encoder is developed for efficient visual feature aggregation from a region-to-slide perspective. Meanwhile, a cross-modal context module is proposed to explicitly facilitate alignment and interaction between distinct modalities, effectively bridging the gap between the extensive visual sequences of WSIs and corresponding highly summarized reports. Experimental results on WSI report generation show the proposed model outperforms state-of-the-art (SOTA) models by a large margin. Moreover, the results of fine-tuning our model on cancer subtyping and survival analysis tasks further demonstrate superior performance compared to SOTA methods, showcasing strong transfer learning capability. Dataset and code are available here.

Original languageEnglish
Title of host publicationMedical Image Computing and Computer Assisted Intervention – MICCAI 2024 - 27th International Conference, Proceedings
EditorsMarius George Linguraru, Qi Dou, Aasa Feragen, Stamatia Giannarou, Ben Glocker, Karim Lekadir, Julia A. Schnabel
PublisherSpringer Science and Business Media Deutschland GmbH
Pages189-199
Number of pages11
ISBN (Print)9783031720826
DOIs
Publication statusPublished - 2024
Externally publishedYes
Event27th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2024 - Marrakesh, Morocco
Duration: 6 Oct 202410 Oct 2024

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume15004 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference27th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2024
Country/TerritoryMorocco
CityMarrakesh
Period6/10/2410/10/24

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.

Keywords

  • Cross-Modal Alignment
  • Histopathology Report Generation
  • Multiple Instance Learning

Fingerprint

Dive into the research topics of 'HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-Modal Context Interaction'. Together they form a unique fingerprint.

Cite this