A comparative genomic analysis of the ecological characters and evolutionary relationship between Escherichia coli and cryptic Escherichia clades

  • Zhiyong SHEN

Student thesis: Doctoral thesis

Abstract

Escherichia coli is widely used as one of the major fecal indicator bacteria for water quality monitoring because it is adapted to the gastrointestinal tracts of warm-blooded animals (primary habitats) and is considered unable to survive for extended periods in external environments (secondary habitats). However, recent studies have indicated the existence of six monophyletic cryptic Escherichia clades (CI to VI) that are indistinguishable from E. coli sensu stricto by conventional diagnostic methods. CI is a host-associated subspecies of E. coli, and CIII to V are associated with external environments. In contrast, the evolution, habitat, and lifestyle of the members of CII have rarely been investigated. This research used genomic approaches to investigate the evolution and ecology of CII strains and their relationship with other Escherichia, especially E. coli. Preliminary indications of ecological differentiation within CII were also investigated using a combination of delta-bitscore metrics and a random forest classifier. Comparative genomics of 18 CII strains isolated from marine and freshwater environments in Hong Kong and 42 reference strains revealed Escherichia’s genome plasticity, with gene loss predominating as the lineages evolve and differentiate. In general, accelerated loss of genes related to cell adhesion in CII-VI drives their divergence from enteric Escherichia, reflecting an inclination towards an extra-host lifestyle. Moreover, enteric and CII-VI strains have genetic and functional enrichments favoring survival in the gastrointestinal tract and in external environments, respectively. Homologous recombination of core genes was detected not only within CII-VI but also between CII-VI and enteric genomes. CII is monophyletic and, based on genome sequence similarity from DNA-DNA hybridization (DDH) and average nucleotide identity (ANI) analyses, is justifiable as a novel Escherichia species rather than E. coli sensu stricto.
CII strains displayed genomic signatures consistent with divergent adaptation to gastrointestinal and external environments. Overall, gene degradation was more prominent in the gastrointestinal CII strains. The trained random forest model identified predictor genes that were informative of habitat association. Functional divergence in many of these genes reflected ecological divergence.
Date of Award: 2021
Original language: English
Awarding Institution
  • The Hong Kong University of Science and Technology
Supervisor: Stanley Chun Kwan LAU
