Dimensionality reduction in patch-signature based protein structure matching

Zi Huang*, Xiaofang Zhou, Dawei Song, Peter Bruza

*Corresponding author for this work

Research output: Contribution to journal › Conference article published in journal › peer-review

4 Citations (Scopus)

Abstract

Searching bio-chemical structures is becoming an important application domain of information retrieval. This paper introduces a protein structure matching problem and formulates it as an information retrieval problem. We first present a novel vector representation for protein structures, in which a protein structural region, formed by the vectors within the region, is defined as a patch and indexed by its patch signature. For a k-sized patch, its patch signature consists of 7k − 10 inter-atom distances which uniquely determine the patch's spatial structure. A patch matching function is then defined. As structures for proteins are large and complex, it is computationally expensive to identify possible matching patches for a given protein against a large protein database. We propose to apply dimensionality reduction to the patch signatures and show how the two problems are adapted to fit each other. The Locality Preservation Projection (LPP) and Singular Value Decomposition (SVD) are chosen and tested for this purpose. Experimental results show that the dimensionality reduction improves the searching speed while maintaining acceptable precision and recall. From a more general point of view, this paper demonstrates that information retrieval techniques can play a crucial role in solving this biologically critical but computationally expensive problem.
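As a rough illustration of the idea in the abstract, the sketch below reduces patch signatures with a truncated SVD and then searches in the reduced space. It is not the authors' implementation: the signature length is assumed here to be 7k - 10 for a k-sized patch, the signatures are random stand-in data, and the function names (`signature_length`, `fit_svd_projection`, `project`, `nearest`) and the Euclidean nearest-neighbour match are hypothetical choices made only for this example.

```python
import numpy as np


def signature_length(k: int) -> int:
    """Signature length for a k-sized patch (assumed to be 7k - 10)."""
    return 7 * k - 10


def fit_svd_projection(signatures: np.ndarray, n_components: int):
    """Fit a truncated-SVD projection on row-wise signature vectors.

    Returns the column mean and a (d, n_components) orthonormal basis
    built from the leading right-singular vectors of the centred data.
    """
    mean = signatures.mean(axis=0)
    _, _, vt = np.linalg.svd(signatures - mean, full_matrices=False)
    return mean, vt[:n_components].T


def project(signatures: np.ndarray, mean: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Map signatures into the reduced space."""
    return (signatures - mean) @ basis


def nearest(query: np.ndarray, database: np.ndarray) -> int:
    """Index of the database row closest to the query (Euclidean distance).

    A stand-in for the paper's patch matching function, which is not
    specified in the abstract.
    """
    return int(np.argmin(np.linalg.norm(database - query, axis=1)))


rng = np.random.default_rng(0)
k = 5
d = signature_length(k)                    # 25 distances per signature
db = rng.random((200, d))                  # synthetic signatures, not real protein data
mean, basis = fit_svd_projection(db, n_components=8)
db_reduced = project(db, mean, basis)      # 200 x 8 instead of 200 x 25

# Query with a slightly perturbed copy of signature 42 and search in the reduced space.
query = db[42] + 0.001 * rng.standard_normal(d)
hit = nearest(project(query, mean, basis), db_reduced)
```

Each distance computation in the reduced space touches 8 values instead of 25, which is where a speed-up would come from; how much precision and recall are retained depends on the real data and on the number of components kept.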

Original language: English
Pages (from-to): 89-97
Number of pages: 9
Journal: Conferences in Research and Practice in Information Technology Series
Volume: 49
Publication status: Published - 2006
Externally published: Yes
Event: 17th Australasian Database Conference, ADC 2006 - Hobart, TAS, Australia
Duration: 16 Jan 2006 to 19 Jan 2006

Keywords

  • Dimensionality reduction
  • Protein structure matching
  • Similarity measure
