
Accelerating exact similarity search on GPUs: strategies for distance calculation and K-selection

  • Ding TANG

Student thesis: Master's thesis

Abstract

Similarity search is a basic operation in database systems and is widely used in industrial applications to handle complex data such as images and user information, which are commonly represented by numerical feature vectors. This thesis studies how to better utilize GPUs for this task. We decompose similarity search into two phases, distance calculation and k-selection, and for each phase analyze its bottlenecks on GPUs and the available solutions. For each phase, we explore several mainstream solutions and re-implement most of them with efficient code. Additionally, we propose and implement several new optimizations to accelerate similarity search on GPUs: SMML-S and SMML-L, two matrix multiplication kernel designs, and BucketSelect-Opt, a k-selection method. We conduct extensive experiments in different settings to investigate the performance of both the existing methods and our proposed ones. The results show that our proposed methods perform satisfactorily in their target domains. Furthermore, based on these experimental results, we provide guidelines on how to choose the right strategy for a given situation in each phase.
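
The two-phase decomposition described in the abstract can be illustrated with a minimal CPU sketch in plain Python. It is not the thesis's GPU implementation and does not reproduce SMML-S, SMML-L, or BucketSelect-Opt. It assumes squared Euclidean distance (the abstract does not name a metric), and the function names `squared_l2_distances`, `k_select`, and `exact_knn` are hypothetical, chosen only for this illustration. The distance phase uses the expansion ||q - x||^2 = ||q||^2 - 2 q·x + ||x||^2, in which the inner products q·x dominate the cost; this is one plausible reason matrix multiplication kernels matter for that phase.

```python
import heapq


def squared_l2_distances(queries, database):
    """Phase 1 (distance calculation), assuming squared Euclidean distance.

    Uses ||q - x||^2 = ||q||^2 - 2 q.x + ||x||^2, so the bulk of the work
    is the set of inner products q.x (a matrix multiplication in batch form).
    """
    db_norms = [sum(v * v for v in x) for x in database]
    all_rows = []
    for q in queries:
        q_norm = sum(v * v for v in q)
        row = [
            q_norm - 2.0 * sum(a * b for a, b in zip(q, x)) + x_norm
            for x, x_norm in zip(database, db_norms)
        ]
        all_rows.append(row)
    return all_rows


def k_select(dist_row, k):
    """Phase 2 (k-selection): indices of the k smallest distances, nearest first."""
    smallest = heapq.nsmallest(k, ((d, i) for i, d in enumerate(dist_row)))
    return [i for _, i in smallest]


def exact_knn(queries, database, k):
    """Exact (brute-force) k-nearest-neighbour search: phase 1, then phase 2."""
    return [k_select(row, k) for row in squared_l2_distances(queries, database)]


if __name__ == "__main__":
    database = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
    queries = [[1.0, 1.0]]
    print(exact_knn(queries, database, k=2))
```

In this sketch the two phases are separate functions, so either one could be swapped for a different strategy without touching the other, which mirrors the per-phase analysis the abstract describes.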

Date of Award: 2022
Original language: English
Awarding Institution
  • The Hong Kong University of Science and Technology
Supervisor: Kai CHEN
