Accelerating privacy-preserving machine learning with GeniBatch

  • Xinyang HUANG

Student thesis: Master's thesis

Abstract

Cross-silo privacy-preserving machine learning (PPML) uses partially homomorphic encryption (PHE) to let multiple organizations (e.g., medical and financial institutions) securely combine data and train high-quality models. However, PHE introduces significant computation and communication overheads because of the data inflation problem: ciphertexts are much larger than the plaintexts they encrypt. Batch optimization is a promising way to mitigate the problem by packing multiple plaintext values into a single ciphertext. Nonetheless, this method is impractical for many cross-silo PPML applications because of its limited support for vector operations and severe data corruption.

In this thesis, we present GeniBatch, a batch compiler that translates PPML programs using PHE into efficient programs with batch optimization. To achieve this, GeniBatch adopts a set of conversion rules that let PHE programs use all vector operations required in cross-silo PPML while ensuring that end-to-end results are consistent before and after compilation. With a bit-reserving algorithm, GeniBatch avoids bit overflow, which guarantees the correctness of compiled programs while maximizing the compression ratio. We have fully integrated GeniBatch into FATE, an industrial cross-silo PPML framework, and provided SIMD APIs to harness hardware acceleration. Experiments on six popular applications show that GeniBatch achieves up to 22.6× speedup and reduces network traffic by 5.4×-23.8× for general cross-silo PPML applications.
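
The idea of packing values with reserved bits can be illustrated with a minimal sketch. It is not GeniBatch's actual algorithm or API: the function names (`reserve_bits_for`, `pack`, `unpack`) are hypothetical, and plain Python integers stand in for PHE plaintexts. In an additively homomorphic scheme, adding ciphertexts corresponds to adding the packed plaintexts, so carries between slots behave as shown here.

```python
def reserve_bits_for(num_summands: int) -> int:
    """Padding bits needed so that summing `num_summands` values stays in one slot.

    The sum of k values, each below 2**b, is below k * 2**b <= 2**(b + ceil(log2 k)).
    """
    return max(num_summands - 1, 0).bit_length()


def pack(values, value_bits: int, reserve_bits: int) -> int:
    """Pack non-negative integers into one big integer, one slot per value."""
    slot_bits = value_bits + reserve_bits
    packed = 0
    for v in values:
        if not 0 <= v < (1 << value_bits):
            raise ValueError(f"{v} does not fit in {value_bits} bits")
        packed = (packed << slot_bits) | v
    return packed


def unpack(packed: int, count: int, value_bits: int, reserve_bits: int):
    """Recover `count` slot values from a packed integer."""
    slot_bits = value_bits + reserve_bits
    mask = (1 << slot_bits) - 1
    out = []
    for _ in range(count):
        out.append(packed & mask)
        packed >>= slot_bits
    return out[::-1]


a = [200, 17, 255]
b = [100, 250, 255]
value_bits = 8

# With reserved bits: one integer addition performs three slot-wise additions.
r = reserve_bits_for(2)  # two summands per slot
safe_sum = unpack(pack(a, value_bits, r) + pack(b, value_bits, r), 3, value_bits, r)

# Without reserved bits: carries spill into neighbouring slots and corrupt them.
unsafe_sum = unpack(pack(a, value_bits, 0) + pack(b, value_bits, 0), 3, value_bits, 0)

expected = [x + y for x, y in zip(a, b)]
print(safe_sum == expected, unsafe_sum == expected)
```

In this sketch, reserving more bits per slot allows more additions before overflow but leaves room for fewer values per ciphertext, which is the trade-off behind maximizing the compression ratio while keeping results correct.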

Date of Award: 2023
Original language: English
Awarding Institution
  • The Hong Kong University of Science and Technology
Supervisor: Kai CHEN
