Privacy-preserving computation on genomic data, and more generally on medical data, is a critical-path technology for innovative, life-saving research to positively and equally impact the global population. It enables medical research algorithms to be securely deployed in the cloud, because operations on encrypted genomic databases are carried out without revealing any individual genome. This means that there need not be a tradeoff between the utility, privacy and security of sensitive patient data.
Methods for secure computation have shown significant performance improvements over the last several years; however, it is still difficult to apply them to big biomedical data. Track 2 of this year’s iDASH competition focused on an important problem in practical machine learning: a data analyst who has trained a regression model (linear or logistic) on a certain set of features wants to find all features in an encrypted database that would improve the quality of the model. Our solution, a finalist developed and presented by Mariya Georgieva and Nicolas Gama, is based on the hybrid framework Chimera, which allows switching between different families of fully homomorphic schemes, namely TFHE and HEAAN. It is the only bootstrapped solution submitted to the competition that can be applied to different sets of parameters without re-encrypting the genomic database, making it practical for real-world applications.
A solution built with Inpher’s open-source TFHE library was also awarded a prize in Track 3 of iDASH, where the goal was to query for the longest matching DNA segments in large genome databases. The bootstrapping and circuit representation techniques implemented in TFHE allowed the problem to be solved with a dynamic programming approach, and the resulting solution turned out to be the best homomorphic solution for this genomic search.
Congratulations to our team and all of the finalists for this important work!