Intel has partnered with Microsoft as part of a US Defense Advanced Research Projects Agency (DARPA) program that aims to develop hardware and software to drastically improve the performance of fully homomorphic encryption (FHE) computation.
As part of the program, Intel will develop a hardware accelerator that could make machine learning practical with always-encrypted and privacy-preserving data.
If successful, the multi-year effort could allow organisations from industries with strict data confidentiality requirements, such as healthcare, finance, banking or government, to easily share sensitive data with partners and third-party services in public clouds without the risk of exposing it.
"We'll be evaluating the platform on a number of workloads that span across statistical learning and machine learning," Rosario Cammarota, principal engineer at Intel Labs, tells CSO.
"And the performance target will be very ambitious. They [DARPA] are targeting to have the performance of the arithmetic be improved by more than five orders of magnitude compared to the performance of FHE software implementation on current CPUs. It's a big deal, because today we can't train machine learning models on encrypted data, but this platform within these performance targets would be able to do this type of training."
What makes fully homomorphic encryption appealing
Fully homomorphic encryption is a form of cryptography that allows mathematical operations to be performed directly on encrypted data (ciphertext) without first decrypting it.
The result of the computation is also encrypted and can only be accessed by the owner of the data who has the private key. Once decrypted, this result is the same as if the operations had been performed on an unencrypted version of the data (plaintext).
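The idea of computing on ciphertext can be illustrated with a much simpler relative of FHE. The sketch below is a toy version of the Paillier cryptosystem, which is only additively (partially) homomorphic: it supports adding encrypted values and multiplying them by a plaintext constant, but not the arbitrary computation that FHE allows. The primes, function names and key sizes are assumptions chosen for readability, not anything used by Intel, Microsoft or DARPA, and the parameters are far too small to be secure.

```python
import math
import secrets

# Toy parameters: two small primes. Real deployments use primes of
# 1024 bits or more; these values are for illustration only.
P, Q = 1009, 1013
N = P * Q                      # public key
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # private: lcm(p-1, q-1)
MU = pow(LAM, -1, N)           # private: modular inverse (Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt 0 <= m < N with the public key, using generator g = N + 1."""
    while True:
        r = secrets.randbelow(N - 1) + 1   # random blinding value in [1, N-1]
        if math.gcd(r, N) == 1:
            break
    return ((1 + m * N) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Decrypt with the private values LAM and MU."""
    x = pow(c, LAM, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts (mod N)."""
    return (c1 * c2) % N_SQ


def scale_encrypted(c: int, k: int) -> int:
    """Raising a ciphertext to the power k multiplies the plaintext by k."""
    return pow(c, k, N_SQ)


if __name__ == "__main__":
    # The "service" sees only ciphertexts and never holds LAM or MU.
    total = add_encrypted(encrypt(42), encrypt(58))
    print(decrypt(total))
```

The party doing the arithmetic in `add_encrypted` and `scale_encrypted` needs only the public modulus, which mirrors the trust model described here: the data owner keeps the private key and is the only one who can read the result.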
Encryption can be used to keep data confidential in different states: at rest (storage), in transit (transmission over the network) and in memory (during processing). The security of data in memory has historically posed the biggest challenge and it's why trusted execution environments (TEEs) such as Intel SGX or ARM TrustZone have been developed in hardware.
These secure enclaves have their memory space separated from that of the primary operating system and can be used to perform operations securely on sensitive data without exposing it directly to processes running under the main OS.
However, there are limitations to these solutions. First, they pose the same key management challenges as encryption of data in transit or data at rest: The secret key needs to be present on the server or in the TEE where decryption takes place, and in the case of cloud computing, that means trusting the cloud operator.
Second, they don't really address the issue of trust and confidentiality in cases where the data owner is just the user of a service or application that runs in the cloud and is not the owner of the service or the application itself.
The most appealing property of FHE is that you can send encrypted data to a third-party service to be processed without providing them with the encryption key, so you're essentially getting the result of those computations without having to trust the service provider with the confidentiality of your data.
Take, for example, the case of a cloud-based medical predictive analytics service that uses machine learning. FHE could enable a healthcare facility to use such a service to help doctors deliver better diagnostics without having to expose protected patient data.
Similarly, the technology could be used to perform analysis on large data sets that combine both public and private data. For example, a drug development company could use FHE to safely analyse data from patients who tried experimental treatments but did not agree to be part of public clinical trials as part of a larger data set that includes data from such trials. This could be done using a third-party service running in the cloud.
According to Cammarota, the use cases can be expanded further with multi-key homomorphic encryption, a form of FHE in which multiple parties combine their separately encrypted data in a pool. Processing is then done on the entire pool, and the result is visible only to the parties who provided the data.
Current state of FHE
While there are already software implementations of FHE and various libraries and toolkits from big industry names like IBM and Microsoft, their performance on current CPUs is poor because FHE has an inherent issue: data corruption.
Unlike typical encryption schemes that ensure data integrity, FHE does not, because the whole goal is to perform computations that alter the ciphertext, impacting its integrity. In crypto terms, FHE uses algorithms that are malleable by design.
"With each homomorphic computation, a certain amount of noise – or error – is generated that corrupts the encrypted data representation," DARPA explained. "Once this noise accumulation reaches a certain point, it becomes impossible to recover the original underlying plaintext.
"Essentially, the data in need of protection is now lost. Computational structures called 'bootstrapping' help address this untenable noise accumulation, reducing it to a level that is comparable to the original plaintext, but produces massive compute overhead to perform."
This issue severely limits the practical applications of FHE. According to Cammarota, machine learning inference on encrypted data is currently possible with some limitations, but training machine learning models on FHE-encrypted data is out of reach. To solve that, new techniques that deliver massive performance improvements for FHE workloads need to be developed, and that's why DARPA set up the Data Protection in Virtual Environments (DPRIVE) program.
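The noise accumulation DARPA describes can be seen in a small experiment. The sketch below is a deliberately simplified, insecure scheme loosely modelled on the learning-with-errors approach: each fresh ciphertext of a single bit carries a small random error, and adding ciphertexts (which XORs the bits) also adds their errors. The modulus, noise bound and function names are assumptions made for illustration and do not correspond to any production FHE library. Bootstrapping, the expensive step that would reset the noise, is not implemented.

```python
import secrets

Q = 1024          # ciphertext modulus (toy size, assumption)
NOISE_BOUND = 8   # fresh noise is drawn uniformly from [-8, 8]


def keygen() -> int:
    """Pick a secret key; a single integer here, a long vector in practice."""
    return secrets.randbelow(Q)


def encrypt(bit: int, key: int) -> tuple:
    """Hide the bit at Q/2, mask it with a*key, and add small noise."""
    a = secrets.randbelow(Q)
    e = secrets.randbelow(2 * NOISE_BOUND + 1) - NOISE_BOUND
    b = (a * key + e + bit * (Q // 2)) % Q
    return a, b


def decrypt(ct: tuple, key: int) -> int:
    """Remove the mask, then round: values near Q/2 decode to 1, near 0 to 0."""
    a, b = ct
    d = (b - a * key) % Q
    return 1 if Q // 4 <= d < 3 * Q // 4 else 0


def add(ct1: tuple, ct2: tuple) -> tuple:
    """Homomorphic XOR of the two bits; the noise terms add up as well."""
    return (ct1[0] + ct2[0]) % Q, (ct1[1] + ct2[1]) % Q


def noise(ct: tuple, key: int, bit: int) -> int:
    """Signed noise in a ciphertext, given the bit it should contain."""
    a, b = ct
    d = (b - a * key - bit * (Q // 2)) % Q
    return d - Q if d >= Q // 2 else d


def additions_until_failure(key: int, limit: int = 1_000_000) -> int:
    """Keep adding encryptions of 0 until the sum no longer decrypts to 0."""
    acc = encrypt(0, key)
    for count in range(1, limit + 1):
        acc = add(acc, encrypt(0, key))
        if decrypt(acc, key) != 0:
            return count
    return limit


if __name__ == "__main__":
    key = keygen()
    print("additions before decryption failed:", additions_until_failure(key))
```

Decryption stays correct only while the accumulated noise remains below a quarter of the modulus, so a handful of additions is safe but a long enough chain eventually returns the wrong bit. Multiplications in real schemes grow the noise much faster than additions do, which is why deep computations such as model training depend on frequent bootstrapping.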
Intel joins DPRIVE
Intel's goal as part of the multi-year DPRIVE program will be to develop a hardware accelerator for FHE in the form of an application-specific integrated circuit (ASIC). But before such a system-on-a-chip (SoC) can be fabricated, the program's first phase will focus on identifying the key building blocks needed to accelerate the arithmetic of fully homomorphic encryption, Cammarota says.
This work will be done in collaboration with Microsoft, which plans to test the technology and incorporate it into its cloud offerings to drive commercial adoption. Both Microsoft and Intel are members of the Homomorphic Encryption Standardization consortium of industry, government and academia.
ASICs are integrated circuits that usually function as peripherals and are designed and optimised for a particular operation or type of operation. They can achieve much better performance for specific tasks than general-purpose chips. For example, bitcoin mining rigs, which essentially solve a math problem for a reward of bitcoins, use ASICs instead of regular CPUs.
That said, some operations that previously required hardware accelerators have been integrated into general-purpose CPUs over time. Floating-point units (FPUs), also known as math coprocessors, used to be implemented separately as add-ons but are now a standard part of most CPUs and GPUs.
Standalone crypto accelerators are still common, but some of their functionality has also been integrated into CPUs. The Intel Advanced Encryption Standard New Instructions (AES-NI) is a good example of a hardware-accelerated cryptographic operation that is now standard in most CPUs.
It's possible that, over time, the FHE hardware acceleration technology developed as part of DARPA DPRIVE will take a similar path, but that needs to be evaluated by the different Intel business units, Cammarota says. Multiple teams at the company are collaborating on the project, Cammarota notes, including Intel Labs, which is the research division, the data platform group and the design engineering group.
Making fully homomorphic encryption inexpensive is also a way to democratise the technology. That's exactly what AES-NI did for AES encryption, so it's completely possible, Cammarota says.
However, to achieve widespread adoption, it's also important that the standardisation efforts for this type of encryption continue and mature in parallel with the development of FHE hardware acceleration.
"When you look at homomorphic encryption as a security technology it's kind of a weird one," Cammarota says. "On one side, you look at the workload, and the fact that you can compute on encrypted data is quite exciting because you can build new business models and you can actually protect your assets now.
"On the other hand, it's a cryptographic technology and so when it comes to actual deployment, people will not only look at the excitement but will ask: Is it standardised? Can we deploy it securely? Do we know how to deploy it securely? So, there will be a tight collaboration between the outcome of the [DPRIVE] program and the standardisation evolution."