Ledger Insights spoke with Daniel Hwang, who leads an IEEE workstream on Health Data Representation on Blockchain and Quality Scoring. An essential aspect of representing health data is how to preserve its privacy.
The project started in April, so it’s still very early days and the group is just at the discussion stage. The aim is to submit something formal to IEEE by the end of the year or in Q1 2019.
In exploring how to represent health data, Hwang commented about the privacy issues: “Having the actual data on the chain is unreasonable. And so it’s about the methods of how data is going to be accessed from those siloed areas.”
The solutions involve using cryptography to keep the data private.
Zero-knowledge proofs
The first one is to use zero-knowledge proofs. Hwang said: “With zero knowledge you can prove something without actually showing how you proved it.” For example, in a health context, a question could be ‘Is the patient’s blood pressure normal?’ The response would answer yes or no to that question without revealing anything else about the patient, especially not who they are.
“That would definitely be a way that we imagine how data is eventually shared,” said Hwang.
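To make the idea of proving something without revealing it more concrete, here is a minimal sketch in Python. It is not a zk-SNARK and does not answer a health question such as the blood pressure example. It swaps in a much simpler, classic technique, a Schnorr proof made non-interactive with a hash, in which a prover shows they know a secret number without disclosing it. The group parameters and all function names are illustrative assumptions, and the numbers are far too small to be secure.

```python
import hashlib
import secrets

# Toy parameters, far too small for real use: P = 2*Q + 1 with P and Q prime,
# and G generates the subgroup of order Q modulo P. (Assumed for illustration.)
P, Q, G = 2039, 1019, 4

def challenge(public, commitment):
    """Derive the challenge by hashing the public values (Fiat-Shamir)."""
    data = f"{G}:{public}:{commitment}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(secret):
    """Return (public, commitment, response) without disclosing `secret`."""
    public = pow(G, secret, P)
    nonce = secrets.randbelow(Q - 1) + 1  # fresh random value for every proof
    commitment = pow(G, nonce, P)
    response = (nonce + challenge(public, commitment) * secret) % Q
    return public, commitment, response

def verify(public, commitment, response):
    """Check that G^response == commitment * public^challenge (mod P)."""
    c = challenge(public, commitment)
    return pow(G, response, P) == (commitment * pow(public, c, P)) % P

secret = secrets.randbelow(Q - 1) + 1  # known only to the prover
public, commitment, response = prove(secret)
print(verify(public, commitment, response))            # a valid proof is accepted
print(verify(public, commitment, (response + 1) % Q))  # a tampered proof is rejected
```

The verifier only ever sees the public value, the commitment and the response. Proving a statement about the data itself, such as whether a reading falls within a normal range, needs heavier machinery of the kind described next.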
In blockchain systems, zero-knowledge proofs are commonly implemented as zk-SNARKs, which are used by the privacy-preserving cryptocurrency Zcash. They are also incorporated in JP Morgan’s Quorum, and are now available in Ethereum and Hyperledger Fabric.
There are a couple of disadvantages. Firstly, zk-SNARKs are computationally expensive, which makes them slow. Secondly, they need a trusted setup. In other words, if you don’t trust the initial setup of the cryptography, the guarantees of the whole system are undermined.
There’s a newer construction called zk-STARKs, which improves upon both of these weaknesses but comes with costs of its own, such as larger proofs.
Public key cryptography intro
In a typical public key setup, you generate a pair of keys for yourself. One of them is private and needs to be kept secret, and the other one is public, and you can hand it out.
As a metaphor, consider a lockbox which automatically snaps closed. There are two elements: the lockbox and the key to open it. You can give someone the open lockbox and ask them to place a confidential document inside. You’re the only one who has the key to open it. The lockbox is equivalent to a public key, and the physical key is equivalent to your private key. The pair of keys can be used repeatedly.
In digital terms, putting health data inside a lockbox is equivalent to encrypting it, and only the person with the private key can decrypt it.
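The lockbox metaphor can be sketched in a few lines of Python. This is a toy version of ElGamal encryption chosen purely for illustration. The article does not say which public key scheme is meant, the group parameters and function names are assumptions, and the numbers are far too small to protect real data.

```python
import secrets

# Toy group (insecure sizes): P is prime and G has prime order Q modulo P.
P, Q, G = 2039, 1019, 4

def generate_keys():
    """Return (private_key, public_key): the physical key and the lockbox."""
    private_key = secrets.randbelow(Q - 1) + 1
    public_key = pow(G, private_key, P)
    return private_key, public_key

def encrypt(public_key, message):
    """Lock `message` (an integer in 1..P-1) using only the public key."""
    r = secrets.randbelow(Q - 1) + 1  # fresh randomness for each message
    return pow(G, r, P), (message * pow(public_key, r, P)) % P

def decrypt(private_key, ciphertext):
    """Open the lockbox with the private key."""
    c1, c2 = ciphertext
    shared = pow(c1, private_key, P)
    return (c2 * pow(shared, -1, P)) % P  # modular inverse needs Python 3.8+

private_key, public_key = generate_keys()
ciphertext = encrypt(public_key, 120)  # e.g. a systolic blood pressure reading
print(decrypt(private_key, ciphertext))  # recovers the original reading
```

Anyone holding the public key can call `encrypt`, but only the holder of `private_key` can recover the reading.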
NuCypher and proxy re-encryption
Hwang took part in a hackathon earlier this year which involved GDPR and the right to be forgotten. His team used NuCypher’s algorithms.
With NuCypher the data is always encrypted. With conventional encryption, the data is encrypted for one person using their public key. The issue is how to share the data with somebody else.
The conventional way is to decrypt the data and then re-encrypt it with the other person’s key. If you do that on a server, the data is exposed and unsafe for a short period. NuCypher instead re-encrypts the data for the person you want to share it with, without ever decrypting it in between.
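The difference from decrypt-then-re-encrypt can be sketched with a toy proxy re-encryption scheme. The code below is not NuCypher’s algorithm. It swaps in a simple textbook construction in the style of Blaze, Bleumer and Strauss, with assumed toy parameters and hypothetical function names, to show that a proxy can convert a ciphertext from one recipient to another without ever seeing the plaintext.

```python
import secrets

# Toy group (insecure sizes): P is prime and G has prime order Q modulo P.
P, Q, G = 2039, 1019, 4

def keygen():
    sk = secrets.randbelow(Q - 1) + 1
    return sk, pow(G, sk, P)

def encrypt(pk, message):
    """Encrypt `message` (an integer in 1..P-1) for the owner of `pk`."""
    r = secrets.randbelow(Q - 1) + 1
    return (message * pow(G, r, P)) % P, pow(pk, r, P)

def rekey(sk_from, sk_to):
    """Re-encryption key sk_to / sk_from (mod Q). In this toy scheme it is built
    from both private keys; production schemes are designed to avoid that."""
    return (sk_to * pow(sk_from, -1, Q)) % Q

def reencrypt(rk, ciphertext):
    """Run by the proxy: transforms the ciphertext and never sees the message."""
    c1, c2 = ciphertext
    return c1, pow(c2, rk, P)

def decrypt(sk, ciphertext):
    c1, c2 = ciphertext
    g_r = pow(c2, pow(sk, -1, Q), P)  # strip the key from the exponent
    return (c1 * pow(g_r, -1, P)) % P

alice_sk, alice_pk = keygen()
bob_sk, bob_pk = keygen()
ciphertext = encrypt(alice_pk, 120)           # data encrypted for Alice
shared = reencrypt(rekey(alice_sk, bob_sk), ciphertext)
print(decrypt(bob_sk, shared))                # Bob recovers the reading
```

At no point does `reencrypt` hold a private key or the plaintext, which is the property the server-side decrypt-and-re-encrypt approach lacks.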
When you apply this to health data, you may only be sharing the data with a small number of people, but you are providing full access to them. So you don’t know what they’ll do with the data. “There’s a limit to how protected that data is once someone is given access to it,” said Hwang. “If you download it, it’s only given so much protection there. But perhaps you could do something similar to how some web pages have disabled copy-pasting. But even that can be gamed.”
Homomorphic encryption
So is there a way to let somebody use the data without being able to copy it? Hwang said: “With homomorphic encryption, you’re able to run computations on encrypted data, but the data is never available.”
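As a rough illustration of computing on encrypted data, the sketch below uses a toy version of the Paillier cryptosystem, which supports addition of encrypted numbers. This is an assumption made for the example: the article does not name a scheme, Paillier is only additively homomorphic rather than fully homomorphic, and the primes and function names here are illustrative and far too small to be secure.

```python
import math
import secrets

# Toy primes (insecure sizes); real deployments use far larger primes.
p, q = 1009, 1013
n = p * q
n_sq = n * n
phi = (p - 1) * (q - 1)  # part of the private key
mu = pow(phi, -1, n)     # part of the private key (needs Python 3.8+)

def encrypt(message):
    """Encrypt an integer 0 <= message < n using only the public value n."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, message, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(ciphertext):
    """Recover the plaintext using the private values phi and mu."""
    u = pow(ciphertext, phi, n_sq)
    return ((u - 1) // n * mu) % n

def add_encrypted(c1, c2):
    """Add two plaintexts by multiplying their ciphertexts; no key is needed."""
    return (c1 * c2) % n_sq

# Two encrypted readings are summed by a party that cannot read either one.
total = add_encrypted(encrypt(120), encrypt(135))
print(decrypt(total))  # the key holder sees only the sum
```

The party calling `add_encrypted` works only with ciphertexts, so the individual readings stay hidden from it.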
But there’s a big downside. “I worked in a genomics lab before joining Genentech. Some members were exploring homomorphic ways of running experiments on encrypted genomic information. It [takes] a hundred times longer.”
NuCypher recently released some software that speeds up the homomorphic process dramatically. Other companies such as IBM also provide homomorphic solutions.
“I think there are trade-offs between how much time you’re willing to wait in order to access or run an experiment on a particular set of data,” commented Hwang.
“If the data is valuable enough, especially in the case of health data the value in it is also in protecting it. Perhaps waiting a bit longer to get certain results or even to have access to that kind of data when you may not [otherwise have had access], you just have to wait a little bit more. I think it’s a worthwhile tradeoff.”