Medical Data Pooling: Between Privacy and the Common Good

“Suppose that we are wise enough to learn and know – and yet not wise enough to control our learning and knowledge, so that we use it to destroy ourselves? Even if that is so, knowledge remains better than ignorance.”

― Isaac Asimov

Given our familiarity with today’s medical procedures, we perhaps recognize that the brisk pinch of pain from a needle is a small price to pay for a peek into our own bodies: cholesterol levels, blood cell counts, glucose levels and other health-related information. But in the expanding world of big data, scientists have developed new profiling tests that yield far more data and can account for thousands of variables. In recent studies, such as the integrative personal omics profile (iPOP) conducted at Stanford University, scientists have developed algorithms that analyze blood samples and draw on medical databanks to produce comprehensive analyses. Every sample generates about 30 terabytes of data about an individual’s health. (That’s enough CD-quality audio to play ceaselessly for seven years, by the way.) And this is only the tip of the iceberg in the field of medical data pooling.

Medical data pooling is a growing area of interest within the field of bioinformatics, marrying computer science skills with biomedical analytics. It entails the collation of vast amounts of research results and existing medical records into a databank. This allows for greater advancements in biomedical studies, as well as more effective diagnoses and treatments.

One key development in recent years is the growing impetus to make this data freely available. In 2012, the National Institutes of Health (NIH) announced it would provide researchers with free access to all 200 terabytes of the 1000 Genomes Project (a catalog of human genetic variation) via Amazon Web Services. About 30% of healthcare providers utilize cloud computing systems or health data banks, like the Health Record Banking Alliance, that allow doctors to use a patient’s medical history to improve care coordination among various health institutions. The ramifications could not be more significant. The Info/Law Harvard Privacy blog surmised that opening up private healthcare databases could have saved us from 90,000 unnecessary heart attacks and 25,000 deaths.

But performing a medical or genomic experiment on a human requires informed consent and careful handling of the contentions surrounding privacy issues. Where medical data and history are concerned, divorcing private and public spheres is an almost impossible task.

While the medical breakthroughs from data pooling are remarkable, there is a costly trade-off in terms of privacy. From a pro-privacy standpoint, valuing one’s privacy and obtaining informed consent for the collation of personal medical data is more than a bureaucratic procedure; it is the acknowledgement of one’s autonomy. Regardless of the benefits of an improved diagnosis, it is arguably the patient’s fundamental right to decide whether to disclose this information in the first place, and to choose to accept or deny medical treatment. This underlying ethical conundrum becomes a salient concern: should you value the patient’s health, or the patient?

In fact, the dialectic about privacy is a recurring one in the field of public health and policy. If medical personnel know that a patient is engaging in risky behavior like unprotected sex or drug usage, rather than forcing this patient to change his or her fundamental behavior, they could abide by the harm reduction principle. This means trying to minimize the risks that the patient faces, for example by providing clean and safe needles. The underlying notion is that respecting someone else’s life choices is more important than imposing one’s ideas about health and safety on them. The same belief can perhaps be extended to the tension between medical data pooling and privacy: when a patient is not willing to consent to disclosing information, even if it could save his or her life, a doctor should abide by those wishes and prescribe treatments accordingly.

Another striking risk of medical data pooling is that of exposing the identities of research volunteers who were promised anonymity. In January, the journal Science published a paper detailing how researchers were able to identify 50 male participants who had anonymously donated their DNA for scientific research to the 1000 Genomes Project by cross-referencing their genetic information with data from their relatives.

The implication is that it becomes more difficult to progress in the area of open data advocacy. “We have been pretending that by removing enough information from databases that we can make people anonymous. We have been promising privacy, and … these promises are empty,” said John Wilbanks, founder of Sage Bionetworks, a non-profit that advocates for open data.

The contentions with medical data pooling weave into another lattice of privacy concerns, especially when other stakeholders are involved. Michael Snyder is one of many who gave informed consent to volunteer blood samples and medical information to the aforementioned 1000 Genomes Project (headed by the National Institutes of Health, or NIH) and subsequently to public databases like GenBank. He wasn’t particularly worried about privacy, but his diabetes diagnosis (which surfaced through the 1000 Genomes Project analyses) meant that his family’s life insurance company quoted his wife an additional $7,000 because his disease was medically codified.

Although the 2008 passage of the Genetic Information Nondiscrimination Act prohibits health insurance companies and employers from discriminating on the basis of genetic information, there is no protection when it comes to life insurance.

It is increasingly clear why personal medical information is so sensitive and why people can be so hesitant about the issue.

If we take a step back and look at a parallel case in the UK, we can see the same intricate issues unfolding around a new National Health Service (NHS) policy. The NHS, as of February 2014, has implemented a new policy to amass the medical records of every GP patient in the country into a single database. Since then, it has come under fire from some pro-privacy groups, with criticisms including the questionable guarantee of anonymity, poor public awareness of the policy and the lack of an option to opt out of the procedure. The last of these points once again to the importance of consent.

Informed consent is conceivably the heart of the whole issue of medical data pooling. John Wilbanks, in a TED Talk, strongly advocated the notion “that we reach into our bodies and we grab the genotype, and we reach into the medical system and we grab our records, and we use it to build something together.” In his consideration of the intricate privacy and ethical issues, he established an experimental bioethics protocol in an attempt to create an open, massive, mine-able database of health and genomics information from many sources. In February 2013, the US government responded to a We the People petition spearheaded by Wilbanks and signed by 65,000 people, and announced a plan to open up federally funded research data.

Medical data pooling has a lot of potential, but it is undeniably a controversial area that must be approached with care, especially with regard to free accessibility. The issue of privacy and the wide-reaching implications of medical data pooling suggest that individual consent is integral to capitalizing on this incredible, paradigm-shifting resource.

