Google, Microsoft circle as India mulls extracting value from health data of 1.3 billion citizens

Certain businesses manage to invent the future and win present-day battles at the same time. In a commercial released on Fox Television in November, for a fleeting second, Microsoft allows you to glimpse it.

“I am in India, and I’m going to see how Microsoft AI is helping people from going blind,” says the commercial’s lead actor. For 60 seconds, it’s a wide-eyed trip down a crowded street. A man on a bike unpacks a fundus camera, and people begin to queue up for eye screening. The device is made by Bengaluru’s 9-year-old startup Forus Health, but it’s the AI (artificial intelligence) from Microsoft that is doing the magic. At least that’s what the commercial implies.

In reality, Microsoft is just getting started as far as screening of diabetic retinopathy (DR)—a diabetes complication that affects the eye—goes. Out of some 1,700 retinal cameras installed in India by Forus, the largest by any one device maker, Microsoft AI runs on just a few. Less than a handful, actually. The two companies wouldn’t talk but sources at Forus say the association is about screening and validation of the algorithm. The device checks out a person, sends the image to the cloud, the algorithm screens it for DR, and sends its interpretation back to the device or the doctor.

The commercial’s claim, therefore, of ‘helping prevent blindness in India’ is over the top. But that’s Microsoft’s AI in healthcare: top-down, publicity-driven. That is in sharp contrast to Google, which is building products and databases bottom-up. In 2016, Google published its first DR screening work in partnership with Aravind Eye Hospital in Madurai, and by early 2019, it officially announced its algorithm had entered clinical use.

Top-down, bottom-up, or lateral entry (think Amazon), big tech has set its eyes on healthcare. In the race for supremacy and greenfield business, Cloud plus AI is the new finish line, and healthcare, the new track. 

In a press release in August, Microsoft said it had ‘screened’ over 200,000 people using “the AI-powered API across Apollo Hospitals” for cardiovascular diseases, even predicting the risk score for some. (There’s no peer-reviewed publication yet.) Medically speaking, this is a mere warm-up before the marathon. Nearly all hospitals in India have dark data—unstructured data in PDFs which are hard to mine. 

“This is just a fancy way of selling their Azure cloud. Hospitals don’t have digitisation to achieve AI results. Azure or any other cloud service is just the highway, not the car. Hospitals will have to build their own car,” says the promoter of a hospital which, incidentally, uses Azure. “Even Starbucks has better technology [than hospitals],” he quips. 

Most big companies don’t have a commercial strategy yet. Certainly not for India, where healthcare is fragmented and the market for such services non-existent. For now, everyone is marketing the idea of what is possible. 

“There are two separate issues here. One is the privacy of patients for which appropriate standards and safety must be in place; the second is valuation,” says Srinivas Sadda, director of Doheny Eye Centre, University of California in Los Angeles. The owners of health data are obviously the patients and the government, or whoever pays for it. The two co-owners somehow need to retain the value of their data. But it’s early days and no one has figured out how to value data, he says. Sadda is a proponent of open-sourcing health data, leading the charge in the US for institutional data sharing. 

As New Delhi works out the nuts and bolts of India’s new Personal Data Protection (PDP) Bill, which was presented to the Parliament last month, it’s crucial to get the balance right: to avoid making draconian laws that leave Indian patients untouched by AI, and yet extract value from the health data of the population that big tech is coming for.

Each of these companies is either driven by investment or stocks. There’s peer pressure to be accepted as a big data player. A big AI party is going on and everyone is saying let’s join the party. You do things for PR. Which is fine. There’s a complex play of actors in this which we have understood.

A doctor-administrator at a large hospital in south India

The real cost of data

The PDP draft designates health data as “critical” and “sensitive”, requiring a set of permissions from the owner before it can be used. But there are three actors in health data co-creation (the patient, the hospital or doctor, and the payer), and the bill ignores consent and rights at different stages of the data life cycle.

Fixing data ownership in India is tricky because not only are there many payers, but a good amount of healthcare is also paid for out of pocket by patients. Now, with big cloud service providers also doing the heavy lifting of data clean-up at hospitals, AI becomes the fourth actor in the data play. What is fair use then?

Google and Aravind did not share their terms of agreement, but it’s worth asking what Aravind gets out of this. Once the AI product for DR gets regulatory approval, will Aravind get to use the product for free? Or will it get royalties in the future? Will Google continue to provide tech support? Understandably, as a not-for-profit hospital, Aravind is not profiteering, yet what do patients get out of this? 

In December, Google published another study where its deep learning models looked at around 600,000 chest X-rays from Apollo hospitals. At least three AI researchers said it was “an average” paper, quite unlike Google’s path-breaking work in ophthalmology. 

“Since Google’s AI team is possibly the best on the planet, it is surprising that the authors chose not to have a truly independent data set to test the model on. It is quite well established nowadays that any AI algorithm should be tested on data which is not from the same source as the data on which the model was trained,” says Vidur Mahajan, promoter and associate director at New Delhi’s Mahajan Imaging. (His group at sister company CARING has come up with a framework for clinical validation called the Algorithmic Audit.) It’s also surprising that the authors chose to develop and test their algorithm for only four findings, says Mahajan. Still, he says, Apollo Hospital’s contribution of about 800,000 images is game-changing.

Even if money for data is not changing hands in these projects—Google, Aravind and Microsoft did not respond to specific questions on their respective agreements—people in the industry say hospitals must be earning some revenue for their time. “If you don’t have your own technology, it’s tempting to take money and organise data,” says the hospital promoter quoted earlier. 

Assigning fair value to data, though, is hard.

“In the case of the UK, it is simple because of the single-payer system. The government is the payer and hence has a big claim on data,” says Sadda. 

The National Health Service (NHS) of the UK is perhaps the most valuable data repository in the world. Primary care records go back decades and have authentic historical and current data on 55 million people. Add to it secondary and specialist care and its value, according to the audit firm Ernst & Young, comes to £9.6 billion ($12.67 billion) annually. In December, Amazon signed a contract with the NHS, where Alexa, Amazon’s virtual assistant, would use NHS data to offer expert health advice. 

For AI, data must be longitudinal—spanning a person’s health history. “Most Indian hospitals don’t have a loyal patient following. Which is why Google found Aravind so valuable; people keep going to this hospital for eye treatment,” says the founder of a health-tech startup in Bengaluru which has struggled to get access to large hospital data. 

From the big tech-hospital associations, however, good quality data may emerge over time. As might some products. 

Big tech: small price for big bets 

If Aravind shares data and Google makes its tech available to regularize screening, others will join. India can create a registry to track patients whom we lose in follow-ups. This’d allow us to start screening them in hospitals, similar to what DOTS [control strategy] did for tuberculosis.

A medtech startup founder in Bengaluru

In 2016, Microsoft and LV Prasad Eye Institute (LVPEI), as part of an international consortium, came together to study the natural history of myopia at a population level. With myopia becoming a global epidemic thanks to longer exposure to device screens, the aim was to build predictive models for disease progression. The study is currently at the clinical validation stage at LVPEI and Bascom Palmer Eye Institute in Miami, US.

“Preliminary data suggests 68-70% of the time the margin of error is 0.25 Dioptre. It has to be tweaked to make it better,” says Anthony Vipin Das, associate director and consultant ophthalmologist at LVPEI. “We’ve signed ‘one’ agreement with Microsoft to do ‘one’ model. We want to do it right, with clinical validation. Repeatability of the outcome is important,” he says. 

With LVPEI, Microsoft may have lucked out. The Hyderabad centre has the world’s largest integrated ophthalmology electronic medical record (EMR) system, EyeSmart, installed at 225 locations in India and overseas with 5.4 million patient records. As many as 2,500 healthcare users and 7,500 patients go through it daily. Even before Microsoft came, LVPEI was on a slick tech adoption path, doing real-time analytics and capturing clinical, personal and financial data of patients.

What happens once the myopia model is fully tested? 

“The understanding is the algorithm will be deployed on the cloud, which anyone can subscribe to as a service. SaaS (Software as a service) model is what we set out with because it is scalable, upgradable and modifiable at the implementation stage,” says Das.

Traditionally a B2B company, Microsoft may restrict itself to the enterprise business. It’s probably looking at healthcare AI as a way to boost its cloud business. In 2018, it signed up SRL Diagnostics, India’s largest diagnostic chain (by number of labs). While the press release plays up the AI hoopla, industry insiders say it’s a cloud deal at its core.

“Why just Microsoft, there’s a battle in the country between [Microsoft, Google, Amazon] for cloud business. If you get G Suite accounts, even for five users, you get unlimited storage. We tested it for 10 terabytes. That truly is unlimited,” says a hospital chain owner in Delhi. (Like other entrepreneurs quoted in this article, he did not want to be named because he has business relations with these tech companies.)

It’s quite likely that big tech won’t directly sell medical software to hospitals. They’d develop algorithms and open-source them so that any AI product developer or researcher would choose her vendor based on who has better pretrained models, say, for X disease. In short, big tech will optimise their cloud for running machine learning models. As the hospital promoter said earlier, “Hospitals will have to build their own car.”

Still, Google is in the information business, a B2C company to boot. Consumer plus health data make for a deadly combination in targeted selling. Academics in the US say Google tried getting data from institutions, beyond what telemedicine platform EyePACS had released in the public domain, but failed. It then turned to India. And look how fast it has grown: from Aravind, it has now extended its collaboration to eye hospitals Sankara Nethralaya in Chennai and Narayana Nethralaya in Bengaluru. From screening DR with Indian hospitals, it has moved up to predicting the progression of the disease by working with Swiss pharma major Roche. Now that it is seeking regulatory approvals, including certification in Europe (CE marking), its commercial ambitions in healthcare are no longer ambiguous.

Could it also get into the hospital network in India? In the US, it already has.

At a health tech conference in Las Vegas in October, David Feinberg, Google’s head of health, gave a snapshot of Google’s plan. A search bar on top of an EMR that’d work like its regular search engine. It’d throw up medical insights instantaneously, also letting the search giant “know what one is thinking about”. Just like your regular search. 

Tech’s ambition is well-timed. Hospitals everywhere are contemplating on-premise versus cloud storage investment, especially when the latter is coupled with cybersecurity. They know they can’t buy or optimise servers and other equipment as well as big tech can. Cloud providers, meanwhile, are hitting a wall in their return on investment. They need large data sets for scaling their deep learning models, which are then better able to solve a broader task. Big tech and healthcare may look like a match made in heaven, except that ethics and morality come in the way.

As long as permissions and rights are clear, it really comes down to institutions and what they want out of their data, says Doheny’s Sadda. “I do believe institutions are getting more savvy. I also believe as time goes by, patients would demand more transparency and understanding from the providers. We are not there yet.”

Have data, will make it sweat

Is there a way governments can trade data with big tech? 

Policymakers sometimes drive hard bargains with pharma companies for lowering drug prices. It may not be the same thing for AI, but India has a chance to winkle out some benefit for its population. 

How about open-sourcing de-identified data as a public good? 

“That is a good goal: the world coming together to say let’s all help each other,” says Sadda. But it really comes down to this: was permission given for that to happen? India’s new regulation can come in here to say that as long as rules of anonymisation are met and people’s privacy is protected, data can be made available for public use. The government can figure out a way to extract value, Sadda suggests.

Now, to the elephant in the room: Would Aravind and Google share their ophthalmology data? 

It’d allow public health screening for DR in India, which has the highest burden of the disease. That way, the final benefit will come back to patients, without anyone bothering about individual paybacks. 

Apart from good publicity, companies can open up their tagged data sets to act as test sets for benchmarking algorithms. India has no high-quality benchmark datasets available to anyone—startups or academics—for unbiased, independent evaluation of AI algorithms in any sphere of healthcare. “We have students, engineers and scientists in India building algorithms using X-ray data from the United States simply because no Indian data is available,” says Mahajan. 

There’s even a minor commercial gain. “For someone like Google, hosting an open radiology dataset on their cloud will prompt more first-time users to use Google cloud graphics processing units (GPUs), a high-ticket offering. Maybe there’s enough value in it for Google if they don’t want to build a healthcare AI product,” says a health tech startup founder in Mumbai.

If none of the above matters, what’s happening in the self-driving sector should be an eye-opener. In August, competing tech companies came forward to share data because they realised their autonomous driving models were just not good enough on small data sets. 

“That’s very much a prophecy for what’s going to happen in healthcare,” says Matthew Lungren. As co-director of Stanford Centre for AI in Medicine and Imaging, Lungren is another champion of open-sourcing health data. “Imagine if each hospital [in India] works with one company because they sold their data to them. You’ll have 50-100 different algorithms built on individual data sets which will be inferior to an algorithm built on the combined data sets. Ultimately it’s hurting the patient because of the economics of not sharing data.”

Sharing ain’t simple

The unbridled march of AI would increase the cost of data in the long run. 

“I know a few doctors from India who go to Africa for cataract surgeries [as part of camps] and collect data on their devices. You see a shift to the new currency: data. With the advent of big tech, there might be a shift from care to another product which is a source of revenue,” says the founder of a medical device company which also runs AI algorithms. He often fields requests for data from tech companies. It’s not happening at a large scale but could happen in the future.

India’s healthcare system is already stressed, and if it gets into a data play, the stress will only worsen.

At LVPEI, Das appears agitated that the AI narrative is hijacked by big tech. “It should be driven by the providers, doctors, surgeons, nurses… But it is driven by company A, company B, company C. I get bombarded by CEOs for data because of EyeSmart,” he says. As opposed to banking or financial services which are awash in data and crunching it well, healthcare takes time. “You do a treatment, wait to quantify it, and feed it back to the EMR. It has to be run by the champions who understand the value of giving that extra insight to the patient.” 

In the developing world, AI should be about enabling access, not data lockdown. Lungren believes one way to affect the data economy is to open-source enough data so that the “value is so low that no one wants to do it anymore”.

As you read this, the Ministry of Electronics and Information Technology is finalising the nitty-gritty of the PDP Bill. If India doesn’t want a repeat of its fintech flop, where it bent over backwards to accommodate big companies, it needs to get two things off the ground: 1) Mandate that all medical records be truly electronic; 2) Create a transparent framework for sharing data, fixing accountability, and providing safeguards.  

Say, for instance, an imaging lab shares 100,000 brain images. Someone somewhere discovers five images with brain tumours which went undetected and informs the lab. What should the lab do? Re-identify those patients and give them the new diagnoses? Or should it let them be? This is an epic ethical dilemma. There’s no provision for re-identification in the PDP Bill.

According to Mahajan, who at CARING wants to open source some of its health data, there’s another dilemma. A legal one. Imagine a business becoming wildly successful using open source data. Could someone later seek a share in its financial gains because her data was part of the open source data set? What is fair use in this case? Mahajan says the West has solved this problem by allowing de-identified data with adequate permissions to be used and shared. 

India can learn and do better. 

“The thing about policymakers is that they like to do it one time—the concept of constant tweaking or revising is not there,” says Sadda. But in this case, nobody has a perfect model; nobody has figured it out. So the most important thing for them is to know that they are not going to get it right from the beginning. “This is one policy, whether we like it or not, which will require a lot of tweaking and many iterations in a relatively short period of time.” 

With the draft PDP Bill, the clock is ticking for India.
