The Alarming Blind Spots in Health Care AI

Artificial intelligence promises to make medicine smarter. But what happens when these software systems don't work as advertised?


Artificial intelligence is everywhere. And increasingly, it's becoming a critical part of health care. Doctors use it to try to suss out symptoms of deadly infections like sepsis; companies like Google are developing apps to help you identify ailments just by uploading some pics. 

But AI is only as good as the data sets fed into these systems. And when the data sets are flawed, or the results are not properly interpreted, the software can misidentify symptoms, fail to flag them entirely, or drown clinicians in false alarms. In some cases, it can even exacerbate already stark racial disparities in the health care system.

This week on Gadget Lab, WIRED senior writer Tom Simonite joins us to talk about the blind spots in medical AI and what happens when tech companies put these algorithms into their users' hands.

Show Notes

Read Tom’s story about the flaws in the AI that predicts sepsis here. Read his story about Google’s new dermatology app. Read more about the racial bias in AI systems (and how those algorithms might be fixed). Also check out Lauren’s story about how the internet doesn’t let you forget.

Recommendations

Tom recommends the novel No One is Talking About This by Patricia Lockwood. Lauren recommends the book Girlhood by Melissa Febos. Mike recommends the album Acustico by Céu.

Tom Simonite can be found on Twitter @tsimonite. Lauren Goode is @LaurenGoode. Michael Calore is @snackfight. Bling the main hotline at @GadgetLab. The show is produced by Boone Ashworth (@booneashworth). Our theme music is by Solar Keys.

If you have feedback about the show, or just want to enter to win a $50 gift card, take our brief listener survey here.

How to Listen

You can always listen to this week's podcast through the audio player on this page, but if you want to subscribe for free to get every episode, here's how:

If you're on an iPhone or iPad, open the app called Podcasts, or just tap this link. You can also download an app like Overcast or Pocket Casts, and search for Gadget Lab. If you use Android, you can find us in the Google Podcasts app just by tapping here. We’re on Spotify too. And in case you really need it, here's the RSS feed.

Transcript

Michael Calore: Lauren.

Lauren Goode: Mike.

MC: Lauren, when was the last time an AI correctly predicted that you were coming down with something?

LG: I'm pretty sure that I'm relying on AI when I use features like the heart rate monitoring or cycle tracking on my Apple Watch, and I guess it does a pretty good job of that. But if you're asking me, has AI ever flagged anything that's been wrong with me? No. I mean, aside from the stuff we all know is wrong with me.

MC: Ah, well, I'm glad that there's nothing currently super wrong with you because we're going to talk about AI's role in healthcare on today's show.

LG: Sounds good.

[Gadget Lab intro theme music plays]

MC: Hi everyone, welcome to Gadget Lab. I am Michael Calore, a senior editor at WIRED.

LG: And I'm Lauren Goode, I'm a senior writer at WIRED for as long as I can do my job before an AI takes it over, I guess.

MC: And we're also joined by WIRED senior writer, Tom Simonite. Tom, welcome back to the show.

Tom Simonite: Hi Mike, thank you for having me back.

MC: Now, Tom, you write about AI for WIRED, which is why we asked you on, but we actually asked you on because of your smooth melodic British accent.

TS: Mike that's kind, I'm also a big fan of your accent.

LG: What about mine?

TS: Your accent is also great, Lauren. Other accents are also available from all good suppliers.

LG: Tom, you're killing me. I'm already jealous that you're in the San Francisco office right now, in our podcasting studio. I can see you over Zoom, and we hope to all be back there soon. But I'm already having the FOMO, and now you're saying you like Mike's accent; I'm not quite convinced you feel the same about mine. But that's okay, perhaps we should move on.

MC: Yeah. We're all still recording remote but there you are in our studio, how is it? How does it smell? Does it smell good?

TS: It smells super clean. When I cracked open the door, I was thinking, this is like a time capsule to roughly 18 months ago. I was bracing myself; who knows what would be in here.

LG: You're bracing yourself for the smell of snack fight.

TS: And I opened the door and it was super fresh, because we have a powerful air purifier in here keeping the virus particles down, and I think everything else too.

LG: What brand of air purifier is that?

TS: It is a Coway Airmega.

LG: Oh, a Coway, we're actually fans of that at WIRED I think, right?

MC: Yeah we are.

TS: It's good. When you turn it on, it makes a cheerful happy tune that makes you feel safe.

MC: This is not sponcon about air purifiers; this is in fact a show where we're going to be talking about AI's use in health care. Now, Tom, you're one of the writers on staff at WIRED who covers all this stuff, and you've written a story about AI's use in hospitals this week; we're going to talk about that story, plus another one, later in the show. So to set up the first one: machine intelligence can no doubt be a useful tool for both doctors and patients, but we know that it's not perfect. And the use of algorithmic tools in health care settings can create new complications.

Like sometimes doctors don't know what to do with the information that the computer spits out, or sometimes a poorly designed AI program can end up worsening the racial disparities that already exist in our health care system. But let's start with your most recent story, which is about sepsis. Some people might not know this, but sepsis, which results from infections, is the number one killer of patients in US hospitals. So when a company developed software that uses an algorithm to alert doctors to the first signs of sepsis in their patients, it seemed like a good thing. But there are some flaws in that algorithm. Now, Tom, we're hoping you can tell us what went wrong here.

TS: Sure. And to start with, why don't we just back up a minute, because I think we're at a really interesting moment in US health care. Back in the day, medicine involved bloodletting and leeches and all this organic stuff. And then science improved and medicine got pretty good, but all the data was still mostly written on paper; you couldn't get it into computers, and computers weren't really good enough to help out anyway. But fast forward to today: electronic health records are pretty common now, and we have mobile phones, small computers that can fit into medical devices, and big computers that can run sophisticated algorithms.

And so it's become a lot more practical to put software in the clinic or get it to help our doctors. And that's great because as good as medical science is, there are clearly lots of opportunities to help people do it more accurately. But now we're in this phase where we can deploy this stuff but we don't know a lot about how to make it work in the way that you would hope it would work. And so things are being deployed because it's possible to deploy them but not everything is getting properly checked over or tested before it gets put into deployment.

LG: Now, central to this story, Tom, is a company called Epic, which is one of the biggest providers of electronic health record technology in the United States, right? Describe that landscape of electronic health records a little bit, and talk about Epic's role in this story.

TS: Yeah. The US has a pretty fragmented health system, with all the private insurers and different plans and things like that, and so it's been a little bit slower than some other countries to adopt electronic health records. But things are now going pretty well, and Epic is the leading provider of electronic health records, so probably a good number of listeners have their data lodged in an Epic system at a hospital or a health insurer or some other kind of provider. That market is kind of competitive, and there have been some issues with interoperability; it's not really in the business interest of a company that provides medical record systems to make it easy for you to get the data out and put it somewhere else. And there's also competition between those companies to try to make their record systems more attractive by adding bells and whistles, or algorithms.

And so the system I wrote about this week was an algorithm that Epic offered to its customers, pitched as a way to pop up alerts on patients who might develop sepsis, this very dangerous complication of infection. It's pretty hard to spot, because some of the key signs, like low blood pressure, are not uncommon in the hospital. Epic told its customers: look, we've made this algorithm, and what it will do is pop up alerts on patients who are at risk of developing sepsis, and that way you can treat them earlier. And there's good evidence that treatment even minutes or a single hour earlier can save your life, so that seemed like a great thing. But the company didn't release a lot of information about the system and how it performed, and there hadn't really been any external validation of it. So some researchers at the University of Michigan got curious and said, well, why don't we test how good this thing really is? And they found that maybe it wasn't as great as people had been assuming.

MC: What did they find?

TS: So they tested the software on data from about 40,000 patients, and they found that it didn't identify two thirds of the sepsis cases in that data. It did catch some that doctors had missed, about 183 cases out of nearly 3,000. If you were one of those 183 patients, you probably would be pretty glad that this algorithm was out there looking out for you, but it also threw up a lot of false alarms. When a patient was flagged by the system, there was only a 12 percent chance that they would go on to develop sepsis.

So many times when it was calling for the attention of staff, it was diverting their attention or their time when they didn't really need it. The lead author of the study, Karandeep Singh, summed it up this way: for all those alerts, you get very little value. Many things in medicine are a trade-off, right? Ideally you would have a team of doctors and nurses for every patient, but you can't, so you have to decide where you're going to allocate your resources. And the study concluded that this algorithm was maybe just not worth the extra burden it was placing on staff.
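To put those trade-offs in concrete terms, here is a minimal back-of-the-envelope sketch, in Python, of how the figures Tom cites translate into the standard metrics for an alert system. The counts below are illustrative reconstructions from the rounded numbers in the conversation (roughly 3,000 sepsis cases, two thirds missed, 12 percent of alerts correct); they are not the study's actual confusion matrix.

```python
# Back-of-the-envelope sketch of the Michigan study's headline figures.
# All counts are illustrative, derived from the rounded numbers above.

def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of real sepsis cases the model actually flagged."""
    return true_pos / (true_pos + false_neg)

def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of alerts that turned out to be real sepsis."""
    return true_pos / (true_pos + false_pos)

cases = 3000             # roughly 3,000 sepsis cases in the data set
caught = cases // 3      # the model caught about one third of them
missed = cases - caught  # ...and missed the other two thirds

# If only 12 percent of alerts were correct, the model fired roughly
# caught / 0.12 alerts in total; the rest were false alarms.
total_alerts = round(caught / 0.12)
false_alarms = total_alerts - caught

print(f"sensitivity: {sensitivity(caught, missed):.0%}")              # ~33%
print(f"PPV: {positive_predictive_value(caught, false_alarms):.0%}")  # ~12%
print(f"false alarms per true catch: {false_alarms / caught:.1f}")    # ~7.3
```

On those assumptions, the tool fires roughly seven false alarms for every real case it catches, which is the "very little value" Singh describes.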

LG: Tom, there's a section of your story that really underscores how these flawed systems can end up discriminating against certain groups of patients, and in particular how this affects patients of color, right? You mentioned that back in 2019, there was a system used on millions of patients to prioritize access to special care for people with complex needs, and it actually underestimated the needs of Black patients compared to White patients, and that's just one example. And so I'm wondering if you can explain for our listeners exactly how this ends up happening with artificially intelligent software, and how ultimately it could perpetuate racial bias.

TS: Yeah. And this comes back to what I was saying before, about how we've reached this phase where we're deploying stuff without fully understanding some of the complexities of what happens when you throw software into the health system. So in 2019, there was an algorithm used on millions of patients in the US to identify those with a particularly complex burden of health needs, maybe related to diabetes or a chronic condition like that. Health systems use it to find those people and enter them into special assistance programs that might give them a bit of extra help. And a study led by researchers out of Berkeley found that this system was effectively undercounting the health needs of Black patients in particular, compared to White patients. So the effect in practice would be that if you had a population of patients and you were trying to select some for extra help, Black patients who medically had the same needs as White patients would kind of be at the back of the line, and they might not get that special assistance.

And the reason for that turned out to be that the system was looking at billing and insurance costs as a measure of how sick a person is. But billing doesn't actually measure how sick the person is; it just measures how many times they go to the doctor and how many treatments they're given. And because of historical disparities in the US health system, Black patients with a particular health need typically get less treatment and incur fewer costs than White patients with the same conditions. And so this was an example of how the data you put into one of these automated or AI systems can make a huge difference to what you get out of it. And if you don't think carefully about what you're putting in, you may be acting on garbage recommendations without realizing it.
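For readers who want to see that mechanism in motion, here is a hypothetical sketch of the proxy problem Tom describes: ranking patients for an assistance program by predicted cost when the quantity that matters is need. The group labels, the 30 percent cost gap, and the distributions are all invented for illustration; only the mechanism, cost standing in for sickness, comes from the reporting.

```python
# Hypothetical sketch of the proxy-label problem: selecting patients for an
# assistance program by COST when the quantity of interest is NEED.
# Groups "A" and "B", the 30 percent cost gap, and the uniform distribution
# of need are all invented for illustration.
import random

random.seed(0)

def make_patient(group: str) -> dict:
    need = random.uniform(0, 10)  # true burden of illness
    # Group B incurs ~30 percent lower costs for the same level of need,
    # mirroring the historical disparity described above.
    cost = need * (0.7 if group == "B" else 1.0)
    return {"group": group, "need": need, "cost": cost}

patients = [make_patient("A") for _ in range(5000)] + \
           [make_patient("B") for _ in range(5000)]

k = 2000  # slots in the assistance program

top_by_cost = sorted(patients, key=lambda p: p["cost"], reverse=True)[:k]
top_by_need = sorted(patients, key=lambda p: p["need"], reverse=True)[:k]

share_b_cost = sum(p["group"] == "B" for p in top_by_cost) / k
share_b_need = sum(p["group"] == "B" for p in top_by_need) / k

print(f"group B share, ranked by cost proxy: {share_b_cost:.0%}")  # ~15%
print(f"group B share, ranked by true need:  {share_b_need:.0%}")  # ~50%
```

Ranked by true need, the two groups are selected at roughly equal rates; ranked by the cost proxy, group B's share of the program drops sharply, even though its members are, by construction, just as sick.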

LG: And to be clear, the sepsis study we're talking about is not specifically about racial disparities; it's about flagging problems where there may not have been problems for certain patients. But it's all part of this larger concern that some folks, ethicists, researchers, data scientists, have now about the kinds of data sets we're using to inform and ultimately build these AI systems.

TS: That's right. It comes back to this question of, okay, how do we measure what's effective? And we should remember that there's so much potential here to use automation and algorithms and AI to improve health care, and even to use these systems to reduce inequities in health care. Early this year, I wrote about a study on software that analyzes x-rays for arthritis, and the study found that this system was actually less biased than the human doctors at reading those x-rays. There's a pattern in arthritis care where radiologists looking at Black patients' x-rays and White patients' x-rays are more likely to see problems on the White patients' x-rays, because of traditions in how radiologists have been trained, based on data from White populations. And in this case, the algorithm was kind of filling in a blind spot; it was able to see patterns of disease in the x-rays of Black patients that the conventionally trained human experts were missing. So that's an example of how this technology can be really valuable, and it's a reminder of why Epic and others are working on this stuff, because there's so much positive potential.

MC: All right, let's take a quick break and when we come back we're going to talk more about how artificial intelligence is being used for your own personal healthcare.

[Break]

MC: Welcome back, everyone. We just talked about how artificial intelligence can cause complications in hospitals, but AI is also a growing part of how we use our own personal internet-connected devices. At its I/O developer event last month, Google announced an AI-powered dermatology tool: you can just take a picture of that weird mole on your skin, upload it, and Google's algorithm will tell you if you have anything to worry about. That's the idea, anyway; in reality, it's not so simple. Now Tom, you worked on another story this week about Google's AI-powered dermatology service. What's the skinny on that?

TS: I see what you did there Mike.

LG: I was just going to say that.

TS: This was one of the cavalcade of new announcements made at Google I/O and I thought it was one of the most interesting because it pinpoints this fantastic potential future for AI medicine, right? If this technology gets really good, maybe we could just give the technology directly to consumers and you have it on your phone or your watch or whatever and it will tell you if you're sick or you have a problem without you having to go and see a specialist. And that's kind of almost what Google showed off, it showed a demo of this app where if there's something on your skin, you're not sure what it is and it doesn't fly away when you wave your hand at it, you can pull out your phone, take three photos of it and upload them to Google and Google will come back with a list of what it calls suggested conditions, things that it might be.

And this follows a handful of studies that Google has published on algorithms that can detect different skin problems, problematic moles, things like that. And in those studies, Google has shown that its technology could rival board-certified dermatologists at recognizing these things. And so the potential of having that in an app is very intriguing, but it's very preliminary, and we're still waiting for some details on how it will be rolled out.

MC: I also assume it works differently on different colors of skin, yes?

TS: That is a very good question. Google has been criticized for not having a good representation of different skin tones in the previous studies it has published and that has caused some people to worry a bit about this new app that's being pushed out to consumers maybe as soon as later this year. However, the company says that the data sets in its published studies don't represent its latest and greatest stuff and they say they've been working on making it work for all types of skin tones but they haven't released a lot of specifications or data on exactly how widely it's been tested, so that's something to wait and see about.

LG: So at the top of the show I mentioned my Apple Watch, which I do use for some health tracking. And it seems like every time Apple rolls out a new health tracking feature, it comes with lots of disclaimers like this is not a diagnostic tool and if you think something is seriously wrong you should call the doctor, right? Basically they don't want to be totally liable for this health app. And I'm wondering, first of all, if Google has put out any kind of caveats or disclaimers around this skin identifying application. And I'm also wondering what this is going to do to people who are already probably a little bit predisposed to diagnosing themselves with a deadly disease within six clicks. If I use this app, is every mole just going to be cancerous in my mind?

TS: I am also wondering those things, Lauren. So in Google's demo there's a disclaimer on the results after someone has their photos analyzed, something like, suggested conditions are not a medical diagnosis, and I think a couple of other disclaimers as well; Google says this is not a substitute for going to the doctor. But this is a new app, and it also comes with some presentation that might encourage people to think it is some kind of super expert. When Google presented it at I/O, it was mentioned that this app was motivated by a lack of skin specialists worldwide for people to go to with their problems, and the company also pointed to its past results saying that this technology could be more accurate than a dermatologist. And so there are a lot of open questions about how consumers are going to think about this when it's in their hands. Google has got a reputation for being really great at AI; maybe some people will think they should trust this thing to help them make a decision about their own health, and that's something the dermatologists I've spoken to are a little bit concerned about.

MC: What sorts of ethical concerns arise when you're taking these medically sensitive photos and handing them over to Google for processing? I mean, these aren't like cat pics.

TS: No, they're very personal, and they could be of very personal parts of your anatomy. I don't believe Google has released much information yet about how it would handle those photos. It has said that the app is so far approved for use in the European Union but not in the US, so I guess that would mean the trial starts in the EU, where they have GDPR and other privacy protections. But I would expect that Google is probably going to say something like, we encrypt everything in transit, we delete it after we process it. Still, it is another step along the now-familiar slippery slope of new tech: it does great new things, but you have to share data with a big company to make use of it.

LG: That's a good segue to another question, Tom, and you've been doing some reporting on this: the move toward on-device processing. It's a phrase we hear a lot, particularly in the world of AI. Companies like Google and Apple and others have said they're starting to take some of these machine intelligence functions that normally require sending a bunch of data to the cloud, processing it, and sending it back to the end user, and instead doing some of this processing, this intelligent computing, on the device itself, which in theory is supposed to keep information more private, right? So is this something that ultimately could run, quote unquote, on device, and how private is that really?

TS: That's a great question. Could it run on device? Maybe. I think Google and Apple have both worked pretty hard to beef up their mobile hardware and their mobile software so they can process photos using machine learning algorithms on the device, so yeah, I guess we could see that happen. Google and Apple in particular have been talking a lot about processing data on device, not in the cloud, and there's one way of looking at it where it's just a great thing. If you're going to have your data processed by an algorithm, it kind of maybe feels better if that happens in your pocket and not on someone else's computer where you don't quite know what's going on. But some privacy scholars say that that's kind of a narrow way of thinking about privacy.

If Apple is processing your data on your iPhone, which it controls and is super secure, then that's certainly keeping your data confidential; it's between you and Apple and the iPhone. But that isn't how some people think about privacy. For some scholars, privacy is a broader set of freedoms from being watched. So if a company is watching every single thing you do online, or maybe around your house through a smart home device, just because they don't tell other people about it doesn't mean that you don't feel watched or surveilled. And so one way of thinking about the on-device trend is that it could be a way to sugarcoat the general trend, which is that every aspect of your life becomes enmeshed in a mobile ecosystem run by a big company. And that's just something to bear in mind next time you hear a company talk about the benefits of on-device.

MC: All right, let's take a quick break and when we come back we'll do our recommendations.

[Break]

MC: All right, Mr. Tom Simonite, you are the guest this week, so you get to go first. What's your recommendation?

TS: Thank you for the honor, Mike. My recommendation is a novel called No One Is Talking About This by Patricia Lockwood, who is a poet and also one of the funniest people on Twitter, and I guess kind of annoyingly talented because she's also written a great novel. The main character is an internet influencer, I guess, but in the book she doesn't use Twitter or anything you'd recognize; everyone is on something called the portal. She got online famous for posting a question that went viral: can a dog be twins? And the book follows her as she wrestles with the ups and downs and paradoxes and anxieties of life online, in a very relatable way, a way that captures some of the essence of the internet better than most literature I've read. The book also follows what happens when a very offline family crisis comes along, and she has to deal with that while also maintaining her internet existence. It's a really good read.

MC: It's like a plot ripped from today's non-headlines, from today's calendar events.

LG: Yeah. It sounds like a plot ripped from a Taylor Lorenz story in The New York Times. It sounds really wonderful though.

TS: It's very relatable, and actually it touches on a couple of points, including the problem you identified in your excellent recent feature, Lauren: the one not-so-charmingly known at tech companies as the miscarriage problem, where apps surface things like, hey, remember this post, remember this photo, and it may not be something that you actually wanted to remember. That comes up a couple of times, and it adds to the book's realness.

MC: Nice. The name of the book one more time.

TS: No One Is Talking About This.

MC: By Patricia Lockwood. Lauren, what's your recommendation?

LG: My recommendation is also a book by a woman writer. It's called Girlhood, and it's by Melissa Febos, who is a wonderful writer; this is not her first book. It's a collection of essays, basically about the social conditioning that starts to affect women when they're girls: how it changes the way girls perceive themselves, how they're conditioned to believe they're supposed to act in society, and how it affects our whole experience growing up. And Melissa herself just seems like such an interesting person; she has a super interesting background. At one point she was a heroin addict, and she is now recovered. She was a dominatrix in New York City. She did all of this before she got her MFA and became a writer. She wrote another book at one point called Abandon Me, which is about loss and abandonment, also incredibly emotional.

MC: It's a collection of short pieces, right?

LG: It is, it's a collection of essays. But I'm already just really immersed in it and think it's really, really powerful and she's such a powerful writer that I highly recommend checking it out.

MC: Nice.

LG: Mike, what's your recommendation this week?

MC: Well, it feels weird, it's not a book.

LG: Well, you could tack one on last minute.

MC: No, I've got one all picked out and everything. I do extensive research and decision-making before we make recommendations on the show. I know it doesn't seem that way, but that is actually what happens.

LG: Okay. So what's your recommendation?

MC: Okay. I want to recommend an album. I don't do this very often, but it's a piece of music: the new album by the Brazilian artist Céu, that's C-E-U. She's Brazilian, she sings in Portuguese, sometimes in English, but mostly in Portuguese. The new album is called Acustico, and it's the Portuguese spelling of that, so it's A-C-U-S-T-I-C-O. It's an album of all-acoustic versions of her songs, so it's kind of like a greatest hits album. She's got six or seven albums at this point in her career, and she goes back and sings some of her most beloved songs, but she does it with just her voice and an acoustic guitar. It's a pandemic album; normally she records with a full band, with these really intricate arrangements that mix electronic sounds and live band sounds, and that's what you would normally expect from an artist like Céu.

But this album is just her voice and a guitar, and it's just fantastic; it's like a redistillation of everything that she's done at this point in her career. I'm recommending it because she's an amazing singer and an amazing songwriter, with a really great gift for pop melodies and touches of the avant-garde throughout her music. But she's not really well known elsewhere in the world; she has maybe 100,000 streams, or 10,000 streams, on Spotify for most of her tunes. So she's not a superstar, but she's absolutely fantastic, and not enough English-speaking people know about her, or about Brazilian music in general. She's a great path into what's going on in the modern-day Brazilian pop world. So that's my recommendation: Acustico by the Brazilian artist Céu.

TS: That sounds good, can you hum some of it for us?

MC: I could but I'm not going to.

LG: Adding it to the summer playlist.

MC: All right. Well, thank you all for your recommendations, those were great. And of course, Tom, thanks for joining us and telling us all about AI in healthcare.

TS: Thanks for having me. It was fun.

MC: And thank you all for listening. If you have feedback, you can find all of us on Twitter, just check the show notes. This show is produced by Boone Ashworth. We will be off next week because of the 4th of July holiday. It is inconveniently right in the middle of a long weekend so we're going to take that time to clean up our studio, that Tom is sitting in now, disinfect it once again and then move in and hopefully at some point in July, we'll be recording with real microphones in a real room, breathing each other's air, can you imagine it? It's been so long. Anyway, we will be back with a new episode on July 9th, until then, goodbye.

[Gadget Lab outro theme music plays]

