I hurt my left knee over Labor Day weekend, probably doing a lazy breaststroke in the pool. After limping around for a few days, I was able to walk normally again, but only for a few minutes at a time before the pain set in.
The bot and I only chatted for about five minutes, but our conversation felt like a slog. I told the bot my knee made a clicking sound, but there was no way to specify how far I could walk without pain. I didn’t know how to answer the seemingly straightforward question about where I felt pain, and I couldn’t decide whether my knee felt weak or like it might give out.
Early in the symptom-checking process, I was asked to guess my diagnosis from a peculiar list of possibilities: “cellulitis, arthritis, allergic contact dermatitis of the knee, fear of dogs or cancer.” After I answered 24 questions, the bot spit out “patellofemoral pain syndrome” (aka runner’s knee) as a probable diagnosis. It told me to take ibuprofen and stretch, and then see a doctor in a few weeks if I still had pain.
I was going to see my doctor no matter what the bot told me. Some experts worry, however, that a vague or inaccurate chatbot diagnosis could endanger a patient’s health or even their life. Critics say the algorithm of one of the best-known health chatbots, developed by the British virtual care company Babylon, is dangerously flawed. Dr. David Watkins, a consulting cardiologist with the NHS in London, pointed out that in a test, the bot accurately diagnosed signs of a heart attack in a theoretical male patient but misdiagnosed similar symptoms in a theoretical female patient, a pack-a-day smoker in her 60s. The bot said “she” was probably having a panic attack.
Healthcare bots ask questions to narrow down and rule out the causes of symptoms, but they can’t have the kind of organic conversations that often help doctors uncover seemingly unrelated clues about patients’ health, says Dr. Satasuk Joy Bhosai, chief of Digital Health and Strategy at the Duke Clinical Research Institute, part of the Duke University School of Medicine. Bhosai recalls a patient who experienced frequent headaches, one of the most common complaints she hears from patients. The patient noticed his hands were moving slower than usual when he played the piano. It was only when he mentioned this in passing that Bhosai ordered a brain MRI, which revealed a tumor. The tumor prompted Bhosai to ask about the patient’s past tobacco use, which eventually led to a lung cancer diagnosis.
“If a chatbot asked the patient if he was a smoker, he would’ve said no, and the diagnosis would’ve been missed,” Bhosai says.
Legal disclaimers for most health chatbots say they shouldn’t be used to diagnose or treat medical conditions. Yet their ability to proffer accurate diagnoses is one of their most highly touted selling points.
“I’m not aware of any eHealth chatbot that can be relied upon for diagnosis,” Watkins says. “[And] it’s important to be aware that in the UK, eHealth chatbots are only loosely regulated, with no requirement for validation or regulatory approval. Consequently, there’s a huge variation in accuracy and safety of the chatbots available, ranging from the good to the downright dangerous.”
The “first stop in patient care”
Despite concerns that chatbots are too, well, robotic in their interactions with patients, tech giants and startups are pursuing contracts to implement them in large health systems and hospitals. These chatbots, also known as “conversational agents” or “patient-engagement software,” will revolutionize healthcare, proponents say.
By providing more efficient patient triage and diagnoses, the bots allow healthcare providers to spend more time with the neediest patients. Increased efficiency also helps address doctor burnout and physician shortages by helping patients figure out what they can DIY at home, and by pre-diagnosing patients and directing them to the right specialists. Chatbots might also democratize healthcare by increasing access to nonjudgmental medical advice. And unlike humans manning phone lines, chatbots can answer patient questions any time of day.
“Chatbots vary,” says Enid Montague, an expert on human-centered automation in medicine, as well as a professor of computing at DePaul University and an adjunct associate professor at Northwestern University Feinberg School of Medicine. “They can be a simple decision tree or can get more complex by being fed lots of data and using machine learning to make a diagnosis. It could look like a survey or like you’re having a conversation with someone.”
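The simplest kind of bot Montague describes, a decision tree, can be sketched in a few lines of code. Everything below is hypothetical and for illustration only: the questions, branches, and advice are invented, and a real symptom checker would need clinically validated logic behind every node.

```python
# A minimal, hypothetical decision-tree symptom checker.
# Each node is either a yes/no question with two branches,
# or a leaf holding suggested next steps (not a diagnosis).
TREE = {
    "question": "Is the pain in your knee?",
    "yes": {
        "question": "Did the pain start after an injury or activity?",
        "yes": {"advice": "Rest and ice; see a doctor if pain persists."},
        "no": {"advice": "Consider scheduling a routine appointment."},
    },
    "no": {"advice": "This demo only covers knee pain."},
}

def run_checker(node, answers):
    """Walk the tree using a sequence of 'yes'/'no' answers."""
    for answer in answers:
        if "advice" in node:
            break  # reached a leaf before answers ran out
        node = node[answer]
    return node["advice"]

print(run_checker(TREE, ["yes", "yes"]))
# -> Rest and ice; see a doctor if pain persists.
```

The contrast with Montague's second category is that an ML-based bot would replace this hand-written tree with a model trained on patient data, which is exactly why the quality of that training data matters so much.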
In the US, chatbots are already interacting with patients. A few hospitals, such as Boston Children’s Hospital and the Medical College of Wisconsin Health Network, use interactive symptom-checker chatbots. Others, such as Northwell Health in New York, use chat technology to monitor patients released from the hospital.
For now, though, most chatbots just schedule appointments or provide information about COVID-19. California healthcare network Sutter Health uses a bot called Ada to screen patients for COVID risk factors; the Centers for Disease Control and Prevention has a similar one called Clara. In fact, the pandemic has been an opportunity to showcase the value of chatbots, which have been deployed to field the onslaught of patient COVID questions. Developers of the technology hope this will help pave the way for broader adoption of chatbots throughout our healthcare system.
Experts say the technology is in its infancy and improving rapidly, and will soon become the first stop in patient care. The issue is how to integrate chatbots into care while ensuring they don’t dispense inaccurate or otherwise dangerous information to patients.
How bots can improve care
Chatbots can be helpful for collecting patient info before a visit and transferring that information to healthcare providers, says Tearsanee Carlisle Davis, a board-certified family nurse practitioner and director of clinical and advanced practice operations at the University of Mississippi Medical Center for Telehealth.
“I think they’re great for pre- and post-surgery care, as the providers and nurses are looking for specific things and the care is pretty standard across the board,” says Davis, who is also a professor at the UMMC School of Nursing. Chatbots can remind patients to look out for concerning symptoms, or tell them when it’s time to change their bandages or take medications.
A few studies support chatbots’ usefulness in patient after-care. A 2018 study concluded that text chatbots let nurses make fewer calls to chemotherapy patients. A smartphone chatbot also appeared helpful in following up with patients who had surgery for kidney stones, according to a 2019 study.
A study published in July found that people generally liked the COVID chatbots they used. In fact, if they trusted the chatbot, they actually preferred communicating with it over a human, says study author Alan R. Dennis, professor of internet systems at the Indiana University School of Business in Indianapolis.
Weill Cornell Medicine, in New York, uses its chatbot Hyro as an administrative aid and symptom-checker. Hyro can assist patients with simple tasks, such as finding a cardiologist in the health system who speaks Spanish, says Hyro content marketing specialist Ziv Gidron.
Chatbots can also help doctors spend less time educating patients. Researchers at the Boston University Medical Center created Gabby, a video “host” who walks patients through important preconception care. Gabby’s purpose is to improve the health of Black mothers and their babies, who experience disproportionately poor health outcomes compared to white mothers and their babies.
“Doctors don’t have time to do two hours of education,” says Gabby designer Dr. Brian Jack, director of the Center for Health System Design and Implementation at Boston University’s Institute for Health System Innovation and Policy and a professor at the BU School of Medicine. Gabby is more engaging than informational pamphlets, which, Jack says, patients typically don’t read.
Chatbots might also play a role in mental healthcare. For example, University of Illinois Chicago researchers are studying whether Lumen, a voice-activated AI program, can help patients manage moderate depression or anxiety. And a 2017 study suggests that Woebot, a free cognitive behavioral therapy app, helped relieve symptoms of depression in subjects, who were mostly young white women.
Researchers at the University of Pennsylvania acknowledged Woebot’s success in a recent opinion published in JAMA, but study authors also warn that conversational agents are “not yet mature enough to reliably respond to patient statements,” even when those statements signal self-harm by mentioning suicidal thoughts. Study authors question whether it’s appropriate for chatbots to offer diagnoses and treatment options when most companies’ research, if they’ve done any, is based on data, not real patients.
Other researchers have raised similar concerns. In a 2019 BMC Medicine paper, study authors said it’s critical that “the minimal requirements used to make a digital health technology available to the public are not mistaken for a product that has passed rigorous testing or demonstrated real world therapeutic value.”
Because tech companies’ algorithms are considered proprietary, study authors noted, they aren’t required to reveal how they work or whether they’re trained using representative data (rather than actual patients).
These are common issues in tech, where innovation is often prized over caution. But a “move fast and break things” approach can be particularly risky in healthcare. Because the Food and Drug Administration does not consider health chatbots “medical devices,” the bots don’t require FDA approval.
The FDA told Dr. Vipindas Chengat, CEO of patient-engagement software company MayaMD AI, that the company’s chatbot would only be considered a medical device if it gave “specific, granular recommendations, such as decreasing medical dosage.” That saves companies from conducting extensive clinical research.
Even if companies are willing to do their due diligence, no guidelines have been established for how testing should be done.
“It’s challenging for startups to develop the evidence piece because they’re under pressure to scale quickly,” Bhosai says. “Studies can be off their radar. And sometimes they don’t want results and are not interested in having an academic center look at data they already have.”
Bhosai provides clinical guidance for companies that want to responsibly test their products and evaluate feedback. Hospitals and healthcare providers generally want to see data supporting efficacy before they adopt e-health tools, she notes.
Another issue as healthcare chatbots become more ubiquitous, Bhosai says, is that software developers aren’t required to make programs inclusive.
“There has to be some work toward managing the digital divide in health disparities,” she says. “I worry about the patient who can’t read or who can’t speak English, or older folks. Even if they use the internet, they might not be able to see well and read questions online. People of different [backgrounds] also might communicate differently; the way questions are asked matters.”
Systems are only as valid as the data that goes into them, Montague adds. “Biases can be built into the design, such as the assumption that everyone lives in a wealthy [neighborhood] and has a life expectancy of 90 years old,” she says. “We need good data that represents everyone and interdisciplinary teams of people well-versed in health and racial disparities actually happening in communities. We’re not really at that stage.”
Researchers and doctors are calling for greater transparency and responsibility from health chatbot makers before they’re unleashed on patients. Two new papers, published in The Lancet and in Bio Integration, discuss how to integrate AI into medical practice responsibly and ethically. In addition, the authors of the BMC Medicine paper mentioned above developed a Digital Health Decision-Making Framework and Checklist that covers patient privacy issues, responsible data management, ethical risk-benefit analyses and considerations to make tech accessible to diverse populations. The checklist was originally intended for researchers, but study authors say it would help developers and clinicians as well.
Rather than focus on whether health chatbots might replace human doctors, the more important question is how healthcare providers can use the technology to improve care, Montague says. MayaMD AI has a patient symptom-checker but also helps doctors evaluate test and imaging results, and suggests possible diagnoses and treatment options. Doctors can only keep so much information in their heads. An intelligent chatbot could catch a rare diagnosis a human might miss, she says.
Chatbots aren’t very sophisticated yet, Jack says, but the tech is less than 10 years old and it’s getting better all the time.
“In the future,” he says, “systems will learn more actively about patients and remember things about them. That’s how you connect with patients.”