
Why a chatbot might seem more empathetic than a human physician

An April 28 article in JAMA Internal Medicine, “Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum,” generated a great deal of discussion — much of it horrified.

In particular, people are focusing on the study’s conclusions: that “chatbot responses were longer than physician responses, and the study’s health care professional evaluators preferred chatbot-generated responses over physician responses 4 to 1. Additionally, chatbot responses were rated significantly higher for both quality and empathy, even when compared with the longest physician-authored responses.”

While this study is thought-provoking and has generated much interesting discussion, that conversation isn’t focusing on the real root of the problem in physician-patient communication: time.

When I read the article, I found myself thinking about a recent conversation with my extended family of non-physicians. I explained why doctors have hours of work after we’re finished seeing patients. “What is it that you have left to do when all the patients have gone home?” they asked. It has to do with electronic medical records, I told them — more specifically, something called the EMR in-basket.

In our electronic in-baskets, we physicians receive a daily deluge of messages, most unrelated to the patients we are scheduled to see that day. These communications are connected to all aspects of patient care and can include test results, consultation reports, and direct messages from colleagues, nurses, support staff, and, yes, patients. Unfortunately, most of us have no dedicated time in the day to respond to them. The amount of data physicians have to sort and process each day has expanded exponentially, but the clinic practice model has not changed. For most of us, 100% of our scheduled clinic time is reserved for in-person or online patient visits.

We have to handle the data processing on our own time.

So, we steal little bits of time throughout the day — five minutes between patients here, three minutes between patients there, and during our “lunch hour.” In my 17 years in clinical practice, I haven’t eaten a lunch away from my desk (unless it was for a required meeting), clicking and typing between harried bites.

With this “stolen” time, I manage to cobble together about two hours of in-basket work during the clinic day (in which I’m scheduled to see patients face-to-face for eight hours). Most days, I still have about two more hours of electronic work in the evenings. This can also include finishing the charts of patients I saw in the clinic that day, charts I couldn’t complete because, ironically, I was answering in-basket messages about other patients.

As I described this to my family, one older relative’s face fell.

“You mean,” she said with a horrified expression, “when I message my doctor, she’s given no time to respond? I always thought, I don’t know, that you all were given time for that.”

I quickly reassured her that even so, she should not delay messaging her doctor with a clinical concern or question. That is what the system is there for, and it’s not the patient’s responsibility to worry about their doctor’s schedule. Many clinics, thankfully, also have medical staff monitor the in-basket and address nonclinical questions, such as messages requesting a change in appointment time.

All of this provides important context to the recent uproar over the “empathetic chatbot” research.

Before we rush to accept that generative AI is more empathetic than human physicians, let’s take a moment to dive into the details and methods of the study. It is important to note that this was not a comparative study of “chatbot versus human” in real-world conditions, i.e., the EMR.

Because the researchers (thankfully) realized that allowing an AI chatbot access to actual patient medical records would be a HIPAA violation, they turned to another source: patient questions, and physician responses to them, posted publicly to the online social media forum Reddit’s r/AskDocs.

The authors explain: “The online forum, r/AskDocs, is a subreddit with approximately 474 000 members where users can post medical questions and verified health care professional volunteers submit answers.”

This means that any physician or other health professional answering a post on this site was doing so on their own time and out of their own interest, and was not the physician of the patient posting the question. This is all very different from real-world conditions.

I can’t help but wonder about the demographics of the physicians with the time and energy to answer questions on this subreddit. For example, it would be interesting to know how many of them are in current clinical practice. (Because, one might assume, practicing physicians with a real-world EMR in-basket workload wouldn’t have the capacity to volunteer their time on a subreddit; the very idea of doing it makes me want to weep.) One might also theorize that if the physicians on this subreddit are not in current clinical practice, perhaps clinical work is something they don’t enjoy or that isn’t their strength. Some physicians pursue non-clinical pathways, and we don’t know how many of those active in the subreddit fall into this group.

In addition, Reddit is an online community known for fast-paced, witty banter. The health care professional volunteers on the subreddit r/AskDocs likely responded to the medical questions posted there in the style and manner of Reddit culture, not necessarily that of clinical culture.

But putting that aside, the real heart of the matter isn’t whether generative AI truly has the human capacity for empathy (it doesn’t), but how the shortage of physician time and corresponding high rates of burnout affect physician empathy.

In a 2016 article about how doctors are becoming less empathetic, the author explains how “time pressure can cause either a suspension of ethical decision-making or an internal conflict that overwhelms a person’s natural tendency to help those in need. This happens every day in the clinic and in the hospital, where medical professionals, many of whom were driven to medicine by a desire to help people, are tasked with helping two masters — the distressed patient in their office and a system that demands seeing more patients, faster. It creates an internal conflict that might explain what appears to be callous behaviour, where doctors appear to be missing numerous cues from patients seeking empathy.”

I know my responses to patients in the EMR are more thorough and empathetic when I’m not pressed for time and exhausted. Every once in a while, I will have a patient cancel or no-show without advance warning, and since we cannot fill their appointment time at the last minute, I’m given the unexpected gift of time. In those 30 or 60 minutes, I’ll turn to the always-present load of work in the in-basket. I can spend a bit more time on each reply knowing other patients aren’t waiting for me in exam rooms. My responses are thus a bit longer and more empathetic. I also can then take time to add the social niceties that ChatGPT does — and has presumably learned from its training sets of real-life patient-physician interactions.

(I’m not going to get into training datasets here. That’s an entirely different topic that deserves its own examination by experts in generative AI.)

I think it’s universally true that patients want their physicians to have dedicated, protected time to respond to them, whether in person or electronically. They also, like my relative, have little to no knowledge of the pressures and forces that encroach on that time.

Similarly, physicians want to have the time to answer every patient’s message in a kind, empathetic manner. It’s not necessarily a lack of empathy that prevents us from doing so; it’s the external pressures that prevent us from being able to express it, especially electronically.

Some people may be thinking that perhaps I should use ChatGPT to generate my responses going forward when I’m short on time. But I think that’s the exact wrong lesson to take from this research — and the wrong arena for AI in health care.

The real potential of AI in health care lies in offloading tasks that don’t require a human, allowing physicians to return our time and focus to where we can all agree it belongs: patients.


Originally published 5/4/23 by STAT News. (Reposted here with permission of the editor.)


Listen to me discuss the article with STAT News editor Torie Bosch on the First Opinion Podcast!
