AI in Healthcare Virtual Summit Session Recordings
Gen AI in Clinical Practice
Video Transcription
I'm going to speak about generative AI for patient messages: promise, but much to improve. So, last talk of the afternoon, you've made it. We've seen some very interesting research applications, things in the future. This talk is going to focus on something, I think, that is near and dear to the heart of everyone who deals with patient messages. We're going to talk about an application of generative AI in a space that is not ultra-technical, not really a research space, but something that is a pain point among many physicians: dealing with patient messages. Can AI help with some of the toughest parts of our jobs?

I'm Dr. Jeffrey Moon. I'm an assistant chief medical information officer at the University of Pennsylvania. I'm going to talk about a specific use case of generative AI in this area, but also speak about generative AI as a whole. As you've probably heard, this is a very new space, and if you feel like you're very clinical and this is maybe one of your first educations on artificial intelligence, don't think you're behind by any means. We're all learning about this. I look at it like Microsoft Word when it first came out. Today's Microsoft Word is completely different from the first one. At the core it was similar, but each iteration over the years it changed, and today's version is markedly improved, with tremendous functionality compared to what we had at the beginning. AI right now is like that. So if you were a year or two, quote, late in using Microsoft Word, eventually you caught on, and looking back you'd say, I wasn't behind. Same thing with AI right now, especially within clinical practice.

Let me start with a story. I have a friend who went to medical school and then residency at the same institution as me. He's a dermatologist. I ran into him at a health tech conference, and like old friends we started talking. I said, hey, how's clinical practice going? This guy's good; he's won teaching awards. He smiled and said, didn't I tell you? I'm not proud. I don't see patients anymore. And I was shocked. I thought, oh no, what happened? Did he get burned out? He told me, no, I left. I'm in industry now. He said, it's good, it's OK, but I left. And I said, what happened? You were so good. You were one of the top students. You got into Derm. You won teaching awards. He said, it was the patient messages that killed me. It was desperation. It wasn't that some unequivocally wonderful opportunity elsewhere pulled him out of clinical practice into industry. He said, I was totally burned out, and it was especially the patient messages. He went to Disney World with the family: wife, dad, older son, younger daughter. Went for a week, left with an empty in-basket, came back home, and had over 200 messages from patients. He said, my heart sank. And that Monday was a busy Monday where he had a double clinic and double-booked some patients. He said, this is too much. That was the moment he left.

So, patient messages. I'm sure I'm preaching to the choir for those who experience them. They come in all different forms, but the in-basket form is the most common one. That's a pain point. And a lot of people in health care leadership aren't even so much interested in AI for AI's sake. I'm a clinical informaticist; health technology is my jam besides seeing patients. It's about using it as a tool.
And this is what you're going to see, I predict, and others do too: using it as a tool to help with the pain points of clinical practice. Maybe it even makes us more efficient. Maybe it even makes us better clinically. It may not revolutionize exactly what we're doing right away, but it can help with some of the pain points.

Let me explain this. Again, I'm at Penn, but this would apply to many other places. The green line here is the number of messages Penn doctors receive per year from patients. In 2023, we went above 6 million, and we're almost at 7 million at the end of 2024. Not that long ago, it was zero. So patients love it; this workflow is here to stay. But for us in clinical practice, this is not why we went to medical school, right? It's unreimbursed care, generally, but you have to do it. Here are quotes from some of my colleagues. Our chief of endocrine said: nonstop messaging, all day, every day; in between patients I see there are new messages; on vacation, I check my in-basket every day. And often it can take a minute per message.

So let's get back to Gen AI. This is one of the main, and the largest, applications of generative AI in clinical practice in medicine right now. Across the United States, at various institutions, if you look at Gen AI applied to the regular boots-on-the-ground doctors, this is the biggest pilot that's going on. Some places, generally, with lukewarm results; some places very good, but often only OK. And again, I look at this as early software: it's only as bad now as it will ever be. Two, three, five years from now it will likely, hopefully, fingers crossed, be different and much improved software.

So here's a good clinical question, or maybe more of an operational question: can Gen AI help with the millions of patient messages? Here's the proposed workflow. Some of you might be aware of this; if you're not, no big deal, I'll go through it. The patient sends a message. The AI, simplifying a bit, reads the message and generates a draft response. It's not a chatbot: it just generates a draft for a nurse, a physician, an APP, or maybe even a medical assistant, one of those four buckets right now, to look at and then perhaps send back to the patient. That way, we're offloading the burdensome, at times unwelcome, messages that patients send by the millions. So again: patient sends a message, AI generates a draft response, a physician or nurse reviews the draft, agrees with it or edits it, and sends it back to the patient. This, again, is not a chatbot.

I have real examples coming up; I'm going to go through a bunch of them. Here's one that's not an endocrine example. I deliberately chose a more generalist one, because this can apply to all different specialties; it's a scalable technology. Top right, in the black font on the blue background, the patient wrote: hi, Dr. Lee, I wanted to know if you think that seeing a doctor for pain management would help me. If so, who do I contact? Is there a doctor at Penn for that? You could call me; here's my number. And then below is AI's draft response, in black font. It was not sent to the patient. You keep a human in the loop; you'll hear that phrase.
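To make the workflow just described concrete, here is a minimal sketch of the human-in-the-loop pattern. This is an illustration, not Penn's actual implementation: the function names and `Decision` categories are hypothetical, and the percentages in the comments come from the usage figures quoted later in the talk.

```python
from enum import Enum
from typing import Callable, Tuple

class Decision(Enum):
    SEND_AS_IS = 1   # clinician agrees and sends the draft unchanged (~5% at Penn)
    SEND_EDITED = 2  # clinician edits the draft, then sends (~30%)
    DISCARD = 3      # clinician discards it and writes their own reply

def handle_message(
    patient_message: str,
    generate_draft: Callable[[str], str],
    clinician_review: Callable[[str, str], Tuple[Decision, str]],
) -> str:
    """Human-in-the-loop pattern: the AI only ever produces a draft, and a
    clinician decides what, if anything, reaches the patient."""
    draft = generate_draft(patient_message)  # a suggestion, never auto-sent
    decision, final_text = clinician_review(patient_message, draft)
    # Even a DISCARD can still help: the clinician saw the draft's ideas
    # before writing their own reply (the "cognitive offloading" point
    # made later in the talk).
    return final_text  # always clinician-approved, whatever the decision
```

The essential design choice is that nothing the model produces reaches a patient without a clinician's explicit decision.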
You've heard it in the past few days, and we're going to hear it more over the next several years: what are the safeguards for artificial intelligence? How do we make sure there's no patient harm? How do we make sure it works with our workflow? A lot of times it really does come back to this term, keeping the human in the loop. So the draft response is there at the top right for this example: Hello, Matt. Thank you for reaching out. Pain management can indeed be beneficial for many patients. We have specialists here at Penn who can help. However, I'd recommend scheduling an appointment with me first so we can discuss your specific situation, blah, blah, blah. Not bad. Not bad. OK. And if we really want to take an operational approach, this whole new workflow should perhaps take less than the 50-second average per message, or whatever the average is in your shop.

OK, I'm going to get into a little more of the technical workflow. That's not the point of this talk, but I want to explain a little more detail. This leverages ChatGPT to help answer patient messages. The patient message is de-identified; it's not tied to any patient-level information. It gets sent to the model, which generates a body of text based on probabilities. It's not actually thinking. You've heard this: it's not even generating whole sentences, it's just predicting the next word.

This is an audacious application, by the way. It's very ambitious, maybe too much so, honestly, for people who have been doing AI work. You already probably see AI in your regular email. When I write an email to my mom about picking up the kids from soccer practice, and I type "thank you very," a light gray font suggests "much." I didn't type it yet, and I don't have to choose that word, but the system predicts, based on probabilities, what I'm probably about to say, and it's correct most of the time. That's generative AI in our daily lives. Now imagine, and this is where it gets ambitious: you have a patient message to their endocrinologist, and when the endocrinologist tries to write back, we're not just trying to predict the next word; we're trying to predict the two paragraphs the endocrinologist might write. That is way more ambitious than just finishing your sentence. From a more technical perspective, there's a higher chance it might just fail. But the burden is immense; we've got dermatologists quitting because of the in-basket burden. Might as well give it a shot.

One of the things that's tough about a lot of generative AI right now, including this model, is that it's trained on the internet as a whole. It does not give much increased weight to medical text. You would think that if someone asks, can I get my short-acting insulin refill, a reasonable patient message, or an even more clinical question, the model would give disproportionate weight in its response to endocrine texts, textbooks, or the latest journals. But that would leverage a technology called RAG, retrieval-augmented generation, which a lot of this does not yet have. So it's really just the general internet, which means general answers, which means it's only going to do okay, probably, if you're following me.

So at Penn, we had a group experiment with this. Penn's not unique; we're just a little more of a cutting-edge group about this.
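Here is a minimal sketch of the drafting step as described: de-identify the message, send it to a general-purpose model, and get back a draft for a human to review. It assumes the OpenAI Python client as a stand-in for the real EHR integration, and `deidentify` is a placeholder for an actual PHI scrubber.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def deidentify(message: str) -> str:
    """Placeholder: the real pipeline strips names, dates, MRNs, and anything
    tying the text to a patient record before it leaves the EHR."""
    return message

def draft_reply(patient_message: str) -> str:
    """Ask a general-purpose model for a draft. The draft goes into the
    clinician's in-basket for review; it is never sent automatically."""
    response = client.chat.completions.create(
        model="gpt-4o",  # a generic chat model, not one tuned on medical text
        messages=[
            {
                "role": "system",
                "content": (
                    "Draft a brief, polite reply to a patient portal message "
                    "for a clinician to review. Do not diagnose; defer "
                    "clinical decisions to the clinician."
                ),
            },
            {"role": "user", "content": deidentify(patient_message)},
        ],
    )
    # Under the hood the model is just predicting likely next tokens,
    # which is why drafts can miss the point of a message entirely.
    return response.choices[0].message.content
```

Note that the model sees only the scrubbed message text, which is exactly why, as discussed later in the Q&A, it cannot ground its answer in the patient's chart.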
This is not about Penn per se; it's about this balance: we want to jump in and use Gen AI on the pain points, but is it ready? Is it safe? You don't want to put all this time and energy into something that's going to fail, but you don't want to be on the sidelines forever while other places are benefiting. So it's awkward: how risk-averse do you want to be in your leadership approach? We started with the questions: is it safe? Is it useful? Does it actually make us more efficient?

Let me give you a real example now. That earlier one with the pain management question was in a sort of ideal environment. This is what real examples look like. Here's an actual patient message; this is OK to share, it's de-identified. Dear Dr. F, hope you are well. Will you give me a prescription for a blood test, well, they wrote blood tests, so that we can review it during our annual checkup on Monday? We know what the patient's trying to do here. We have our endocrine appointment next Monday, and the patient's trying to be efficient. Rather than see the doctor, have the doctor say we're going to check a variety of labs, and then get a call or message with results later, this person is saying: can you call ahead to LabCorp, Quest, something like that, and I'll get the blood work done, so that when we meet, we'll actually have more data to talk about. Makes sense, right?

Here's the actual response, generative AI trying to make our lives easier on these millions of messages. The patient is in black font; generative AI is in blue. Dear Kay, I hope you're doing well. That's nice; I don't always start with that. I'd probably just jump into the answer, but that's nice to say. Thank you for reaching out about the blood test. Very nice. We're very polite. That's good. I can certainly provide you with a prescription for the blood test. Not bad. During our annual checkup appointment, Monday, September... wait a minute. Timeout. Didn't the patient want the blood test ahead of time? Wasn't that the whole point of reaching out? I want the blood test ahead of time, so when I meet with my endocrine doc, I can go over the results. We're missing the forest for the trees here. I can certainly provide you with a prescription for the blood test during our annual checkup: if it's a fingerstick blood glucose, fine; anything else, you're probably not going to be able to do on the spot. We can review the results together during our appointment. No, you can't. It just doesn't work like that, unfortunately. If you have any further questions or concerns, please don't hesitate to ask. Polite, but, OK. Looking forward to seeing you on September 11th. Best regards, Dr. F. It's polite. It's nice. It's not that good, though, right? That's not good. We're already so busy; now you've got to read this, figure out what it's saying, and delete the whole thing or edit it. So this is not a good example. It's not a particularly bad example either, by the way; it's middle of the road, maybe tending toward bad.

OK, here's another one. Another patient message: Dear Dr. M, is lightheadedness a side effect of the carvedilol prescription I started last week? D. Reasonable question. We know the answer to this. And this next one is not a bad answer.
Dear D. Lightheadedness, near fainting, feeling like your heart might skip a beat, especially when standing quickly, are known side effects of Coreg. You know, it slipped in a trade name. This is to be expected, but I'm sorry you're experiencing some of them, exclamation point. This medication is used to treat your high blood pressure, which is very important. Friendly reminder: see you on November 30th. Looking forward to it. Let me know if anything comes up. That response is not bad, actually. We ask people who work on this, and someone once described it as being like a good intern. I was like, OK, that's a pretty good way of saying it.

OK, let's scale up. I have other examples, and we're getting some data. And again, my point isn't about this specific use of Gen AI. It's that there are system-level strategies, whether you work at a healthcare system, at a smaller group, or solo; there are ways we're starting to use Gen AI in our regular practice. It needs to be a part of our workflow, which this would be, and it needs to be good. Not just safe: good. So far this has been safe; I'm not sure how good it's been, and I'll get into more detail. This is a lot of what Gen AI looks like; we're going to see a lot of that.

So here's more from our rollout. I don't even know if it's a pilot, because it's not at scale, but we haven't rolled it out to everyone at Penn. Since we went live, as one of the earlier organizations, there have actually been over 150,000 generated draft replies. They get used 35% of the time: 30% with edits, 5% without edits. Most other early adopters see lower rates. We put a lot of time and energy into this; we're not particularly smarter or better at it, we just really focused on it, so we had higher use. We tried to really build it into our workflow, and that's what makes it more useful: naturally baking it into your workflow. And it's not just about this messaging. Anytime we're going to use Gen AI in the future, it should largely be part of your workflows. You can learn new things, of course, but if it's a total deviation from what you're used to, it's probably not going to be adopted well. Interestingly, talking to the people who use it, and we have lots of docs and nurses using it, 95% want to continue using it; they find it helpful, not burdensome, and safe. It just might be dopey at times. I'll give you more specifics here. So here are some highlights: it's good for simple questions, and the piloters, again, say it's safe.
We did not observe it perpetuating bias. And unused messages are still valuable, because there's cognitive offloading. Unused messages meaning this: you see a draft, you decide it's not what you want, you delete the whole thing, and you write the message yourself. Sometimes it's still valuable, and I'll give you an example of how.

Here are the main problems, and we've seen this nationally; this is not just a Penn observation, this comes from a lot of the folks who use it. The lowlights are that it has an inauthentic voice. Let me give you an example. There's a faculty member who's been at Penn for a very long time. A patient writes: it's fall right now, can you help me get a flu shot? Maybe they're asking their endocrinologist when that's really a family doc question, probably is, but let's just use it as an example: can I get a flu shot? His response would be: likely yes; how about next Thursday? We have a vaccine clinic in the evening. What ChatGPT responds is: thank you so much for asking about the vaccine, exclamation point. It is indeed important to prevent respiratory tract viruses with the use of vaccines. They have been proven to be safe. You can call the office to schedule. He said: I don't sound like that. There are a lot of exclamation points, and while that might be nicer, patients know me; they've known me for a long time. If they see me start using all this flowery speech, as he described it, with exclamation points, they're going to say, that's not my doctor; that's not what he sounds like.

And there are two ways to look at this. It's an interesting academic point, because there was a paper that came out, you might be familiar with it, where AI was rated more compassionate than human doctors. If you read into the details of that study, it's not as binary as that. For a brand-new patient, there might be room for this flowery speech. For established patients, people you've known for 20 years, you might write a little more condensed and quickly, and while the AI might sound flowery, it doesn't sound authentic. Now go back to the patient's experience of this, because we've been talking about offloading doctors; let's talk about the patient experience. If something sounds inauthentic, that's probably a problem too. Now, some organizations have said it doesn't matter, we're going to take it even if it sounds inauthentic, because there are so many patient messages, and the market is moving in a direction. I'm not saying it should be that way or not, but that is an observation.

The second thing is, it's very verbose. The average doctor response is short: one sentence to maybe a paragraph. Sure, maybe you have a very educated patient with sophisticated medical issues and you want to write them a really thorough response; that's fine. But on the whole, the responses are much longer when AI is used.

OK, the day is almost over, and you're about halfway into my talk. You made it. If you're multitasking, just listening in while doing something else, I get it; that makes total sense. You're educating yourself, and that's the way to get educated. So let me describe this notion of cognitive offloading, and there's a point to these pictures of helping a dad like me draw a cat. Here are my kids. I've got four kids; my wife's a doc too. This is a lot of kids, by the way; it might be one too many. Anyway, I have boy, boy, boy, and then a girl. That's Scarlet.
Scarlet Moon, bottom left. You can guess her favorite color: it's pink. She's at her brothers' baseball games; forget them, we're talking about Scarlet. My two-year-old Scarlet asks: Dad, can you draw a cat? Can you draw a pink cat? I was like, I can draw a pink cat. I'm a good dad; I can draw a pink cat. I get a pink crayon and the construction paper, and I think I can do this. But I didn't just jump right in, and I'm not good at drawing, by the way, but she's two, so I can do it. Instead of jumping into drawing the cat, I Google-imaged "pink cartoon cat," and here's what came up. The first one, top left: no, that one's too hard. The next one: that's hard too. The third one: I know, Pink Panther, I grew up with him; getting there. That fourth one: circle face, triangle ears. I can do this. Dad can do this. So I looked at all these cats, and when it was time, I grabbed the pink crayon and drew a pink cat. I didn't actually print any of these; that's not what Scarlet wanted. She wanted dad to draw a pink cartoon cat.

There's relevance to this and generative AI. It's interesting, because what we've seen is that when generative AI generates a message, a lot of times it doesn't get used by some people, but they got ideas. They saw what it generated. Say they ask, what are the side effects of carvedilol, and they see the top three side effects, and they go, OK, and they don't include all of that. Did they really need Gen AI to tell them that? No. Does it really know more than they do? No. But when you're busy, when you're tired, when it's pajama time and you're doing all these messages, it kind of offloads your work, and in your mind's eye the internet kind of helped you write your response. Similarly, when I drew a pink cat, it kind of looked like those cats. Which is an interesting thing to observe: I think doctors are affected by this kind of thing too, and we don't know the full implications of it. When Gen AI starts to create summaries of our work, when it makes it easier to find the bone density in the chart, I'm not saying this is necessarily a bad thing, but how much will our final thinking be shaped by what's presented to us? We don't know. We're investigating.

OK, some more stories to go with the data. Well-crafted, it works. When someone says, can I get a lispro refill, it does a good job. It's good at answering specific questions, like: how long until my shingles rash goes away? Patient message quality matters. Here's one where there's no chance for Gen AI to help: my family all has COVID, I'm starting a sore throat, do I test, how long are they contagious, and I just remembered I'm almost out of my GERD meds; can't afford to be sick right now. There's no way it's going to produce a coherent reply to that. This is also why our jobs will never be fully replaced by Gen AI: patients are data sources, and that needs a human filter, and it is not even close to being good at that.

Here's the point I made before about empathy versus authenticity. It does sound more polite than providers. Maybe that's empathetic; I'm a little skeptical about using that word, but it definitely strikes patients as inauthentic if they have a 25-year relationship with their endocrinologist, they know what the doctor sounds like in person, on the phone, maybe through patient messages, and suddenly the doctor sounds like someone else with flowery speech.

What's interesting, and here's where it's starting to get interesting: nurses plus AI are starting to siphon messages off of the pools more, because they're better equipped for it,
and maybe doctors answer fewer questions now; we're starting to get good results there. There's a questionable opportunity for medical assistants to use these tools.

OK, let's talk about some non-AI stuff that we've learned, because this matters too. When I say we, I don't just mean Penn; I mean the literature out there on one of the largest applications of Gen AI in clinical practice, again, using ChatGPT to answer patient messages. Number one: non-AI workflows matter. If you have an in-basket, or inbox, or message routing, or even phone calls, just like 25 years ago when patients made phone calls, you probably had algorithms. Do all phone calls need to go to the doctor? If someone's calling to say, where do I park, I'm a new patient, does that really need to go to the attending? No. There were certain decision trees where not all patient questions needed to be routed to the doctor, and that kind of work can actually have major impact on this sore spot.

There's almost no value proposition to attendings for making diagnoses or answering their questions; almost none of the docs ever used a generated draft for that. If an educated patient just came out of the hospital, all their insulin prescriptions were wrong, and they had all these sophisticated questions, there's genuine clinical harm in getting it wrong, and perhaps a rehospitalization avoided if you get it right. Gen AI is not going to solve that; that should be a doctor-level question, and Gen AI is not helping there. But if they're asking simple questions, again, it might help. Another thing: one site lead offered everyone the chance to withdraw from this pilot, and none of them did.

Here's a disappointing thing. The ability to modify the output of the model, it's called prompt engineering, as you've heard talked about, is not as effective as we had hoped. RAG, a future technology you might hear more about, is a different approach.

Providers with our background, as well as experienced nurses, used this successfully and safely; MAs did not. That's almost universal now, across a few places, if you look at the literature, even in the news, about who in the healthcare system, because arguably this could scale to everyone, can use it safely. Most can, but we found, and this is where it gets, not scary, but I'll say it anyway, a little scary, that some of the messages the model generated were so good-sounding, so empathetic, so seemingly accurate, that MAs looked at them and said, perfect, I'm sending it right back to the patient, and they weren't catching mistakes. These mistakes were caught by nurses; they were definitely caught by the attendings; they were caught by APPs; they were caught even by residents; but not by some of the MAs. So, for example, remember that case where someone said, can I get my blood test ahead of time, so when I meet with my endocrine doc I can go over the results, and Gen AI basically said, I'll give you a prescription during the visit, we'll get the results during the visit, we'll talk about it during our visit? That might, unfortunately, have been sent directly to the patient without any editing if this tool is in the wrong hands. So that's the way in which it's unsafe: you've got to have the right people.
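As an aside, the non-AI routing idea mentioned above, decision trees that keep simple questions away from the attending, can be sketched in a few lines. The pools and keywords here are invented for illustration; a real system's rules would be far richer.

```python
def route_message(message: str) -> str:
    """Toy triage: decide which pool a portal message lands in before any
    clinician sees it. Pools and keywords are invented for illustration."""
    text = message.lower()
    if any(k in text for k in ("where do i park", "directions", "reschedule")):
        return "front_desk"         # logistics never need the attending
    if any(k in text for k in ("refill", "prior auth", "form", "pharmacy")):
        return "nurse_pool"         # protocolized work a nurse can close out
    if any(k in text for k in ("new symptom", "side effect", "insulin dose")):
        return "physician_or_app"   # clinical judgment required
    return "nurse_pool"             # default: triage nurse decides escalation
```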
OK, this is a busy slide, but this is, I think, where it gets even more interesting: who might use it? What is the character, the phenotype, of the person who might use this? This is the summary data at Penn, and we're not unique in this; we'll be publishing this soon on greater than a year of use. I know it's busy and it's all over the place, and that's actually the point. Each circle here is a person, a piloter who's been using it. The size of the circle is how many generated draft messages they've had, and the x-axis shows how much they're using it or not. Way left, they're not into it; it's not their jam; that's OK; a usage rate under 10%. Far right, we have a lot of usage. Again, as with my silly cat example, unused messages can still be valuable because they give you an idea of what you might say; you don't have to use the draft, and some people said that was their favorite part.

Look in particular at the middle row; it's labeled Penn Family Medicine Voorhees, if you can see that. Right smack in the middle there are four circles: red, green, yellow, red. These are four attendings: three of them don't even use it, and one guy loves it. And what's interesting is that the person who loves it is not the most technical person in the group, by the way; it just fits his workflow. He actually said: I'm desperate for relief from these patient messages. It's not that I think Gen AI is cool; no, I don't think it's cool at all. But I'm desperate and drowning in messages, so this is a lifeline to me. Which is interesting, actually. You can't exactly predict who's going to use it. You might predict, say, that medical assistants would want it the most, but even among the docs, some like it and some don't.

Here's another way of looking at this: the proportions across three different groups, people who use it a lot, people who kind of use it, and people who almost never use it. About a fifth to a quarter of the people use it on 50 to 100% of messages; most are somewhere in the middle. And by the way, I want to go back to this; I forgot to point out one more thing. This dotted black line in the middle is our average, and it's not a bell curve; that was one of the points. Usage doesn't cluster around the mean; the distribution looks roughly uniform, and there aren't even that many people close to the average. This was a surprise. We thought, OK, maybe 35% of messages will be used as a whole, but within one practice it can vary greatly.

OK, almost finished; wrapping up here. I want to show you how people change these drafts, and also how impressive it can be. As I said, there's promise to this, but a lot to be desired. You can measure what's called a Levenshtein distance, which captures how much editing gets done between what the Gen AI drafted and the final thing sent to the patient. Usage rate alone is not necessarily a good measure, but this sometimes is: if the final message is completely different, you might think, well, that draft wasn't helpful; if it's almost the same, you'd say that one was probably a useful draft. For example, in the left column, I'm not going to read the whole thing, but AI generated a message that says, I'll just read the beginning: dear patient, I'm really sorry to hear that you're feeling so unwell; your symptoms sound severe; it's important you get medical attention as soon as possible; please go to the ER. The final sentence, in the light gray font, was added automatically; this was where some of the prompt engineering was helpful.
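The Levenshtein distance mentioned above is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another; applied here, it measures how far the sent message drifted from the AI draft. A plain-Python sketch of the standard dynamic-programming calculation, with a normalized score added for illustration:

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep b as the shorter string (smaller rows)
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (0 cost if equal)
            ))
        prev = curr
    return prev[-1]

def similarity(draft: str, final: str) -> float:
    """Rough 'how much of the draft survived' score between 0 and 1."""
    return 1 - levenshtein(draft, final) / max(len(draft), len(final), 1)
```

By this measure, the ER example that follows, where the nurse deleted only one sentence, would score near 1, while the heavily red-lined vitamin D example would score much lower.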
We had the model append this sentence: this message was auto-generated, then carefully reviewed and revised for accuracy. And everything was accurate, according to this experienced nurse, whose only change was to delete that final sentence and send it to the patient. Which is fine, by the way; you don't have to keep that caveat, because there's a human in the loop, and she opted not to keep it. That would arguably be a good use. Whereas this other one is a vitamin D question. You don't need to read everything, but the red font on the right was added by the human, and at that point you might think, that's so much editing I'd almost want to start from scratch. But it's your call. That's what's nice about this: if you build it into your workflow, you can use it or not.

OK, let's scale up to the big picture: is there a better future for this? Again, I give this a B minus, C plus. Some of you may think this is way worse, why is he grading it that high; some of you may say, I'm drowning, I'm desperate, I'd give it a higher grade. It's not that great at scale. But what's interesting is that Gen AI improves with fine-tuning, and the cost decreases incredibly. And here are different approaches at different organizations. CHOP, which is right next door to where I work, is actually scaling to its entire ambulatory care enterprise, all roles. It's not that it's so good; it's that they're so overwhelmed by these messages that it's worth it.

Here are some of the bigger publications on this. This was an article asking, essentially: what was the adoption of, and clinician experience with, deployment of this large language model in clinical practice? How did it go? And the findings suggest that the use of large language models in clinical workflows was spontaneously adopted, usable, and improved clinician well-being. Did we see the numbers, remembering this is 50 seconds per message on average, go down? They did not observe that, though there are different ways to quantify value; it would be nice to see it. And here's another one: Mayo Clinic is doing it for all nurses, because their workflow, again, workflow, workflow, workflow, is that most patient messages are handled by nurses. So they're deploying it not to all the docs, but to all the nurses, and that's what's considered valuable there.

So this is new. I'm not giving you data from five years ago, which would be fine, but this is from this year, and there's a lot coming out. If you pay attention to some of our top journals, the big societies' publications, the New England Journal, JAMA, things like that, there is a hunger for this, so you're going to see more coming out on it. Some places are scaling; some places have turned this off. I have colleagues at MGH, a CMIO colleague at Harvard, who said: we tried it; we got about a 6% usage rate; we're probably going to turn it off. It doesn't work everywhere. And that's really what I have. I'm happy to answer questions; they can be Gen AI focused, about this, or even other stuff.

So thank you, Dr. Moon, for a really fantastic talk and a great way to end the day, so practical to all of our day-to-day lives and thinking about the burden of messaging with our patients. I'm Suman Srinivasan, I'm an endocrinologist at Mass General, and I'll be helping moderate the Q&A today.
So we have some comments and questions coming in, and I encourage others to please put your questions in the chat for Dr. Moon. Our first comment-question is from Freddie, who says: our system requires every patient to sign up for the messaging system; therefore we receive so many messages; we created the problem on day one. Is there a law that mandates the messaging system?

Not at all. We created the problem, Freddie. There is no law necessitating it. There are some state laws that say your results need to be immediately available, but what counts as immediately available is, honestly, a subject for debate among attorneys, not folks like us. And that was the original creation: oh, you can log on and see your results. Then messages went wild. It's kind of like BPAs and pop-ups; Cerner has versions of this and Epic has this as well; they were supposed to be few, and now there are far too many. There is no law that mandates it, but the thought is that patients like it, and you can message your doc after hours, so maybe that's good. But there are some patients, maybe you have them too, who message their doctors every single day. Isn't that unbelievable? Even if they're fine, they just write: hey, today was a good day, my sugars have been between 90 and 110. You're kind of like, great. Can you believe that? Every day, so hundreds of messages. So generally it was created with good intent, but it's probably overdone. It's millions at one institution.

All right, and then Amira says: thank you so much for a really nice talk. Even if the question is simple, to answer it correctly the AI needs to know the patient's clinical background. How will AI address this?

This is a great point, because you'd think, if someone said, my hemoglobin A1c test is 12.0, I Googled that and it's not good, is that where I normally am?, the model would look at their chart. For this specific application, it does not look at the detailed data points of the patient. That's a big failure on its part. The thing is, the technology could; you could say, I want you to look at the patient's information right now. But generally, this application does not. So it handles only very general questions, and its usefulness is markedly diluted. To answer your question, how does AI address this: AI can, and I think it will, but it does not in its current, sort of watered-down state.

I guess as a follow-up comment-question to that: you mentioned this is really being used now for just simpler questions. Do you envision that this will be used for more complex questions that may need the patient's clinical background, and what needs to be done to refine these algorithms and the machine learning that needs to take place?

Two things; great question, right, like how is it really going to be helpful? Number one is about law, not even clinical practice. What if a model gave bad advice that then hit the patient? Who's responsible for bad advice with a bad clinical outcome? Are we medically responsible? If Dr. Moon sent a message and patient harm happened, could I say, no, no, no, the model generated it, it's not my fault? How do you indemnify yourself on this? It's unclear right now. We do know that in all circumstances the doctor's on the hook; but is the doctor on the hook along with the software companies?
You would think so, but it's not a clear answer. If an orthopedic surgeon puts in what he or she thinks is a sterile device and the patient gets an infection, the surgeon could say, it's not my fault, I assumed this was safe and I knew how to use it. And device companies can be, and have been, successfully sued when, I'm simplifying, contaminated hardware was implanted in a patient. So similarly: up to what point, how good does the model need to be, before it's medicolegally on the hook, or before our organizations have to say, you're indemnifying yourself, it's all on us? And are we even at increased risk because we might be a little biased in our answers, like I said, cognitive offloading? So that's the legal piece.

How do we get it better, though? There's, again, a technical term called RAG, retrieval-augmented generation. With RAG you would say: when the patient is asking about their hemoglobin, or asking clinical questions, I want you to look disproportionately at endocrine texts and the journal articles that come out of the Endocrine Society. You deliberately weight the response of the model toward better sources of truth than the whole internet, because the whole internet might have some weird, inaccurate, or watered-down stuff. The model doesn't think; it just looks at a body of text and generates what's most probable, and if it only looked at evidence-based data, it would probably generate better output. This is how it's probably going to get better in the future.
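A minimal sketch of the RAG approach just described: embed a curated endocrine corpus, retrieve the passages most similar to the patient's question, and ground the model's draft in those passages instead of the open internet. The model names, corpus contents, and prompt wording are placeholders, assuming the OpenAI Python client.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of texts into vectors for similarity search."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# Curated corpus: endocrine textbook passages, society guidelines, journal
# abstracts. Two placeholders here; a real system would index thousands.
corpus = [
    "Short-acting insulin analogs (lispro, aspart) are typically dosed ...",
    "Carvedilol commonly causes dizziness and lightheadedness ...",
]
corpus_vecs = embed(corpus)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k corpus passages most similar (by cosine) to the question."""
    q = embed([question])[0]
    sims = corpus_vecs @ q / (
        np.linalg.norm(corpus_vecs, axis=1) * np.linalg.norm(q)
    )
    return [corpus[i] for i in np.argsort(sims)[::-1][:k]]

def rag_draft(question: str) -> str:
    """Draft an answer grounded in retrieved passages, not the open internet."""
    context = "\n\n".join(retrieve(question))
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Answer using only the reference passages provided."},
            {"role": "user",
             "content": f"References:\n{context}\n\nPatient question: {question}"},
        ],
    )
    return resp.choices[0].message.content
```

The key design point is the retrieval step: the model's probable next words are anchored to vetted sources, which is exactly the weighting toward evidence-based text described above.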
Definitely a lot of ethical and legal implications going forward. Usai says: thank you for sharing the real challenge, like your friend being burnt out. The information technology industry is making great financial progress; however, the system is really putting stress on nurses and physicians. Am I wrong that they are not doing enough to help providers cut out pajama time? Meaning, I interpret, is there more burden that accompanies us in off hours, and how do we address this dilemma?

I agree, and I'll put it more bluntly: I do believe your employer is partially to blame for this kind of thing. The patient is the customer, and I'm not a guy who calls patients customers, I'm clinical, but this is how it's often thought about. The customer likes patient messaging, and indeed there are times it's the right thing. Messaging your endocrinologist every day? Of course not. So the customer likes the thing, and we're just expected to go along with it, but we don't get reimbursed for it. And you go, well, my RVUs per hour aren't increasing; these are not new patient visits; all the metrics I'm held to don't count this. What about answering 50 messages a week; shouldn't that count for a point, or however your metrics work? I fully agree with you. The thing is, this patient messaging has taken off exponentially, and a lot of administrators did not even recognize how much of it there was. The EMRs have buried how much messaging occurs; it is not at your fingertips like you'd think it would be, and it's not in a dashboard that operations people see much. I think they're starting to see: oh my, there are way more messages than we thought.

It's like one of those questions, asking them: do you think there are 100,000 messages a year, or a million, or 5 million? My experience as a CMIO, because I'm in the world of both healthcare system leadership and the technology boots-on-the-ground people, is that leadership is fairly clueless about how much there is. So when you ask, am I wrong that they are not doing enough to help providers: you're right. I don't think they've recognized how much work and burden it is. One response is to say, well, the doctor shouldn't do it, push it onto nursing, but that's more FTEs, and they don't understand the ROI for that kind of thing. So that's why there is perhaps an interest in technology trying to solve this, when actually there probably need to be better workflows for it.

And Usai follows up: for the endocrinologist, as you alluded to, to answer some of these questions we end up spending a lot of time reviewing journals and such.

Yeah, I know, I'm preaching to the choir, but I'm with you. Technology created this beast, and hopefully technology can solve it. Can you imagine: have you ever met a professional attorney, who's not a friend or a colleague, who works for free? They have professional expertise, and if you ask them questions, they charge; a financial advisor, even an accountant, they charge. If you're asking an endocrinologist, that's very high-level training that endocrinologists do; this should arguably be paid time, other industries do it, but we don't; that's how medicine treats this stuff. And believe me, I hope Gen AI and other tools can help. It's not that people are excited about Gen AI; it's that they're desperate and drowning with this kind of stuff, and maybe it can help. Maybe.

Just another question. We've been talking about the provider perspective; going back to the patient perspective again, you talked about this balance of the messages being very empathetic versus what the provider would write, but also there may be some inauthenticity to the messages. So what do we know about patient trust in these messages, and do we need to enhance trust? And second to that: do these AI-generated messages make a patient more likely or less likely to use the system?
Yeah, great question. So I'll tell you what patients don't like; I don't know all the answers, because I want to be as evidence-based as we can, and this is still almost like a pilot. Patients who get a message with very flowery speech, and let's say they know they have a Harvard doctor, Dr. Srinivasan, brilliant endocrinologist, and that's who they want to message, and instead they get a message that says, this was generated by Gen AI: they don't like that. That we know. They feel like they're talking to a chatbot, and this isn't like trying to do an Amazon return; this is their health, and they think these big organizations can do better than that. So organizations are stuck with: do we do it anyway? CHOP is; CHOP is just going for it, because they're saying there are too many patient messages, and leadership is on your side there. The pushback is: by keeping a human in the loop, you actually edit this, and you can remove that disclaimer. And if the disclaimer is removed, as in that one example I showed where there was a 98% overlap, patients are unaware; they don't even know, but it has been reviewed.

One of the arguments I've heard is that, back in the day, a blood count was done by a physician: you looked under a microscope and counted the number of white blood cells, and that's how you knew the patient had leukocytosis. We're so far from that; it's an automated machine now, and in fact it's more accurate than I or any of us would ever be. What used to be a manual human process is now done by a machine, and the thought is that something like that is occurring here too. It's different, though, because this is actually imitating a human, versus automating the quantification of red cells, white cells, platelets, et cetera. So patients are honestly not fully aware. What was the second question?

So, how do patients feel about it; are they more likely or less likely to use the system based on that?

We've wondered if it's going to increase phone calls: that patients will go, oh, I've got to talk to a chatbot, I'm calling the office then. But then it's, welcome to Penn Medicine, to schedule your COVID shot..., and you're on hold for 10 minutes, so that's a bad experience too. So I'm not sure what the effect is. But we do know that removing the disclaimer makes it invisible to patients, and we don't know what else they'd do with that. On the inauthentic voice, by the way, here's a story about this. A very terse, old-school gentleman physician in his late 60s actually had a patient ask, is he on drugs? He never types like that, with all the exclamation points. The patient, who'd known the physician for a long time, said: is he on drugs, or is his granddaughter answering the questions for him, with all those exclamation points? He never talks or writes like that. So the patients could tell. And he brought that up and said, I don't want patients thinking that about me.

And what about, a lot of us use dictation devices for our clinical notes; is there any move to use this AI application in clinical notes as well, aside from messaging?
So you've probably heard about ambient listening, which is a very vendor-led thing: you don't type on the computer, and you don't have a human scribe with you, but your phone scribes the conversation. There's some of that, and honestly, my personal opinion is that the vendors are overstating how good and valuable it is; someone once described it to me as promising home runs that are really singles and doubles. Where it actually helps doctors in a more welcome way, from what I've seen, is when you write your note and you're about to sign it, imagine something like spell check in Word: it flags something, you hover over it, and you choose the right thing. What they're doing is embedding billing knowledge in a large language model, so it can look at your note and say: you said the patient has uncontrolled diabetes, but if you specified more details about it, you could actually increase the RVUs of the encounter. And it's a gentle nudge. It's not a BPA; it doesn't stop what you're doing; it's not an alert; you can hover over it and very quickly improve the quality of your documentation so that you earn more RVUs. It's work you've already done; you just didn't feel like typing it. That honestly is very good, because there's no clinical risk, no patient harm, patients are largely unaware of it, and it even expands your acronyms and things like that. That's pretty good, actually. I'm optimistic about billing nudges. So if you see a company that says, we can help you with billing nudges via AI, I actually think there's probably something good there, because it's hard to memorize that stuff, right? Every year CMS changes the rules; software could offload some of that. Anything to help reduce burden.

All right, well, I think that concludes... oh, sorry, one more question. Let's see: Usai is commenting, we need AI to keep track of our time, including researching patient data and literature reviews.

Summarizing, yes. I fully agree with that; it's incredible when done right. I'm going to make this up: here are my favorite top 10 journals; I don't have time to read December 2024; can you condense that to a summary, endocrine-only, for me to read? And if I want to read more, jump me to the article itself; you click on it and it takes you there. That it's going to do well, because it's good at summarizing text. And that's what you're saying; I agree with you. I think that's a great application. I haven't fully seen it yet, but imagine you customize it to your own interests. Yeah, good call.

Yeah, it's really great to see where the field may be heading. So thank you again, Dr. Moon; that was a really informative talk, really great to see all the pilot data, and we look forward to seeing where this takes us. And so we'll conclude our session today. Thank you everyone for joining us at the virtual summit today.
Video Summary
Dr. Jeffrey Moon discusses generative AI for managing patient messages, highlighting its potential to alleviate the burden faced by physicians. He shares a personal anecdote about a dermatologist friend who left clinical practice due to exhaustion from handling these messages. At institutions like the University of Pennsylvania, where Dr. Moon serves as an assistant chief medical information officer, patient messages have surged, exceeding 6 million in 2023. Generative AI offers a partial solution by drafting responses to patient inquiries, which are then reviewed by healthcare professionals.

Dr. Moon notes that while AI can address simple queries, its responses can be inauthentic or verbose. He compares the technology to early iterations of software like Microsoft Word, emphasizing that generative AI is still evolving. Trials at Penn showed that it partially eased the workload, but wide differences in adoption across departments suggest a mixed reception among healthcare providers.

Patients' reactions to AI-generated messages, particularly concerning authenticity and trust, remain a concern. Dr. Moon anticipates that as AI develops and integrates with healthcare systems, it will better address complex clinical queries by drawing on domain-specific knowledge sources.
Asset Subtitle
Jeffrey Moon, MD MPH
Assistant Chief Medical Information Officer,
University of Pennsylvania
Keywords
generative AI
patient messages
healthcare
physician burden
University of Pennsylvania
AI adoption
authenticity
clinical practice
healthcare technology