AI in Healthcare Virtual Summit Session Recordings
AI for Healthcare in Action
Video Transcription
Okay, hello everyone, good morning. Gabriela, please, can you confirm you can hear me clearly? Yes. Perfect. Hello everyone, good morning. My name is Wura Ola Oyewusi. Welcome to the AI and Healthcare Virtual Summit. And I'm delighted to have you in my session. In this session, we are going to be talking about AI for healthcare in action. What you need to know as a decision maker. To give you a context of this topic, I should mention that my background is in pharmacy and I began my career as a clinical pharmacist before delving into data science and AI and research. How we are thinking about this session is that everybody is talking about AI, some of the ways you can get started, some of the ways it's possible. So as clinicians, there's a part where do you want to become a coder? That means you go all the way, you learn how to code. You learn how to implement your own AI models. You learn how to, you know, whatever you're thinking of that you think it's possible with machine learning and AI, you want to do that. But how we are thinking about this session is that your role as a decision maker, you're a clinician, you're somewhere in the health ecosystem. And while the list of all the ways that you can interact with AI systems, machine learning powered systems, while the list is exhaustive, in this discussion, in this webinar, we are going to be focusing on three key ways. We have the end users. In the context of the end users, that means there's been a tool, the decision has been made and it has come to you, whether in clinic, for research, they said this tool is powered by machine learning, is powered by AI. So you're an end user there, or maybe you teach in college and they said, these are some of the AI tools that we want to use now. And so in that instance, you're an end user and your role is pivotal as an end user. And then there's a second part where you're a domain expert. That means that for some reason, maybe some of your friends, they're working on those big AI projects. Maybe they are computer scientists, maybe they're just geeks, maybe it's just a company, but your role there is that you've been invited as someone with clinical background, as a clinician, as an expert, to come and do this work with them. And then the third is AI governance leaders. That means that this is at the place of decision-making. For example, you're on a panel or you're a part of an IRB panel for a particular product, or you work with the FDA, you work in regulatory, and now you have to make decisions about AI systems. So while the list of how you can interact with AI system is exhaustive, we are considering it from those three lights. My desire as we are having this webinar is that we interact, we have great conversations, and you're more aware of all the ways you're already in the AI pipeline. Some people really want to be there, some people do not want to be there, but the truth is that the world has changed, and now that we are here, what do we do? There are three key ways to think about AI and machine learning. While there's the rave, where it feels like this is magic, this has never been done before, we've done it with AI. When you're thinking about AI systems, when you have to make decisions about AI machine learning system, I want you to think of them as just part of software. In fact, from my experience, teams who think of AI as something magical, they don't tend to do as much as people who realize that this is part of software. So this is part of software. 
It's more complicated because now there are a lot more variables. It could be a generative AI pipeline and all of that, but at the core, the teams that succeed in implementing AI, whichever angle they are coming from, are the ones that know and treat it as part of software. What I mean by part of software is that the mere fact that something is AI does not remove any of the requirements and typical maintenance of a software system. You still need to deploy; your model has to live somewhere, on servers. In production you have issues like your model degrading: when you first trained it, it was working well, the evaluation metrics passed, and then about six months later it is no longer that optimal. That is peculiar to AI systems, because if there is a shifting pattern in the data coming in, it also changes the behaviour of the model. So if your hospital or research centre is implementing this kind of software, I would like to prepare your mind that these are some of the things that happen. Of course there are MLOps practices, systems put in place for exactly this. But in summary, one of the key ways to think about AI is that it is part of software. The second is learning algorithms. If we say there are learning algorithms, that means there are algorithms that are not learners, and this is one of the most interesting things about AI. Again, there is hype, but there are truly some interesting things going on. Learning algorithms are typically divided into about three types. There is supervised learning, where you have to label your training data, and this is very common in healthcare, because except maybe for clustering or some exploratory data analysis, you want your model to learn from something real and genuine. This is another place clinicians come in, and where domain experts come in: no matter how smart people in computing are, they cannot just figure out what is out of range or what does not work clinically. The fastest summary of learning algorithms is this: we have a subset of data, we present the inputs together with the end results as training data, and we put a machine learning algorithm in between. We are not the ones writing out the formula; simply by being shown enough data, the algorithm learns the pattern, that every time it has seen this type of input, this is the type of output that followed, and it works out the computation in between. That is really interesting, because with non-learning algorithms you have to instruct the computer at each step exactly what to do. And then the third thing: why AI is suddenly all over the place. This is not a deep history, but there was a time called the AI winter, when people had been talking about AI for years but there was not a lot of research or breakthrough going on. If you follow the news now, something can come out on a Monday from a big tech company and before the end of the day another AI tool has come out. Why is this happening? We have data and compute. Some things have also changed since the pandemic; we have far more online meetings. I'm speaking to you from Manchester in the United Kingdom.
Maybe previously I would have had to make a trip, but now, as we sit in this meeting, we are generating data. If this platform were looking for data on people speaking in webinars, there it is: we simply have more data, and we have more tools. And, this is a personal view, but I think people are also more comfortable interacting with AI systems. People are using chatbots even in their group chats. I remember, just five or six years ago, when some of these generative AI tools had no interface for people who are not code experts to try them, I used to wonder, do people really want an AI system on their phone? But interestingly, people do want the AI system on their phone. People are talking to it, people are willingly giving data, and we are definitely online, people are tweeting. So we just have much more data than we used to have. And then compute is cheaper. Let me give an example of what I mean. On most people's laptops you have a central processing unit, a CPU. For some of these machine learning algorithms you want a GPU, and people who game can see what GPUs do: with large images, a system with a GPU renders faster without stalling everything else. Because we have more GPUs accessible, people can do much more, much faster. If I were setting up a local GPU on my own computer, it would take a while. But now we have tools with virtual GPU support, where I just go to a website, click on the runtime, and I have a GPU to run my computation. So again, I want us to have this baseline about AI systems: they are part of software; learning algorithms are real, they are interesting, and they are part of why we can do much more; and then there is access to data and compute. I should also let people who work in research know that, yes, the bigger the volume of your data, the more compute you need, but many times, for iterations, for basic research, and because of the interesting properties of learning algorithms, you can do a great deal even without big computing facilities. So, as an end user, and I have explained what I mean by an end user, these are some key considerations for what you may encounter. As an end user, you want to know whether your tool is useful. With AI systems there are often proposals like, maybe we should ban students from using any generative AI tool. The truth about useful tools is that people will use them. Even if you tell all clinical staff, and I'm trying not to mention any particular tool, do not use this particular AI chatbot, people will use it, because it is useful. So the conversation must shift from "we think this may be harmful, we think this may be inaccurate", which is true, to "how can we make this work for us?" If a tool is useful, people are going to use it. Also as an end user, there is usability. I was thinking about the difference between a tool being useful and being usable. One of the interesting things about AI systems is that something can work in one context and not particularly work for your context. One way to explain that: there are large language models trained on English-language data. We know what they do.
We know that if you're trying to do auto-completion, they can do that for you. But remember, they are trained on English-language data. If what you're trying to do is in another language and you have not fine-tuned on that language, it is simply not usable. Even though you know it works, that it is just numbers and algorithms, if it has not been fine-tuned for your use case, it is not usable for you. And then the third one, which is an important one in healthcare, is reliability. Maybe I'm biased, but I think healthcare is miles ahead here; obviously it is people's lives, so it is miles ahead in determining the reliability of tools, the reliability of a medication, before giving it to people. So now we're going to look at two hypothetical use cases, and I want to ask the audience to type in the chat window. Let's say we want to use an AI system to highlight and extract key laboratory data from unstructured clinical notes. I should give context to what unstructured means here. Structured data is usually data that sits in tables as numbers, which AI algorithms can work with directly. So let's say we have clinical notes and a big research project, maybe about diabetes, and we want to extract laboratory values. If we had just 10, or 100, or even 1,000 notes, it may not be a big deal; you could ask lab assistants to take a couple of hundred each. But let's say we have 100,000, or a million, notes and we want to extract the laboratory values from each of them, and we want to use the tools we've all heard about, like ChatGPT, Gemini, or Copilot. Let's say that is what we are using in this case as an end user. What are the important questions you think we should be asking? I'm going to switch to the live view and see what you have to say. So, our first hypothetical use case: we want to extract laboratory values from 100,000 medical notes using some of the readily available generative AI tools. What are the important questions we should be asking as end users in this case? Hopefully there are answers in the chat window. Does anyone want to say? As a clinician, what type of questions would you have? Lena says GFR. Okay, please can you write it in full? It looks like an abbreviation. Nikita thinks we may need to look for the hemoglobin A1c. Thank you. These are some of the values we may be looking to extract. Okay, thank you, Lena. Lena says glomerular filtration rate, and we have hemoglobin A1c. These are interesting laboratory values. Now that we know some lab values we may be trying to extract from many case notes, as a clinician, if someone says we have to use these generative AI tools, what are the things you would be skeptical about? Now we are talking about the tool, not what we are trying to extract. What would need to be proven to you that this works? Yes, thank you, Nikita. Nikita says hallucination, made-up data, which is real.
I'm hoping we get two more responses and then we can go to the next use case. Thank you, Lena and Nikita, you are carrying this session; I don't feel alone. Okay, Lena asks about the ability of the AI to account for the length of diabetes in each individual, that is, how long they have had diabetes? Okay, thank you, Lena. Usha, you said "what they can do that I don't"; I'm not clear about that, so if you would like to write it again, I'll find it interesting. And false alarms, thank you, Igor, because these are valid questions. Then another use case, which I think is similar to this one: let's say we are using AI to support research writing. Some of your research assistants think we should summarize some literature that is long. And again, I understand that in some systems they will say, do not touch generative AI tools, they are just going to make up false things for you. I think the concerns you have already mentioned also apply to this second use case. Usha says we have to look for the button on the software to enter data. Okay, Usha, I think I get you: you're suggesting that even some of these tools are not exactly making the job faster. Is that what you mean? If so, you can reply in the chat window, and I think your example will be relevant to another use case we will talk about. This is interesting, Igor: you said it is difficult to see which parts of the text are or are not AI-generated. It's true; I'm going to talk about synthetic data in another use case, but you're right about that, Igor. So let me just summarize. Yes, Igor, we should. I'm not sure I'm pronouncing your name well. Thank you, Usha; AI people will like that one, because if you are pitching a product, you want to talk about how it will save time. So we have established that we want to extract laboratory values like GFR and A1c from a lot of text notes. To be fair, if it is between 10 and 100 notes, people can just sit down and get it done, but a really big data set is real in practice; say you're running a longitudinal study across several time points. It is one of the places AI shines. And then people raised questions about hallucination, which is true. The interesting thing about hallucination is that many AI systems also tend to sound confident. It is not going to tell you "I'm not sure about this"; many times it will just generate. It is only of late that some of these tools have started adding notes along the lines of "I'm not sure about the data being generated here". And this is one of the really good things about the industry you are in, healthcare: as much as we automate, we need people. In many AI systems that work, there is a human in the loop. So let's just say part of this discussion is me preparing you, just in case you are the human in the loop at that particular time.
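To make this first hypothetical use case more concrete, here is a minimal sketch of what asking a general-purpose model to extract laboratory values from a single note might look like. It assumes the openai Python package and an OpenAI-style chat completions API with a key already configured; the model name, prompt wording, and the note text are purely illustrative, and every extracted value would still need review by that human in the loop, because of the hallucination risk just discussed.

# Minimal sketch: extracting lab values (e.g. HbA1c, eGFR) from one clinical note
# using an OpenAI-style chat completions API. Illustrative only; outputs must be
# reviewed by a clinician (human in the loop) and PII handled per local policy.
import json
from openai import OpenAI  # assumes the openai package and an API key in OPENAI_API_KEY

client = OpenAI()

# Synthetic, illustrative note text.
note = "62F with T2DM. HbA1c 8.2% today, eGFR 54 mL/min/1.73m2. Continue metformin."

prompt = (
    "Extract laboratory values from the clinical note below. "
    "Return only JSON with keys 'hba1c_percent' and 'egfr_ml_min'. "
    "Use null if a value is not explicitly stated. Do not guess.\n\n" + note
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": prompt}],
)

raw = response.choices[0].message.content
print(raw)                 # e.g. {"hba1c_percent": 8.2, "egfr_ml_min": 54}
values = json.loads(raw)   # fails loudly if the model did not return plain JSON

Running a simple deterministic check alongside this on a sample of notes, for example a pattern search for "HbA1c" followed by a number, is one cheap way to spot obviously made-up values.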
So some of the questions I think you should be asking: what is the clinical impact of using this tool? Remember, this is a general tool; we are not sure what data it was trained on. I will also mention some of the ways people mitigate this. For example, I was working on a data set, and I started with the general models designed for English-language text. I should mention that there is quite a lot of research going on in AI and healthcare. My accuracy was only about 0.4 or 0.5, so it made sense for me to keep iterating, and eventually I found a model that had been trained in the context of what I was looking for. Say I was working on data related to diabetes: I found a model trained on diabetes data and my accuracy doubled, from the 0.4 I was getting with the general model to about 0.8. That is what learning is like, and what context does. In many systems the training data behind a model is not always disclosed, but in that pipeline you should be thinking about clinical impact and how relevant the training data is. And then, of course, our key duty as clinicians, whatever we do, is patient safety. The third one, and this is why I liked the comment that it is difficult to enter data into a particular tool, is workflow integration. If we say it is AI, it is supposed to make your job faster, and if it is not well integrated into the pipeline, then as a decision maker, even if you are not an AI expert, you can ask: how does this tool fit into our workflow? Is it going to make our work longer just because we are trying to prove a point? Because sometimes we want to prove a point. And then there is training and adoption: sometimes a tool has a difficult learning curve, so what provision do we have for training and adoption? If we want to use these AI systems, the chatbots and the AI models behind our software, to solve this problem, how ready are we as a team? So these are specific questions I think you should be asking when these decisions come to you, even if you are not an AI person: what is the clinical impact of this tool you want us to use? What about the safety of our patients? For example, in recognizing laboratory values in clinical notes, do you really want to copy and paste your clinical notes into an external tool? Or do we need an internal tool that works specifically for us, with our own interface, where the data is not being sent anywhere outside the system? Rather than simply saying we do not want anything to do with AI and machine learning, if it is going to improve our workflow, what if we have a local model? That is often what is done: the model you are using is not returning data or any inference to an external system; you have an in-house system working on your data. Another option is to use placeholders. For example, you first run all the case notes through a pipeline that removes personally identifiable information, PII, replacing anything identifying, and only then look for the lab values using the tool.
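As a rough sketch of the placeholder idea above: a toy pre-processing step that swaps obvious identifiers for placeholders before a note is allowed anywhere near an external tool. The patterns and the example note are made up and nowhere near a validated de-identification pipeline; the sketch only shows where such a step would sit in the workflow.

# Toy de-identification step: replace obvious identifiers with placeholders
# before any note leaves the local system. Illustrative only; real pipelines
# use validated tools plus human review, not a handful of regexes.
import re

def redact(note: str) -> str:
    note = re.sub(r"\b\d{2}/\d{2}/\d{4}\b", "[DATE]", note)       # dates like 04/12/2024
    note = re.sub(r"\b\d{3}-\d{3}-\d{4}\b", "[PHONE]", note)      # phone numbers
    note = re.sub(r"\bMRN[:\s]*\d+\b", "[MRN]", note)             # medical record numbers
    note = re.sub(r"\bMrs?\.?\s+[A-Z][a-z]+", "[NAME]", note)     # crude name pattern
    return note

note = "Mr. Adeyemi, MRN 448812, seen 04/12/2024. HbA1c 8.2%, eGFR 54."
print(redact(note))   # lab values survive, identifiers become placeholders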
So I just want to let you know that even as an end user, in the way you give your feedback and the way you interact with these systems, these are some of the ways you come in. Because, again, the truth is that these systems are here. Some of them you may not even want personally; it could be your employer trying to get more out of its systems. So these are key questions I think we should be asking to make this work, and there is no one who can give this type of feedback better than the clinical end users. The second way we come in is domain expertise. This time around, they have heard that you are a clinician, that you support this type of workflow, that you administer this type of treatment, and they think they need that expertise to build a useful product. Teams that are serious about building a useful product usually have domain experts, because the truth is that for many people who do not work in healthcare, it is tempting to treat data as just a bunch of numbers. Let me give an instance. Before AI techniques had this much capacity, we used to do more aggressive cleaning of the data. In cleaning your data, you sometimes remove punctuation, and if you are working with English you reduce words to their base form, so that if a text contains "written", "wrote" and "write", you bring all of them down to "write". In all of these pre-processing steps, some techniques lose context. For example, say we are administering potassium to a patient and we are giving Slow K. If I create a pre-processing pipeline that removes all single letters, which works well for many other systems, then in this system we lose the K; maybe someone has written calcium as Ca; maybe people use brand names in clinical notes. And if you then check the text against a model that was not trained on data like that, you are going to lose a lot of context and important information. (There is a small sketch of exactly this after this paragraph.) So this is why teams that are serious about doing great work usually need a domain expert. To be a domain expert on an AI project you definitely do not need to be an AI expert, but one of the places you come in is data quality. We have had so much progress in AI systems in a short time, and there is so much better compute capacity, that the difference between models and their performance is only as good as the data quality. Who knows how to bring the best data? And when we have that data, how are we pre-processing it? There are different standard ways to pre-process, and if, from the computing side, we just follow all the usual steps, we are going to lose a lot of context and important information. So if you are a domain expert on a project as a clinician, even if you do not know anything about AI, these are some of the key considerations you should be asking about. What is the data quality? Where is this data from? What is the distribution of this data? Are we inclusive?
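To make the pre-processing point concrete, here is a small illustration, with a made-up note, of how a generic cleaning step that lowercases, strips punctuation, and drops single-letter tokens quietly destroys clinically meaningful text like "Slow K" or a calcium value.

# Generic text cleaning vs. clinical context: dropping single-letter tokens
# and punctuation is harmless for many corpora but loses meaning in notes.
import re

def generic_clean(text: str) -> str:
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)                 # strip punctuation
    tokens = [t for t in text.split() if len(t) > 1]     # drop single-letter tokens
    return " ".join(tokens)

note = "Started Slow K 600 mg BD; Ca 2.1 mmol/L, repeat U&E tomorrow."
print(generic_clean(note))
# -> "started slow 600 mg bd ca mmol repeat tomorrow"
# 'Slow K' has lost its K, the calcium value 2.1 and its unit are mangled,
# and 'U&E' has disappeared entirely.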
On inclusiveness: you could know that a particular condition is more common in a certain age group, and then you want to adjust for things like that. And then, what is the relevance of this data? There could be a data crawl that someone wants to use for training, and it would be your job on that team, as the domain expert, to ask these important questions, because they do not know. Sometimes, thinking from the clinical side, it feels like "you should have thought of this", or "why is this so slow, in practice this is usually faster". But the truth is that the people designing the system do not know, and that is why they pay you and bring you on board as a domain expert. So you should ask questions about data quality, how relevant the data is, and the clinical context. I gave the example of cleaning data by removing single letters, or words someone thinks have no meaning, when those words actually have meaning in that context; they are not just the everyday English of whoever is training the model. These are your key considerations as a domain expert on a project. You could be writing, you could be reviewing, and you do not have to be an AI expert to ask these important questions. Many times it is not malicious; many teams who want to do important work are open both to paying for your expertise and to listening to it. So we're going to take another hypothetical use case. Let's say a team tells you they are developing a transcription tool for endocrinology, an AI-powered system that makes transcription faster; or they have developed a continuous glucose monitoring system that is supposed to analyze data while the patient is right there, with nobody needing to check manually. I'm going to go back to the live chat: what are the questions you would ask in these two hypothetical use cases? I'll be happy if you contribute to this discussion. I should mention that technologies such as speech-to-text are not exactly new, but with AI they can certainly do better. So a team has come to pitch to you, they think you should be part of their project, they want to build something important: what are the important things you should do as a domain expert? I'll be happy to read your responses. Someone also said data quality is very important; thank you for agreeing with me. If the data is inaccurate, the AI will mislead us, yes, because it is not magical, it is not thinking; it works with the data it is trained on. But I should mention that it does incredible work in helping us train on data at scale, at a truly massive scale, which is really interesting when it is working well. So does anybody have any input? We have a hypothetical use case: a company developing a transcription system approaches you, they think you should come and consult because you work in an endocrinology department. What would you tell them about building a good transcription tool for endocrinology, or maybe a continuous glucose monitoring system? Do we have any input? If not, we can go to the next slide. Let me call the names of the people who have been supporting me: Usha, do you have any input? Igor, Nikita? Okay, so Usha, you think, okay.
So this is one of the first things we would tell them: "I have to push so many buttons on the computer to see the data." Okay. Let me try to describe what you mean; again, this is you sharing what you think they should work on. So apart from the continuous glucose monitoring itself, rather than checking often, you think there should be some text-to-speech telling you what is going on. Do you think everybody will want to hear that on the ward? I'm just asking. And then Igor says reliability. Yes, reliability is big, and these are the questions you should be asking if you are the domain expert on the team. In fact, if you are the domain expert, I'm going to say you should assume that they do not have the clinical context. People do brilliant things with computers; they just do not have the clinical context, and they are not bad people. If it is well interpreted for them, what it is we need, they can make the code do what we need. Nadim, that's interesting. Nadim mentions the role of continuous glucose monitoring in personalizing nutrition; I think you mean that if this tool claims to be helping with precision, it should be able to personalize nutrition. If I'm getting you right, you can put it in the chat window. Yes, Nikita, that's an interesting point: continuous glucose monitoring loop systems must be safe and must not cause life-threatening hypoglycemia. I agree, because say the brief we give the company is that we want to monitor glucose and be told when it is too low or too high. If you are not clinical, you could think that because we are trying to bring the value down, a low value is not dangerous. Again, clinical context: without it, you may think that since we don't want the sugar to go high, low is fine, and not know that we also don't want it to be too low. Just because we are worried about a surge does not mean it is not dangerous for it to drop too low. Okay, Raul says: I think the first step is to be able to transfer continuous glucose monitoring results and reports and make them discrete data embedded into the EMR record. Yes, that's a good use case, Raul, and it does not even have to be the EMR. Remember one of the first things I said at the beginning: this is part of software, and if we don't have good integration, there will be silos. It does not mean the tool is not working, but if the output is not somehow going back into the main system, it cannot be used for decision-making. Yes, Nikita, I'm happy you said this: transcription systems should understand domain-specific terminology and jargon. This is true, and it happens in real life. And another thing: for example, I'm Nigerian, and my accent is not American; that is also a factor in designing systems like this. It does not mean you do not speak English well or are not clear, but the sample data used in training has to be inclusive. Thank you, Nikita; I like that you mentioned domain-specific, because that is why I said a tool for endocrinology. So let's say you are the domain expert and we are building for endocrinology; one small way to handle this kind of jargon is sketched below.
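As one hedged illustration of the terminology point, here is a small post-processing pass that checks a draft transcript against a domain lexicon and expands known endocrinology abbreviations. The lexicon here is a tiny made-up sample; a real one would be curated with the clinicians on the team, and this kind of pass is a complement to, not a substitute for, training the model on contextual data in the first place.

# Post-processing a draft transcript against a (toy) endocrinology lexicon.
# The lexicon is illustrative; in practice it is curated with domain experts.
import re

ENDOCRINE_LEXICON = {
    "gfr": "glomerular filtration rate",
    "egfr": "estimated glomerular filtration rate",
    "hba1c": "haemoglobin A1c",
    "cgm": "continuous glucose monitoring",
    "t2dm": "type 2 diabetes mellitus",
}

def expand_abbreviations(transcript: str) -> str:
    # Replace each alphanumeric token that matches a lexicon entry,
    # leaving punctuation and unknown words untouched.
    def replace(match):
        word = match.group(0)
        return ENDOCRINE_LEXICON.get(word.lower(), word)
    return re.sub(r"[A-Za-z0-9]+", replace, transcript)

draft = "Patient with T2DM, GFR stable, HbA1c improving on CGM."
print(expand_abbreviations(draft))
# -> "Patient with type 2 diabetes mellitus, glomerular filtration rate stable,
#     haemoglobin A1c improving on continuous glucose monitoring."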
At the beginning, remember, I don't know if it was Lisa, rather than glomerular filtration rate, she wrote GFR. If a tool is not trained on sample data like that, knowing that there is a lot of abbreviations, you'll be surprised how much it will miss out. There is no magic that is gonna fix that. The data in the training needs to be contextual. So if you're the domain expert on a team and this is what you've been invited to do, these are some of the most interesting ways we can come in, coverage for these terminologies and jargon. Yes, Igor, I will agree with you that, I will agree with you that if we say this is transcription, it has to be superior. Yes, yes. Remember I mentioned that there's been dictation systems. One of the ways it can be superior, this is where context matters. Let's say we care about this tool. You're leading the research in your hospital and things like that. So I agree, we have to make coverage for our context. We have to make coverage for the unique ways. Some of the shorthand in how we speak, we have to make that data diverse and encompassing so that it's not just our center. Oh, well, there are certain places that their tools are designed for their center. To be fair, if it works and they can control what goes on. Yes, and Igor, HIPAA compliance, we are going to talk more about compliance in the next. Okay, thank you everyone for contributing. I'm happy, I don't feel alone. I can't see you, but I can read in the chats. So thank you, thank you everyone for contributing. And then let me go to some of the things that I think you should be asking if you are the domain experts, data validation. Like this tool that you've mentioned that it works for endocrinology transcription. How was it validated? Again, like I said, as a domain experts coming from the healthcare angle, no matter what you're doing in the pipeline, if you're curious about this AI thing, a friend of a friend, you've been contacted on LinkedIn, you don't need to be an AI expert, so ask those questions. Some of the specific things is that this data validation, how reliable is this? How was it tested? How was it evaluated? And sometimes the interesting thing about working with AI with code is that sometimes you train a model, it works well, it has good accuracy, and then you test it on a new data and it disappoints you. I code, so, and that happens, that happens in practice for different reasons. Again, there could have been a tiny shift in the data distribution. And then you are going through an entire process again. You could train a big model, you could have spent time, it doesn't care about all the time you spent, all the money spent on transcription and all that. Many times, if it doesn't work, it doesn't work. And again, I'm gonna give it to healthcare. Healthcare is one of the fields that is really, I think we have quite a lot of things in healthcare, towards data validation, how reliable this tool is and why should we use this? So again, I think those questions should continue, whether it's AI or not. AI does not invalidate this. The second thing is clinical relevance. If, like Igor said in the discussion, if we have something that is not relevant to our context, again, it may be working, but it's not working for our context. We have the rights to, if you are the domain expert on a team, let me just say, it's gonna be your job to raise questions like this, because sometimes if you're working with code, I've been working with AI systems, machine learning for a while. Sometimes it's truly exciting. 
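The data-validation point above, a supervised model that looks fine on the data it was developed on and then disappoints on newer data, can be sketched in a few lines. The data here is synthetic and the numbers are only illustrative; the habit the sketch shows is evaluating on held-out data, ideally from a later period, before trusting any reported accuracy. It also illustrates the earlier description of learning algorithms: labelled inputs and outputs go in, and the algorithm works out the mapping.

# Minimal sketch: supervised training plus evaluation on data from a later,
# shifted period. Synthetic data; the evaluation habit is the point, not the numbers.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# "Older" labelled data: inputs X and the outcome y the model should learn.
X_old = rng.normal(size=(1000, 5))
y_old = (X_old[:, 0] + X_old[:, 1] > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X_old, y_old, random_state=0)

# "Newer" data where the underlying relationship has shifted.
X_new = rng.normal(size=(300, 5))
y_new = (X_new[:, 0] - X_new[:, 1] > 0).astype(int)

model = LogisticRegression().fit(X_train, y_train)

print("held-out accuracy, same period: ", accuracy_score(y_test, model.predict(X_test)))
print("accuracy on later, shifted data:", accuracy_score(y_new, model.predict(X_new)))
# The second number is typically much lower: the incoming data no longer looks
# like the training data, which is why monitoring and retraining (MLOps) matter.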
It is easy to get carried away, because sometimes what you have found, what you are working with, is truly exciting. But then your job as the domain expert, as the clinician, as the clinical person, is to ask: how is this relevant in the clinical system? And then the third thing is model transparency. One of the things people have against AI systems is that sometimes something works and you do not know why it works. It is like some medications when we were learning pharmacology, where they said the mode of action is inconclusive; we don't know the exact pathway. It happens with AI models too. And sometimes the decision you are going to make comes down to this: you could have a model whose workings you can explain, with, hypothetically, an accuracy of 90, and a model with an accuracy of 93 that you cannot explain. Many times it makes sense to go with the 90 that you understand, especially when it comes to AI in healthcare, because if you are filing with the FDA, or something happens with a product, they want to know what happened. So as the domain expert you do not need to know the AI, but it makes sense to be asking: how was this validated? How transparent is this model? Which one should we choose? Are we willing to let go of some accuracy and efficiency for explainability? These are the real-life scenarios you face when you have been invited onto an AI project as the domain expert, or when you are building a small company, you think AI is interesting, and you want to use it. And then error margins. For example, in a transcription system the risk may not be as high as in a continuous glucose monitoring system that is not triggered when there is hypoglycemia. With transcription, I'm not saying errors are fine, but if the transcription comes out as "the marijuana filtration rate", with the word not even spelled properly, almost everyone reading it knows this is the GFR. Compare that to a system that is not alerting you because the training data is skewed towards only alerting when there is hyperglycemia and has ignored what happens when there is hypoglycemia, both of which are dangerous. So as the domain expert, these are important questions to ask. This is the third or fourth time I'm saying it, but do not assume that the people on your team know. They do not know, and you must treat it as: this is my responsibility in this ecosystem. And then, finally, there is AI governance leadership. This has felt long, but I wanted you to know that you can also be a governance leader. It could be in ethical considerations; people have mentioned HIPAA. In healthcare it is easy to know that there is HIPAA, that there is PII. The truth is that a random person who can build an AI system does not know all of those things, and nobody on the team is going to flag it, because nobody there knows the ethical considerations. So as the clinician, this is another way you come in as a governance leader: what are the ethical considerations of this project? The second one is regulatory compliance, and here I am also speaking from the perspective of you being the person who will be analyzing these systems when they are brought to you.
Of course, there may be slight differences in how each concept is described, but I know that in healthcare regulatory compliance, even if there is AI, even if there is machine learning, some of it is going to come to your table. And your baseline perception should not be that all of these tools are useless. Like I said when talking about analyzing a lot of data: if you are analyzing just 100 or 1,000 records it may not matter, but when there is real scale, real big data, machine learning systems usually excel and do interesting work. So your judgment should not be based on hearsay, on "they said these tools do this, they do that". Your judgment should be objective and pro-research. And then, obviously, bias and fairness. If people had their way, it would be easy to say, "I didn't do that, the machine did." But healthcare is one of those places where you cannot say that a machine did that. We have so many people with licenses, who have sworn different kinds of oaths from graduation, so many people with different licenses at stake. So it makes sense for us not to say the machine did it: the machine learned from the pattern of the data we provided. Obviously some of these issues will require us to adjust. Sometimes there has been a change, and the thing of integrity to do is to train a new model that has learned that change; the thing of integrity to do is also in the selection of our training data. So there is a lot of work in the pipeline for people with clinical domain knowledge across the whole AI ecosystem, because the truth about healthcare is that AI is useful anywhere there is a lot of data. This is not one of those things to fight against; we cannot fight against it. Like I mentioned, if we say people should not use chatbots, should not use generative AI tools, that is not realistic. People are going to use those things, and many times it improves the quality of their work and you do not even know where that came from. So rather than saying people should not use this, you want people who can make informed decisions: this is our protocol for using tools like this, and this is how we are willing to adjust when there is a change. It is better to embrace it than to throw it away. So now let's do our third hypothetical use case. I'm going to come back to the chat window; people have left interesting comments for me, which I'll read. Thank you, everyone, I'm grateful for your contributions. So we have two hypothetical use cases for some of the ways I'm thinking about AI governance leadership. For example, it comes to your table: you are part of the IRB, or the ethics team, depending on what you call it at your center, and there is a request for approval and monitoring of the use of natural language processing tools. Natural language processing is the part of AI focused on sequences, things like speech and text data. We want to apply it to some data just to put some structure on it, maybe to look for specific lab values, and you are one of the people who will be making decisions about it. What type of questions would you be asking the team that is working on this project?
And then the second one. Of course, some people will say no to big tech, we do not want them using our data, but let's say the executive team of your hospital or your research center has agreed on your behalf, even without you, and now there is a request for a partnership involving access to de-identified data. What are the important questions you would like this team to answer? What are the right questions people at this level of decision-making should be asking whoever wants to do this project? In the chat window, Usha, and I'm sorry if I'm not pronouncing your name well, says: clinical relevance would take a very sophisticated protocol; I was asked to see a patient with hypoglycemia, and the PCP had ordered the insulin level but not the BG at the same time. This is interesting, Usha; I wouldn't have thought of it, and I don't work in endocrinology, but what I believe you're saying is that sometimes what the hypothetical tool is monitoring is not what we need, and there is something else going on. And then Igor says: should we consider emergency measures, that is, alternatives if the system is down? Yes, Igor, yes. Especially in healthcare, and even outside it, I have mentioned something called model degradation. It is not because the team is evil or did not do the right thing; the data pattern changed. Sometimes the age group of the patients coming in has simply changed over time, and the data fed into the model does not follow that pattern. So in many systems, when teams have granular control over their methods, retraining is scheduled, often automated, say every four weeks. A lot of what happens in FinTech, for example anomaly detection, is AI of this kind. So in many systems they have a regular training schedule and a lot of fail-safes, and what I would assume about many of these systems, especially in healthcare, is that there is usually a human in the loop anyway: actual clinicians looking at some of those responses. Yes, Igor, and even an approach to malpractice due to error by the algorithm, yes; maybe insurance companies need to consider it, because now some of the tools are acquired, you bought them, you did not build them, but is there AI somewhere inside? It makes sense for you to ask those questions. Igor asks: for de-identified data, can the model re-identify? This is an important question, Igor, and it happens in practice: can this be reverse-engineered to identify the people being talked about? What I know as a practical response is to treat it like every other access to healthcare data, even outside AI. For example, in some systems you are required to talk to the patients; it may be impractical to talk to all patients in a system, but the data should be treated as if de-identification could fail. Some of the data sets used for AI in healthcare, there is a popular one called MIMIC, with different series, are just de-identified case notes from hospitals. Sometimes, for the work you want to do, you do not want it that de-identified, and sometimes it works well.
So it is just about treating it like every other research project where people look at case notes. That is why I'm saying we cannot ostracize AI; there is so much other research where people work with case notes carrying real people's names. What I would suggest is that everybody goes through the typical protocol for that: how the data is saved, restrictions on who has access to where the data is stored, and if access is not necessary, people should not have it. So, Igor, this is an important debate. Sometimes you de-identify and, for some reason, some of it is not truly de-identified. And again, this is where clinicians come in. Sometimes it is a genuine error; someone does not know that a piece of information is PII. They are not evil, but it has happened, and catching that could be your job as the domain expert in that particular instance. Igor also asks: for NLP, for the model-human reviewer concurrence rate, at what point do we say good enough? I suspect what you are asking is, if we are labeling data, when do we have a match? Inter-annotator agreement, whether for data labeling or human review, is a popular area of study in data curation. Depending on what you need, there are percentages; there are different measures of agreement. So let's say we want to train an AI system, and I am labeling and Igor is labeling. What we usually do in practice is have two or three people labeling, depending on how much data we have, and we measure an agreement rate on a first sample. Depending on how big the data is, sometimes you only need to agree about 10% of the time, which can look low, but if you are labeling a lot of data it depends on the kind of match you are trying to make. For example, I have been part of a project where we were looking for specific answers, and, I'm checking the time, oh, we are rounding up, sorry, we have just four minutes, but let me finish with this, all of us were trying to answer questions from scientific publications. You found your right answer in the abstract; I found mine in the body of the text. We are both correct. So what we finally did was compare the meaning rather than the exact span of text. And then Usha says: we have a long way to go and need a much more sophisticated system; urology ordered the prolactin level in a man with ED, the level was mildly elevated, and he recommended treatment. Well, this is interesting, Usha, but the truth about this use case is that, again, this is where domain expertise comes in. Say we were designing a system around this prolactin level: the output should not claim that we are sure this is the cause; it should be a bit tentative. This also happens in practice, where we think something is the source of a particular problem. So I'm going to say that part is not an AI problem, but where it becomes an AI problem is in training systems that are aware of those use cases. Usha says: on partnerships, whoever writes them, the others do not necessarily know everything. Yes, I agree with you. And on AI-radiologist agreement, "well, sometimes I don't know, pneumonia or not": interestingly, machine learning and AI systems are very good at analyzing big data sets. That is why our first response should not be that they are useless.
They are useful for seeing things we sometimes do not see. Remember, the training data is based on what humans label, but these systems can truly pick up patterns we cannot, especially in images. Something can be really tiny, but with patterns and repeated exposure, if you have trained a model on a million examples, it truly has the capacity to learn those patterns. That is part of what you hear about with deep learning, with learning algorithms: they can truly learn those things and figure out the formula, and it is fascinating in practice. So while we should not elevate them as the be-all and end-all, and there are different reasons for keeping a human in the loop, somewhere there are humans in the loop. And as I round up, with just one minute left: clinicians, whatever you do at work, you are the human in the loop in developing systems like this. I have mentioned that not everyone is going to be a coder, but you could be an end user, you could be a domain expert on a project, you could be the AI governance person. And in all those places we remain stewards of our work, helping people adopt new technologies and giving our best to it. Thank you everyone for listening, and thank you to everyone who participated in the live chat: Usha, Igor, Nikita, let me scroll, Raul and Nadine, and Lena. I think that is everyone who supported this session. Thank you for listening. That'll be all.
Video Summary
In this session of the AI and Healthcare Virtual Summit, Wura Ola Oyewusi discusses how AI can be integrated into healthcare, particularly for decision-makers. Three key roles in AI integration are identified: the end user, the domain expert, and the AI governance leader. End users interact directly with AI-powered tools, such as those used for research or clinical practice, and should focus on the tool's usability, reliability, and safety. Domain experts contribute their clinical knowledge to ensure data quality and relevance, while AI governance leaders consider ethical and regulatory aspects, such as patient data privacy and tool reliability.

Oyewusi emphasizes the importance of approaching AI tools as part of software systems that require regular updates and maintenance. She highlights the wide applications of AI in healthcare, such as using machine learning tools for efficient data extraction from clinical notes or developing domain-specific tools like transcription systems for endocrinology. By treating AI as a software component and asking the right questions, healthcare professionals can ensure these technologies are effectively integrated into clinical settings to enhance patient care and research outcomes. The session encourages proactive engagement with AI tools, underscoring that while technology might evolve, the need for human oversight and expertise remains critical.
Asset Subtitle
Wuraola Oyewusi
Data Scientist and AI Technical Instructor
LinkedIn Learning
Keywords
AI integration
healthcare
decision-makers
end user
domain expert
AI governance
patient data privacy
machine learning
clinical practice