

Western AF Symposium 2024: Session 13 Roundtable

The Hype Around Artificial Intelligence (AI): Are We Over-Optimistic About the Future of AI?

Edited by Jodie Elrod

© 2024 HMP Global. All Rights Reserved.

Any views and opinions expressed are those of the author(s) and/or participants and do not necessarily reflect the views, policy, or position of EP Lab Digest or HMP Global, their employees, and affiliates. 

Featured is the Session 13 Roundtable entitled "The Hype Around Artificial Intelligence (AI): Are We Over-Optimistic About the Future of AI?" from WAFib 2024. 

Video Transcript

Moderators: 

Kenneth Ellenbogen, MD, and John Mandrola, MD  

Discussants: 

Felix Hohendanner, MD; Hamid Ghanbari, MD; Chan Ho Lim; Jorge Romero, MD 

Kenneth Ellenbogen, MD: We're going to get started. We're going to talk about artificial intelligence (AI) and the hype around AI. I'm Ken Ellenbogen. I'm one of the moderators, and John Mandrola will be the other moderator. We're going to start it off with our panel telling us a little about who they are, where they are, and what they do, and then John is going to bring out a series of questions for our panel. 

Felix Hohendanner, MD: Yes, good morning. My name is Felix Hohendanner. I'm a cardiologist and biophysicist by training. I perform interventional electrophysiology (EP) procedures, like most in this room, but I have a keen interest in translational cardiology as well. I have a group that is dedicated to investigating atrial cardiomyopathy, and of course, nowadays machine learning (ML) is a big part of that. 

Hamid Ghanbari, MD: I'm Hamid Ghanbari, I'm a clinical cardiac electrophysiologist at the University of Michigan, and I'm the Chair of Innovation at the University of Michigan as well. My research is really focused on understanding signals such as photoplethysmography (PPG) and electrocardiograms (ECG), and doing lots of ML work, particularly when it comes to interpretation of ECG and text-based data.

Chan Ho Lim: Hello, my name is Chan Ho Lim and I'm from Tulane University. I'm the Assistant Director of the Digital Health and AI Program with Dr. Nassir Marrouche, and the core interests of our team are digital health utilizing ECG, PPG, and other metrics that exist on these smartwatches today, along with clinical parameters that we collect from the hospital. 

Jorge Romero, MD: I’m Jorge Romero, I work at the Brigham and Women's Hospital, and we have extensive experience with the Volta Medical system.

John Mandrola, MD: All right. Can we bring up the slide? I wanted to make these the topics for discussion, and I just made these up. I want the panel to speak to me like I'm a sixth grader and pretend I don't understand a lot about this, and provide 1- or 2-minute answers. The first question is, what is AI and is it different from ML? So, let's start with that.

Jorge Romero, MD: So, AI and ML are probably the future of EP at the moment. For example, for mapping systems, without clinical outcomes it is very difficult for the algorithm to get better. So, I would say it's essentially a learning machine that is trying to improve outcomes acutely and long term, but it needs an outcome to learn how to define targets in ablation.

Chan Ho Lim: I think AI is probably categorized in a couple different ways. Probably the most popular one is the difference between narrow and general AI, where narrow AI would mean that you're using AI to solve specific tasks. For instance, an ECG classification algorithm would be a narrow AI, and general AI would be something that reasons, like ChatGPT that you use. Is it different from ML? I think it's just terminology. People use it interchangeably with ML, deep learning, and there's all these Venn diagrams that everybody's seen before, but I think it's pretty much used interchangeably. 

Hamid Ghanbari, MD: John always asks the easy questions. So, I would add that they are essentially prediction machines: you feed the machine a source of data and it tries to make a prediction about whatever you want it to do. That could be an ECG signal that you feed it, and you ask it whether this patient has heart failure or low EF; it makes a bunch of calculations, does a lot of fancy math, and tells you with a certain probability that this data is likely to have this outcome associated with it. Then you draw a threshold based on your understanding of the disease state and the consequences of acting on the prediction of interest. There are different ways of doing ML: you can label the data yourself or ask the machine to label the data for you, and those have different names. You could have foundation models, which there has been a lot of interest in recently; these are large, adaptable AI models into which you put terabytes of data, and they can then do a variety of tasks, some of which you haven't specified before. You can have large language models, which are a subset of that group; they learn from the extensive text data you have and use a specific architecture (called the transformer architecture) that allows them to understand the relationships between words and then generate the next word for you, so you can ask interesting questions and get responses that you didn't expect before.
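
As a rough illustration of the "prediction machine" framing above, here is a minimal, hypothetical Python sketch; it is not any panelist's actual model, and the features, labels, and 0.30 threshold are invented for illustration. A classifier maps input features to a probability, and a separately chosen threshold turns that probability into a clinical action.

```python
# Hypothetical sketch of a "prediction machine": a toy classifier is fit on
# made-up features, returns a probability, and a clinician-chosen threshold
# converts that probability into an action.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Made-up "ECG-derived" features and a binary label (e.g., low EF yes/no).
X = rng.normal(size=(500, 12))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# For a new patient, the model outputs a probability, not a verdict.
p_low_ef = model.predict_proba(X[:1])[0, 1]

# The threshold is a clinical choice: it trades sensitivity for specificity
# based on the consequences of acting on the prediction.
THRESHOLD = 0.30
print(f"probability of low EF: {p_low_ef:.2f}, flag for echo: {p_low_ef >= THRESHOLD}")
```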

Felix Hohendanner, MD: To make a long story short, it's just terminology, I'd say. 

John Mandrola, MD: Before we go to the next question, why are the computer readings on an ECG so terrible? Because I've been in medicine for maybe 30 years and they're no better. What's the problem there? 

Hamid Ghanbari, MD: You should come to the University of Michigan. Our models are pretty good. I think that's a big part of it. I think the models are getting better and better, but they're not widely distributed. Every center is doing their own; they're kind of using their own data to train it, so the performance is very different. I think we'll get better once we start to learn how to combine data from different centers that have very good labeling. If you're reading it and a human labeled it wrong, then your machine is going to be just as bad or worse, right? So, I think my answer is that it's good, it's just not widely distributed.

John Mandrola, MD: But the computer readings that we're using now, that's not really what we're talking about AI or ML, is it? 

Hamid Ghanbari, MD: If you think of it as a prediction machine, then it is a very simple form of that. A CHA2DS2-VASc score is also predicting a risk of stroke in the future. Now, as you get more sophisticated, you do more sophisticated math than just adding numbers: I start to do derivatives and more complicated algebra, and then I can create new features so that I can make better predictions. So, it's essentially an extension of that; it is a very rudimentary way of doing it. Now, instead of using rule-based algorithms, we take that ECG data, turn it into a time series, push it into a deep learning network, and it gives me a label.
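
To make the "turn the ECG into a time series and push it into a deep learning network" step concrete, the sketch below is a toy example under assumed parameters (a single-lead, 10-second strip at 250 Hz and two rhythm classes); it is illustrative only and not the model described by the panel.

```python
# Toy 1D convolutional network: an ECG strip goes in as a time series and a
# rhythm label comes out. The architecture and classes are assumptions.
import torch
import torch.nn as nn

class TinyECGNet(nn.Module):
    def __init__(self, n_leads: int = 1, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_leads, 16, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # collapse the time axis
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, leads, samples), e.g., a 10-second strip at 250 Hz
        return self.classifier(self.features(x).squeeze(-1))

strip = torch.randn(1, 1, 2500)        # one fake single-lead, 10-second strip
logits = TinyECGNet()(strip)
label = logits.argmax(dim=1).item()    # e.g., 0 = sinus, 1 = AF in this toy setup
print(label)
```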

John Mandrola, MD: Are there any AI programs in use today that we use all the time? 

Felix Hohendanner, MD: Well, I think that's also a regulatory issue right there. I mean, at least in Germany, it's kind of hard to actually get to work with these programs because you don't have the regulatory approval. I know that in the US it is different now. The FDA approval is very broad, so once an AI algorithm actually gets approval, it just keeps going; even if there are updates to the software, it can still be used. I think clinical information systems like Epic and others actually incorporate AI-derived risk scores and so on already.

Chan Ho Lim: At least on the commercial side, we see it every day that you measure your ECG on an Apple or Samsung Watch, and then you see the rhythm classification that's done by an AI model. 

Kenneth Ellenbogen, MD: It's my understanding that at Mayo Clinic, they have basically an AI algorithm that looks at the ECG and uses some data somehow from the PR interval and predicts the risk of atrial fibrillation (AF). That's in use, not in every institution, but clearly there. 

John Mandrola, MD: I saw that when I was there. Let's take a question. 

Audience question: My question is for Hamid. I'd like to pick up on something you said. You talked about the CHA2DS2-VASc score, and that if you apply more algebra to the same tool, you'll get more out of it. But actually, what we're doing is sharpening the tool with AI. The AI is seeing things in the ECG that we cannot see, and that's actually making the tool sharper rather than trying to use the same blunt tool in a more sophisticated way. So, it is very different.

Hamid Ghanbari, MD: It is, in that you can look at signals that you wouldn't otherwise be able to look at, right? Like you mentioned, I could take an ECG, divide it up into a continuous signal over time, feed it into my deep learning algorithm, and then get lots of different predictions out of it that I wouldn't have been able to get before. It allows us to look at data streams that we wouldn't otherwise be able to look at, like the ECG, a continuous PPG signal, an electrodermal activity signal, or large amounts of electronic health record (EHR) data. So, in that, you are correct.

Audience question: So, you are likening it to, for example, taking the CHA2DS2-VASc score, which is a fairly blunt tool, and, as I think you referred to, applying algebra or creating other sophisticated uses of that tool to see whether it correlates to anything more interesting, and whether you can use the same blunt tool in a more sophisticated way. I think there is a fundamental difference. In the ECG, we can measure slopes, T waves, T-wave dispersion, and anything we choose; that's using the blunt tool in different ways. But I think AI really is sharpening the tool, isn't it?

Hamid Ghanbari, MD: Definitely. 

John Mandrola, MD: Let's skip down to the potentials and positives. Give us the best-case scenario.

Hamid Ghanbari, MD: To follow up on your question, at Michigan we're doing lots and lots of new research and deployments with large language models. I know many of you clinicians here have an in-basket that is overflowing with information. So, we have an in-basket project where the model digests the information and generates a response, and it can really be wonderful for saving the time you spend answering questions. There has been some data published showing that the responses the AI generates are actually pretty good and often maybe a little more empathetic than what humans generate. We deploy it for clinical trial eligibility evaluation, digesting large amounts of data and understanding which patients would potentially qualify for some of your clinical trials. Or if you have a new program, well, I wouldn't say left atrial appendage closure, but maybe a cardiac contractility modulation program, for example, and you're looking to see which patients qualify for it, you could digest large amounts of data and identify those patients. We also have a pilot for dictation and note generation, where an ambient recording generates a structured note for you that you can approve or edit. So, those are really interesting deployments that we're doing right now. I think if deployed correctly, it could really ease the administrative burden in the next 5 years.

Jorge Romero, MD: So, one of the advantages, at least for the electroanatomic mapping systems, is that in persistent or longstanding persistent AF, many times we do the PVI and posterior wall, and we don't really know what else to do. These mapping systems can give you guidance. I mean, you have to decide whether you're going to be ablating the whole left and right atrium, but it will give you some areas that might improve clinical outcomes. However, most of this software assumes that AF termination is the outcome that we're looking for. Even though we feel great every time we terminate AF, or AF organizes into atrial flutter, we know from many studies, and even a substudy of the study by Atul Verma, that AF termination doesn't mean anything and it doesn't translate into clinical outcomes. Actually, in that study, when we did CFAE ablation, we terminated 50% of the AF in those patients versus 8% with PVI alone, and there was no difference in clinical outcomes. So, I think it makes us feel good, but I think we're going to have to see the results of the TAILORED-AF trial at HRS 2024 and see if termination works.

John Mandrola, MD: It seems to me that the Carto people will tell me there are programs that will tell me where to burn, but that seems like a really difficult area for AI, because we don't understand that much about AF as it is. Is that correct thinking?

Jorge Romero, MD: I would say the difference is that, for example, when you're using Volta versus CartoFinder from Carto, you only have to position your catheter for 2 or 3 seconds and you get the lesions, versus 20 seconds with CartoFinder, but yes, it's pretty much the same.

John Mandrola, MD: Any other positives? 

Chan Ho Lim: Yes, I would like to add that AI really came out of an overabundance of data: a large clinical history from a large number of patients in the EHR, but also patients using smart devices every day that collect a lot of useful clinical information. That is where AI can also be useful. It's not only about finding something new for us, something almost miraculous that is difficult to believe when the AI presents it as a black box without much evidence of how it made its decision; on the opposite side, when there is an overabundance of data, digesting all of that information is where AI is also very useful.

Felix Hohendanner, MD: Yes, and it can make your life so much easier. In Berlin, we have started to implement it for the derivation of scores, so we get the CHA2DS2-VASc score, HAS-BLED score, EuroSCORE, and all these kinds of scores out of the system automatically via a large language model. So, it can be very good for that. I think we have seen, in terms of risk stratification in the BEAGLE trial, that it can be very interesting as a first point of contact between an ECG and the medical system. The AI tells us if a patient is at high risk for eventually getting AF just from the sinus rhythm ECG, and if the AI says this is a patient who will develop AF, then we go a step further and intensify our screening. In the BEAGLE trial in particular, 1 in 13 patients with a high AI risk ended up having AF and getting oral anticoagulation who would not have gotten anticoagulation otherwise.

Hamid Ghanbari, MD: Yes, I want to follow up on what you said there. I think it's pretty great for summarization. I mean, I don't know how you all see patients in your clinic, but generally when I see a patient, I go to the last note that was generated by the endocrinologist, look at that, decide what the latest thing was, and then use that as an update to understand what the problem is. It's very good at taking all the information in the medical record and creating an outline, so you could get a wiki-style first page, one that clinicians can update, that digests all the clinical data: the latest notes, the echocardiogram, the latest updates from the electrophysiologist. You could go to that wiki page that summarizes everything that has been done for that patient before your clinic visit. I think it's something that's well within reach, and I think it would be of tremendous value.

Audience question: I want to follow up on the first question to all the other people on stage about what the difference is between AI and ML. I want to ask what your understanding is of the difference between ML and statistics, because I think most of the people sitting here are very comfortable with statistics like linear or logistic regression, but maybe not as comfortable with AI and ML. From my standpoint, my background is stats, so I think AI, ML, and biostatistics are kind of similar. So, I want to see how you think about the differences and similarities between the two.

Jorge Romero, MD: I think that's a great question. At the Brigham, we have a massive database and recently hired a specialist in AI. What he did was find new ways to analyze it, because the data itself isn't learning anything; we're not putting in new outcomes or new variables. The variables and outcomes that we have are already there; we're not adding more data. Now, we work with the TIMI group. There are about 10 statisticians in the TIMI group, and they try to come up with every kind of analysis to write papers. But when we put this same database with the same outcomes into AI, it was able to come up with two or three times the number of analyses, ones that 10 statisticians from the TIMI group couldn't even imagine were possible. So, I would say I agree with you. AI is like having 100 master statisticians, because it's finding ways to analyze the data that you couldn't even possibly imagine.

Hamid Ghanbari, MD: I think ML is more like engineering. So, you're doing a lot of things, you're trying different things, and you're trying to get to a real outcome in the real world. Like you said, you could do a lot of math in a way that otherwise wasn't possible before. Chan, you do this all the time. What do you think?

Chan Ho Lim: Well, he sits next to me in my office, so we have this discussion all the time. There's definitely a bit of a gray area between statistical models and ML models, and for me, it's the learning that makes it ML. It's the iterative processes and the learning curve you see from the models. Also, you cannot really do any ML without a good understanding of statistics. You must understand your probabilities very well and you have to know the distribution of your data really well, and that's why I need you in my office. But yes, I would say that the difference between ML and statistical programs in general is the learning aspect of it and how it gets better through iterative processes.
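
As a small illustration of the "learning through iterative processes" distinction, the sketch below fits a logistic regression by repeated gradient steps on synthetic data (the data and learning rate are invented for illustration) rather than by a single closed-form calculation; each pass over the data nudges the weights a little closer.

```python
# The "learning" in ML as an iterative process: gradient descent on a
# logistic-regression log-loss over synthetic data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # toy predictors
true_w = np.array([1.5, -2.0, 0.5])
y = (X @ true_w + rng.normal(scale=0.5, size=200) > 0).astype(float)

w = np.zeros(3)
lr = 0.1
for step in range(500):                            # each pass nudges the weights
    p = 1.0 / (1.0 + np.exp(-(X @ w)))             # current predicted probabilities
    grad = X.T @ (p - y) / len(y)                  # gradient of the log-loss
    w -= lr * grad                                 # the "learning" step

print(np.round(w, 2))                              # weights move toward the true signal
```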

John Mandrola, MD: Let's take the next question. 

Sana Al-Khatib, MD: Thank you, this is a great discussion. I'm personally very excited about AI and I have a big picture question, which is that we've seen so many publications now showing that AI works in terms of predicting. This is an AF symposium, so I'll use that example, but there are many other examples of looking at someone's sinus rhythm on ECG and being able to predict if they're going to develop AF. Why is it that these are not making it into clinical practice? Like, what are the hurdles that you see in terms of implementing AI in clinical practice? Will we get there and how can we get there? 

Chan Ho Lim: From the technical perspective, I think generalizability often becomes an issue. If you have developed a model from a single source, let's say ECGs from a certain company and a single center with a specific group of patients, then when it goes to different clinics using different devices, it may not work very well. So a lot of it comes down to generalizability. I wish that in medicine everybody could share data in one place and people could develop foundation models together, but that's a really big dream. It would make the development of models super fast. But yes, that's just my take on the technical side of things.

Jorge Romero, MD: For electroanatomic mapping, we're already using AI to do AF ablations, but for these companies to convince the whole EP community to use, let's say, AI for everything in persistent AF, they're going to have to prove in a randomized controlled trial that it is better than PVI alone, correct? For example, TAILORED-AF enrolled 374 patients, so with 80%-85% power, you need to demonstrate a 15% or 16% absolute difference, and if the event rate is 40%, you have to show a relative risk reduction of 40%-45%. That's almost impossible to achieve in a clinical trial, and if you want to show a difference of 5%, you would have to enroll about 2000 patients.
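
For readers who want to see where numbers like these come from, here is a hedged sketch of a standard two-proportion sample-size calculation. The 40% control event rate, 85% power, and two-sided alpha of 0.05 are assumptions chosen for illustration, not the TAILORED-AF design.

```python
# How required sample size grows as the detectable absolute difference shrinks,
# using a standard two-proportion (arcsine effect size) power calculation.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

control_rate = 0.40            # assumed event rate with PVI alone
power_calc = NormalIndPower()

for abs_diff in (0.15, 0.10, 0.05):
    effect = proportion_effectsize(control_rate - abs_diff, control_rate)
    n_per_arm = power_calc.solve_power(effect_size=effect, alpha=0.05,
                                       power=0.85, ratio=1.0)
    print(f"absolute difference {abs_diff:.0%}: ~{round(n_per_arm)} patients per arm")
```

Under these assumptions, a roughly 15% absolute difference needs a few hundred patients in total, while a 5% difference pushes total enrollment into the thousands, which is the point being made above.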

Kenneth Ellenbogen, MD: Let’s go to the next question, please.

Edward J. Schloss, MD: Yes, getting down to more mundane subjects, but things that are pain points for all of us: I think most of us in the room who are facile with our techniques would say that cannulating the coronary sinus is nowhere near as hard as doing an outside chart review in electronic records. The tools we have for that are lacking. Any of us who have done that have faced a giant run-on sentence that we're trying to sort through, and it begs for a large language model to be trained on these records. I would do it myself if it weren't so obviously a HIPAA violation. Ask anyone: just take the AF guidelines that just came out, copy and paste them into ChatGPT, and ask it to summarize the latest evidence on anticoagulation and stroke prevention, and it will give you a serviceable answer that's actually right and rather remarkable. When can I do that with Epic? I don't know if these guys are the ones who can answer that, but it should be happening tomorrow because it's ready to go.

Kenneth Ellenbogen, MD: We should be asking Epic that, but does anyone know? 

Edward J. Schloss, MD: They never listen. They're never here. 

John Mandrola, MD: Let’s keep this brief. We have a lot more questions.

Hamid Ghanbari, MD: Epic has a module for large language models; it's in experimental mode. There are issues with privacy, but you can have your own model, keep it closed, and train it on your own data. So, it's happening, it's already deployed in some select hospitals, and like you said, this is going to be the low-hanging fruit for us to hit. I think that by ingesting and digesting clinical data and providing a summary for clinicians, you can do your job faster and better, and that is where you're going to see the first real big effect of this.

Edward J. Schloss, MD: Predict for me, in months or years, when I'm going to be able to use this in my clinic.

Hamid Ghanbari, MD: The holdup is not so much about technology as it is about regulation and payment, because there's no real way to get paid for it at this point.

Edward J. Schloss, MD: Well, Epic will find a way to make us pay for it. 

Hamid Ghanbari, MD: It's expensive computationally, because you have to train on a ton of data and tokens can be very expensive, especially if you're doing it for every patient. So, how do you get some of that value back? Does it save time for a clinician, and how do you quantify that? There are also ethical issues that you must think about, such as what the generated output is and whether a physician has a say on it. Is the data diverse enough that you can actually be confident that the results you're getting are good enough?

Kenneth Ellenbogen, MD: Great question. I wish we could give you a better answer. Next question.

Audience question: So, I think one of the fields where AI has really been accepted is breast mammography. In our center, the breast radiologists every month have a random selection of their reads compared with an AI read of thousands of these images, and I just wonder if there isn't something within cardiology and your imaging that might actually be a learning tool.

Kenneth Ellenbogen, MD: Nassir, maybe you could comment on that? You probably know a lot about that, AI looking at cardiac MRIs. 

Chan Ho Lim: What's the specific question? 

Kenneth Ellenbogen, MD: AI and interpreting cardiac MRIs. So, more accurate, more standardized. 

Chan Ho Lim: AI is really good, especially when it comes to data with complex dimensionality like cardiac MRI. We've been working a lot on this, especially on automating segmentation algorithms for the atrium, fibrosis, and scar. But like Dr. Ghanbari said before, everything is really dependent on the quality of the labels themselves. AI models for specific tasks will, at best, only be as good as the human labels they're trained on. So, I don't think it's a reasonable expectation that an AI trained on human labels is going to discover something new, or do a task better than the humans who labeled the data.

Nassir Marrouche, MD: The whole dilemma is being solved now by industry using AI, so you don't have to worry about reading the imaging. Processing is much better.

John Mandrola, MD: Next question. 

Audience question: How do we incorporate into our AI models the dynamic nature of paroxysmal diseases like AF? Right now, all the models use one ECG to predict events in the future, and that could be okay for ejection fraction or sex. But for AF, how do we put that into our practice?

Chan Ho Lim: I think there's a lot going on there. A lot of companies making smartwatches, like Google Fitbit, Apple, or Samsung, provide AF history based on continuous PPG signals. They call it AF history, not specifically burden, but you would be able to detect it from any continuous monitor. So I don't think it's so much a question about the AI models as about the tools being used to collect any kind of continuous signal.

Hamid Ghanbari, MD: There's a question of dynamic risk prediction that oncology has been struggling with forever. I think it's a difficult problem, but it's not an unsolvable one, and a lot of what you do depends on what you would do with that information, right? So, if I'm going to predict paroxysmal AF with some degree of uncertainty in the future, what's the next step for me? Is it an echocardiogram, or is it more noninvasive monitoring like an Apple Watch? Then I probably don't have to worry as much about the dynamic nature of it. If I'm trying to predict AF in the intensive care unit using a continuous ECG monitor, then it matters to me what my risk of AF is in the next 5 hours, because I would have different actions associated with it. So, dynamic risk prediction is a difficult problem, and it really depends on what actions are associated with that risk prediction.

Audience question: It's more to implement some kind of treatment to prevent that in a kind of a closed-loop system. That's what I have in mind. 

Hamid Ghanbari, MD: Yes, exactly.

Audience question: Thank you. 

Kenneth Ellenbogen, MD: We have about a minute left, so I just want to say a couple of things. Everyone is absolutely fascinated by this topic. Where could it have more of an effect than in EP? Because all the data we use—the imaging data, intracardiac electrograms, ECGs, and MRIs—is digital data. For example, we're doing a study with 20 other centers on cardiac sarcoidosis, and we're sharing the digital data from all these centers to try to build a model that will predict which patients with cardiac sarcoidosis are going to have VT. We talked about how important this is clinically. There obviously are financial implications; people want to do this, but they want to get paid to do it. But I would encourage everyone, since this is our life and this is what we're going to be dealing with, to read as much about this as possible, because it's coming soon. HRX is going to talk a lot about this in Atlanta in September, and we all need to learn more about it because it will affect every aspect of what we do, from the patient who is referred to you with 15 medical records elsewhere to trying to predict when a patient will have their next episode of AF. So, thank you to the faculty and to John. We could easily go on for another hour; there were more questions. Feel free to talk to the faculty about your questions, but we have to move on to the next session. Thank you.

The transcripts have been edited for clarity and length.

