How Can Machine Learning Boost T1D Patients’ Time in Range?
Yao Qin, PhD, assistant professor at the University of California, Santa Barbara, is no stranger to the work involved in staying healthy with type 1 diabetes (T1D). Diagnosed in 2011, she deals with the daily challenges of maintaining the carbohydrate-to-insulin ratio for every meal and ensuring she makes the most of her exercise sessions without spending too much time above or below range.
Her efforts to simplify both tasks while improving accuracy, for herself and for others, were the focus of her recent presentation at the Endocrine Society’s Artificial Intelligence (AI) in Healthcare Virtual Summit.
“I'm super glad that combining my life with T1D and my expertise in AI finally can converge to help patients like me better control our blood glucose and live freely as normal people,” she told attendees.
To tackle the challenge of counting carbohydrates for every meal, Qin has led the team that produced NutriBench, a publicly available natural language meal description database. On the exercise side, she has received a grant from the Helmsley Charitable Trust for a project testing the impact of automated insulin delivery (AID) in helping people with T1D maintain their time in range when working out.
Estimating Carbs Automatically
For people with T1D, carbohydrate estimation is required before every meal so patients can roughly gauge how many units of insulin they need to inject to keep their blood glucose within range. “Right now, most patients do a manual estimation, which is very challenging because it's hard to memorize all the carbohydrate nutrition information that’s required,” Qin said.
Currently, for a breakfast of, say, 2 scrambled eggs, 12 blueberries, 5 strawberries, and 1 slice of buttered toast, a patient can do a Google search for each item, then add all the carbohydrates together for a rough estimate, she said. “But that approach takes a huge amount of time, and so in reality, what I do is bypass the process and just make a random guess. Then I input that guess into my hybrid closed loop system [combination insulin pump and continuous glucose monitor] and start eating. Then usually my blood glucose isn’t very good, so I need to check my monitor every 5 minutes to see whether I need to take further action.”
By contrast, large language models (LLMs), a type of AI that can generate and understand human language, can do the same task of producing carbohydrate estimates automatically, based on real-world food descriptions, she explained.
For example, a user can prompt: “For breakfast, I'm eating 2 eggs scrambled with a slice of buttered toast, 5 strawberries, and around 12 blueberries.” The model will immediately provide the number of carbohydrates for each food item, as well as the total amount of carbohydrates in the meal: 2 g of carbohydrates for the eggs, 13 g for the buttered toast, 4 g for the strawberries, and around 2 g for the blueberries.
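The arithmetic behind the meal total is easy to sketch. The short Python example below simply adds up the per-item figures quoted in the example above; the dictionary layout and the function name are illustrative assumptions, not NutriBench's format or any model's actual output.

```python
# Per-item carbohydrate estimates (grams) quoted in the breakfast example.
# The dictionary layout and function name are hypothetical, for illustration only.
meal_carbs_g = {
    "scrambled eggs (2)": 2,
    "buttered toast (1 slice)": 13,
    "strawberries (5)": 4,
    "blueberries (about 12)": 2,
}


def total_carbs(items):
    """Return the total grams of carbohydrate across all items in a meal."""
    return sum(items.values())


print(total_carbs(meal_carbs_g))  # 21
```

With the figures given, the meal comes to 21 g of carbohydrates.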
These calculations are doable right now with existing LLMs. The question is, she said, how accurate are those LLMs? And that’s where NutriBench comes in. The NutriBench dataset is based on 11,857 meal descriptions from 11 countries annotated with macronutrient labels, including carbohydrates, proteins, fats, and calories. The team used it to benchmark 12 currently available LLMs on the task of carbohydrate estimation from natural language meal descriptions, using different prompting strategies.
After testing and comparing output from the various LLMs, they conducted a real-world risk assessment study to demonstrate the impact of the estimations by simulating the effect of the carbohydrate predictions on the blood glucose levels of 20 virtual T1D patients.
Across 44,800 simulations, they found that carbohydrate estimates from GPT-4o mini led to the lowest blood glucose risk and the highest time in the safe glucose range (70-180 mg/dL).
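Time in range is commonly reported as the share of glucose readings that fall inside the target band. The Python sketch below shows that calculation for the 70-180 mg/dL range; the readings are made-up sample values, and treating both bounds as inclusive is an assumption. It is not the team's simulation code.

```python
def time_in_range(readings_mg_dl, low=70, high=180):
    """Percentage of glucose readings within [low, high] mg/dL (bounds inclusive)."""
    if not readings_mg_dl:
        raise ValueError("need at least one reading")
    in_range = sum(1 for g in readings_mg_dl if low <= g <= high)
    return 100.0 * in_range / len(readings_mg_dl)


# Hypothetical glucose readings (mg/dL), for illustration only.
readings = [65, 90, 120, 150, 185, 200, 110, 95]
print(time_in_range(readings))  # 62.5
```

Here five of the eight sample readings fall within range, giving 62.5%.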
“Even more exciting, we invited three human dietitians to do the same task as the models — ie, give a carbohydrate estimation for 100 randomly selected meals,” Qin said.
GPT-4o mini achieved the highest time in range, 69.82%, compared with 66.97%, 66.88%, and 65.93% for the three dietitians. The results, she said, “highlight the LLMs’ potential as valuable tools for obtaining precise and accessible nutrition estimates from natural language meal descriptions.” The team is now working to determine whether they can fine-tune existing LLMs by training them on specific nutrition information datasets. The goal: To turn them into “LLM nutritionists.”
Exercise-Specific AID System
Qin moved on to the effects of exercise, again referring to her own experience. “If I suffer from hypoglycemia when I start running on a treadmill, I know I’m supposed to reduce my basal insulin to some extent,” she said. “But I don't know how much I should reduce it by to keep my blood glucose within range. And because I have no knowledge yet, which I believe is the case for most patients, I reduce my basal to a random degree.”
“Then I start running, and unfortunately this random degree usually is not enough,” she continued. “My blood glucose drops, and then I have to stop the treadmill and start blindly eating sugars.”
Generally, Qin enjoys eating. “But when I start this kind of carbohydrate rescuing, meals don’t bring me happiness, just tons of stress. I feel physically uncomfortable, but I just keep eating, trying to compensate for the blood glucose drop. And this can then lead to hyperglycemia. Half an hour or an hour after eating blindly, my blood glucose starts increasing insanely. It’s a back-and-forth pattern that’s so annoying and stressful.”
The Helmsley Charitable Trust’s T1D Exercise Initiative (T1-DEXI) datasets provide baseline information on the glycemic response to exercise in adults with T1D. The trust’s grant is enabling Qin and colleagues to conduct a 3-year multiphase project to help people with T1D maintain their time in range during and after exercise using AID.
Analyses of T1-DEXI data have shown that glucose drops, though to different degrees, during the four most frequent activity types: dog walking, biking, jogging/running, and strength training/weight lifting, Qin said. Work by her group has shown that the extent of the change in glucose levels depends on various factors, including starting glucose, age, sex, and carbohydrate intake, and that exercise duration also has an impact.
“Whenever you raise your target during exercise, you also reduce your basal insulin to a certain degree to compensate for the drop in glucose,” she explained. A preset reduces the delivered insulin and also changes the insulin sensitivity factor, and it enables an estimation of blood glucose at any point after it’s applied.
For the Helmsley-funded project, the team is designing algorithms for static and dynamic activity-specific presets to predict insulin needs and reduce the risk for hypoglycemia during the activity. A static preset is the median of the optimized presets for each activity. A dynamic preset integrates the predicted impact of the factors affecting exercise into the algorithm.
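The static preset described above, the median of the optimized presets for each activity, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how a preset is represented, so each one is treated here as a single hypothetical number (the fraction of usual insulin delivered), and all values and names are invented.

```python
from statistics import median

# Hypothetical optimized presets per activity, one value per exercise session.
# Each value is an assumed fraction of usual insulin delivery, for illustration only.
optimized_presets = {
    "jogging/running": [0.40, 0.55, 0.50, 0.35, 0.60],
    "dog walking": [0.80, 0.75, 0.90],
}


def static_presets(per_activity):
    """Return one static preset per activity: the median of its optimized presets."""
    return {activity: median(values) for activity, values in per_activity.items()}


print(static_presets(optimized_presets))
```

A dynamic preset, by contrast, would adjust the value using the predicted impact of factors such as starting glucose and exercise duration, which this sketch does not attempt.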
The project also includes a clinical evaluation of the static activity–specific presets when integrated into Tidepool Loop, a currently available, Food and Drug Administration–approved app that automates insulin dosing. If successful, the team’s presets would enhance the Tidepool Loop algorithm, making it more effective in handling exercise.
The ultimate goal, Qin said, is for the model “to be smart enough to get familiar with our daily life patterns and become a close friend with us. But meanwhile, by knowing all the things going on in our lives, it can begin to give us personalized insulin recommendations and continually improve over time.”
Qin had no relevant conflicts of interest.
Marilynn Larkin, MA, is an award-winning medical writer and editor whose work has appeared in numerous publications, including Medscape Medical News and its sister publication MDedge, The Lancet (where she was a contributing editor), and Reuters Health.