Precision Phenotyping Algorithm Enhances Identification of Long COVID Cases

Postacute sequelae of COVID-19 (PASC), also known as long COVID, remains a pivotal issue of the COVID-19 pandemic. Many people diagnosed with COVID-19 experience long COVID months after recovering from the acute infection. However, identifying those patients is challenging due to imprecise definitions and limitations in current diagnostic codes, such as U09.9.

A recent study introduced a precision phenotyping algorithm that employs an attention mechanism to enhance accuracy, reduce biases, and improve cohort identification for research. This algorithm has allowed for detailed exploration of the genetic, metabolomic, and clinical features of long COVID.

The algorithm analyzed longitudinal electronic health records (EHRs) from over 295 000 patients in Massachusetts and integrated temporal pattern mining to exclude conditions explained by past medical history, providing more accurate diagnoses. Validation through chart reviews showed a high concordance rate of 88%. The algorithm accounted for infection-associated chronic conditions (IACC) and treated long COVID as a diagnosis of exclusion. It outperformed U09.9 in identifying long COVID cases, with significantly higher precision (79.9%) and reduced bias.
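The diagnosis-of-exclusion idea can be illustrated with a deliberately simplified sketch. It is not the study's attention-based method or its temporal pattern mining. It only shows the basic rule that a condition seen after infection is kept as a candidate when it does not already appear in the patient's earlier history. The function name, the record layout, and the sample patient are all hypothetical.

```python
from datetime import date


def unexplained_post_covid_conditions(records, infection_date):
    """Return conditions recorded on or after infection with no earlier record.

    `records` is a list of (condition, date) pairs; this layout is an
    assumption made for illustration, not the study's data model.
    """
    # Conditions documented before infection count as past medical history.
    prior = {cond for cond, when in records if when < infection_date}
    # Conditions documented on or after the infection date.
    post = {cond for cond, when in records if when >= infection_date}
    # Keep only conditions that past history does not explain.
    return sorted(post - prior)


# Hypothetical patient: fatigue predates infection, so it is excluded.
records = [
    ("fatigue", date(2020, 1, 10)),
    ("fatigue", date(2021, 3, 1)),
    ("dyspnea", date(2021, 3, 15)),
    ("anosmia", date(2021, 4, 2)),
]
infection = date(2021, 1, 20)

candidates = unexplained_post_covid_conditions(records, infection)
print(candidates)  # ['anosmia', 'dyspnea']
```

A real pipeline would also need to handle coding granularity, competing explanations such as other infections, and the timing of onset and persistence, none of which this sketch attempts.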

Findings from the algorithm’s analyses included the identification of long COVID in 24 360 patients, with an adjusted prevalence of 22.8%, aligning with regional estimates. The algorithm revealed that long COVID disproportionately affects women, with significant variations across racial and ethnic groups. Additionally, it found that younger patients are more likely to develop psychiatric and gynecological sequelae, whereas older patients are more likely to exhibit endocrine and musculoskeletal effects.

Systemic and respiratory issues were identified as the most common symptoms of long COVID. Notably, the algorithm also better captured rare symptoms such as vision loss, diabetic complications, and sexual dysfunction. It found that symptoms of long COVID typically emerge within 3 months postinfection and persist for 2 months or longer.

The algorithm enables research into the genetic, clinical, and metabolomic aspects of long COVID, which have previously been understudied. It offers a reproducible framework for identifying patients with long COVID in diverse healthcare systems, promoting more accurate cohort construction for clinical trials. This advancement is anticipated to enhance clinicians’ ability to conduct research and provide targeted treatments for patients with long COVID.

“Compared to the conventional U09.9 diagnosis code, our method for identifying PASC boasts superior precision and exhibits less bias, accurately gauging the prevalence of this condition without downplaying its significance, offering a more nuanced understanding of patients with long COVID,” concluded Alaleh Azhir, Clinical Augmented Intelligence Group, Massachusetts General Hospital, Boston, MA, and coauthors.

Reference

Azhir A, Hügel J, Tian J, et al. Precision phenotyping for curating research cohorts of patients with unexplained post-acute sequelae of COVID-19. Med. Published online November 2, 2024. doi:10.1016/j.medj.2024.10.009