Patient Health Questionnaire-9 Cutoff Scores in Studies Differ From Population Values

Patient Health Questionnaire-9 (PHQ-9) screening accuracy studies that use data-driven methods to select optimal cutoff scores often identify cutoffs that differ from the population-level optimal cutoff score and overstate accuracy, according to a study published online in JAMA Network Open.

“Users of diagnostic accuracy evidence should evaluate studies of accuracy with caution and ensure that cutoff score recommendations are based on adequately powered research or well-conducted meta-analyses,” wrote corresponding author Andrea Benedetti, PhD, of the Research Institute of the McGill University Health Centre, Montréal, Québec, Canada, and study coauthors.

The study assessed bias in PHQ-9 accuracy estimates caused by data-driven optimal cutoff score selection using cross-sectional data from an individual participant data meta-analysis database. The database included 100 primary studies with 44,503 participants, of whom 10% had major depression. Researchers resampled from the database to create 1000 simulated studies at each of 4 sample sizes: 100, 200, 500, and 1000 participants.

The population-level optimal PHQ-9 cutoff score was 8 or higher, according to the study. In simulated studies, however, optimal cutoff scores ranged from 2 or higher to 21 or higher in 100-participant samples. In 1000-participant samples, optimal cutoff scores ranged from 5 or higher to 11 or higher. Just 17% of 100-participant simulated studies, and 33% of 1000-participant simulated studies, identified the true optimal cutoff score of 8 or higher.

Compared with accuracy at the population-level cutoff score of 8 or higher, sensitivity was overestimated by 6.4 percentage points in 100-participant samples, 4.9 percentage points in 200-participant samples, 2.2 percentage points in 500-participant samples, and 1.8 percentage points in 1000-participant samples, researchers reported. Across sample sizes, specificity remained within 1 percentage point.
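For readers who want to see how this kind of bias can arise, the following is a minimal sketch of the general resampling idea, not the study's actual analysis. It rests on several assumptions not drawn from the article: the population is synthetic (scores come from two made-up normal distributions, not the meta-analysis database), the "optimal" cutoff is taken to be the one that maximizes Youden's J (sensitivity + specificity - 1), and all function and variable names are hypothetical.

```python
import random

MAX_SCORE = 27  # PHQ-9 total scores range from 0 to 27


def simulate_population(n=20000, prevalence=0.10, seed=0):
    """Build a synthetic population of (score, depressed) pairs.

    ASSUMPTION: the score distributions below are invented for illustration
    and are not taken from the study's data.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        depressed = rng.random() < prevalence
        raw = rng.gauss(14, 5) if depressed else rng.gauss(5, 4)
        people.append((min(MAX_SCORE, max(0, round(raw))), depressed))
    return people


def accuracy_by_cutoff(people):
    """Return {cutoff: (sensitivity, specificity)} for 'score >= cutoff' rules."""
    cases = [0] * (MAX_SCORE + 1)
    noncases = [0] * (MAX_SCORE + 1)
    for score, depressed in people:
        (cases if depressed else noncases)[score] += 1
    n_cases, n_non = sum(cases), sum(noncases)
    table = {}
    for cutoff in range(MAX_SCORE + 1):
        true_pos = sum(cases[cutoff:])
        false_pos = sum(noncases[cutoff:])
        sens = true_pos / n_cases if n_cases else 0.0
        spec = (n_non - false_pos) / n_non if n_non else 0.0
        table[cutoff] = (sens, spec)
    return table


def best_cutoff(table):
    """Pick the cutoff maximizing Youden's J (ASSUMED selection criterion)."""
    return max(table, key=lambda c: table[c][0] + table[c][1] - 1)


POPULATION = simulate_population()
POP_TABLE = accuracy_by_cutoff(POPULATION)
POP_CUTOFF = best_cutoff(POP_TABLE)


def run(sample_size, n_studies=300, seed=1):
    """Resample simulated studies and compare each one's data-driven
    cutoff and in-sample accuracy with the population-level values."""
    rng = random.Random(seed)
    pop_sens, pop_spec = POP_TABLE[POP_CUTOFF]
    cutoffs, sens_diff, youden_diff = [], [], []
    for _ in range(n_studies):
        sample = rng.choices(POPULATION, k=sample_size)  # with replacement
        table = accuracy_by_cutoff(sample)
        chosen = best_cutoff(table)
        sens, spec = table[chosen]
        cutoffs.append(chosen)
        sens_diff.append(sens - pop_sens)
        youden_diff.append((sens + spec) - (pop_sens + pop_spec))
    return {
        "cutoff_range": (min(cutoffs), max(cutoffs)),
        "mean_sens_diff": sum(sens_diff) / n_studies,
        "mean_youden_diff": sum(youden_diff) / n_studies,
    }


if __name__ == "__main__":
    print("population-level cutoff:", POP_CUTOFF)
    for size in (100, 200, 500, 1000):
        print(size, run(size))
```

Because each simulated study reports accuracy at whichever cutoff looked best in its own sample, the average in-sample Youden's J tends to exceed the population value, and the spread of selected cutoffs is typically wider in smaller samples. The specific numbers this sketch prints depend entirely on the invented distributions and should not be compared with the study's reported figures.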

“Clinicians and policymakers who make decisions regarding depression screening should interpret cautiously the optimal cutoff scores for the PHQ-9 and other depression screening tools identified in small single studies,” researchers wrote. “Ideally, the decisions regarding what cutoff scores to use should be based on large, well-conducted meta-analyses or on multiple validations in studies with adequate sample sizes for desired precision levels.”

Reference

Levis B, Bhandari PM, Neupane D, et al. Data-driven cutoff selection for the Patient Health Questionnaire-9 depression screening tool. JAMA Netw Open. 2024;7(11):e2429630. doi:10.1001/jamanetworkopen.2024.29630