67 - Development and validation of a simplified machine learning-based model and risk prediction tool for childhood hypertension/elevated blood pressure: a population-based study
PhD candidate, The University of Hong Kong, Hong Kong, Hong Kong
Background: With the concurrent rise in overweight and obesity, childhood and adolescent hypertension (HTN) has become an increasingly urgent yet under-attended health problem. However, few predictive tools for early identification have been derived from large general-population data sets using simple predictors other than blood pressure (BP) measurements as input.
Objective: This study aims to develop and validate machine learning algorithms that predict short-term (1-year) and long-term (3-year) pediatric BP conditions, and to identify the key predictors needed to simplify the model and construct a predictive risk tool.
Design/Methods: We used demographic, physical, and psychological well-being predictors to model BP conditions (normal, elevated BP, and HTN) over 1- and 3-year periods, drawing on multiple academic cohorts from a population-based longitudinal study across Hong Kong. Each of several machine learning models was generated 10 times to estimate average accuracy, area under the receiver operating characteristic curve (AUC), precision, recall, and F1 score. Shapley Additive Explanations (SHAP) were used to select key features, both those shared by and those specific to the short- and long-term predictions. A predictive risk tool was then constructed from the top-ranked predictors. Data from students enrolled from 1995/96 to 2019/2020 were randomly divided into training and test datasets.
Results: Of the 1.9 million students in the analysis, 74,244 and 111,195 had elevated BP and HTN, respectively, in the following year, and 55,834 and 94,103 had elevated BP and HTN, respectively, three years later. The XGBoost algorithm demonstrated the highest accuracy for the 1-year prediction (macro-average AUC 0.92, micro-average AUC 0.91) and consistently accurate performance for the 3-year prediction (macro-average AUC 0.91, micro-average AUC 0.90). Moreover, the XGBoost model with only the 17 final selected input predictors produced comparably accurate predictions. A predictive risk tool was built from the top-ranked predictors: sex, weight, and age.
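The macro- and micro-average AUCs reported above summarize a three-class problem (normal, elevated BP, HTN) in two different ways. The following is a minimal illustrative sketch of the usual one-vs-rest definitions, not the study's actual evaluation code: the function names `binary_auc` and `macro_micro_auc` and the toy labels and probabilities are assumptions made up for this example.

```python
from statistics import mean


def binary_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_micro_auc(y_true, y_prob, n_classes):
    """Return (macro, micro) one-vs-rest AUC for a multiclass problem.

    Macro: unweighted mean of the per-class AUCs.
    Micro: a single AUC over all (sample, class) pairs pooled together.
    """
    per_class = []
    pooled_labels, pooled_scores = [], []
    for k in range(n_classes):
        labels_k = [1 if y == k else 0 for y in y_true]
        scores_k = [p[k] for p in y_prob]
        per_class.append(binary_auc(labels_k, scores_k))
        pooled_labels.extend(labels_k)
        pooled_scores.extend(scores_k)
    return mean(per_class), binary_auc(pooled_labels, pooled_scores)


# Hypothetical toy data: 0 = normal, 1 = elevated BP, 2 = HTN.
y_true = [0, 0, 0, 1, 1, 2]
y_prob = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.2, 0.3],
    [0.3, 0.5, 0.2],
    [0.4, 0.4, 0.2],
    [0.2, 0.3, 0.5],
]
macro, micro = macro_micro_auc(y_true, y_prob, n_classes=3)
print(f"macro-average AUC = {macro:.3f}, micro-average AUC = {micro:.3f}")
```

The macro average weights each class equally regardless of size, whereas the micro average is dominated by the most frequent class, so the two can diverge when classes are imbalanced, as BP categories are in a general student population.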
Conclusion(s): Within a systematic framework of model development and validation, XGBoost consistently outperformed the other models in predicting short- and long-term hypertension in children and adolescents, requiring only 17 non-BP-measurement features to accurately stratify students into normal, elevated BP, and HTN risk groups. Both the simplified prediction model and the predictive risk tool are easy for pediatricians and parents to use in daily practice, serving either as individual decision aids that flag potentially high-risk children or as guidance systems for personalized prevention of hypertension.