Background: Schizophrenia has been associated with lifelong deviations from the normative trajectories of brain structure. These deviations can be captured using the brain-predicted age difference (brainPAD), which is the difference between the biological brain age of an individual, as inferred from neuroimaging data, and their chronological age. Various machine learning algorithms are currently used for this purpose, but their comparative performance has yet to be systematically evaluated.
Methods: Six linear regression algorithms (ordinary least squares (OLS) regression, ridge regression, least absolute shrinkage and selection operator (Lasso) regression, elastic-net regression, linear support vector regression (SVR), and relevance vector regression (RVR)) were applied to brain structural data from patients with schizophrenia (n=90) and healthy individuals (n=200), acquired on the same 3T scanner using identical sequences. The performance of each algorithm was quantified by the mean absolute error (MAE) and the correlation (R) between predicted brain-age and chronological age. The inter-algorithm similarity in predicted brain-age, brain regional regression weights, and brainPAD was assessed using correlation analyses and hierarchical clustering.
Results: In patients with schizophrenia, ridge regression, Lasso regression, elastic-net regression, and RVR performed very similarly and showed a high degree of correlation in predicted brain-age (R>0.94) and brain regional regression weights (R>0.66). By contrast, OLS regression, which was the only algorithm without a penalty term, performed markedly worse and showed a lower similarity with the other algorithms. The mean brainPAD was higher in patients than in healthy individuals but varied by algorithm from 3.8 to 5.2 years, even though all analyses were performed on the same dataset.
Conclusions: Linear machine learning algorithms, with the exception of OLS regression, have comparable performance for age prediction on the basis of a combination of cortical and subcortical structural measures. However, algorithm choice introduced variation in brainPAD estimation, and therefore represents an important source of inter-study variability.
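The quantities named in the Methods (brainPAD, MAE, and R) can be illustrated with a minimal sketch. It assumes the common sign convention of predicted brain-age minus chronological age, which is consistent with a higher brainPAD in patients but is not spelled out here. The function names and the four example ages are hypothetical and are not data from the study.

```python
from math import sqrt


def brain_pad(predicted_age, chronological_age):
    """Per-subject brainPAD, assumed here to be predicted minus chronological age."""
    return [p - c for p, c in zip(predicted_age, chronological_age)]


def mean_absolute_error(predicted_age, chronological_age):
    """Mean absolute difference between predicted and chronological age."""
    errors = [abs(p - c) for p, c in zip(predicted_age, chronological_age)]
    return sum(errors) / len(errors)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Made-up illustrative ages in years (NOT values from the study).
chronological = [20.0, 30.0, 40.0, 50.0]
predicted = [24.0, 33.0, 45.0, 52.0]

pads = brain_pad(predicted, chronological)
mean_pad = sum(pads) / len(pads)
mae = mean_absolute_error(predicted, chronological)
r = pearson_r(predicted, chronological)

print(f"brainPAD per subject: {pads}")
print(f"mean brainPAD: {mean_pad:.2f} years, MAE: {mae:.2f} years, R: {r:.3f}")
```

In this toy example every prediction overshoots, so the mean brainPAD equals the MAE. With real data the two differ, because brainPAD keeps the sign of each error while MAE discards it.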
bioRxiv Subject Collection: Neuroscience