Findings from diffusion-weighted magnetic resonance imaging (dMRI) studies are often inconsistent and sometimes contradictory because of small sample sizes and differences in acquisition parameters and pre-/post-processing methods. To address these challenges, collaborative multi-site initiatives have made it possible to collect larger and more diverse groups of subjects, including those with neuropsychiatric disorders, leading to increased statistical power and findings that may be more representative at the group and individual level. As these datasets have become openly available, the ability to jointly analyze multi-site dMRI data has become more important than ever. However, intrinsic or acquisition-related variability in scanner models, acquisition protocols, and reconstruction settings hinders the direct pooling of multi-site dMRI data. ComBat (https://github.com/Jfortin1/ComBatHarmonization), a fast and powerful statistical harmonization method, was originally developed to mitigate the "batch effect" in gene expression microarray data and was later adapted for multi-site dMRI harmonization to reduce scanner/site effects. Our goal is to evaluate this commonly used harmonization approach using a large dMRI dataset involving 542 individuals from 5 sites. We investigated two important aspects of using ComBat to harmonize fractional anisotropy (FA) across sites. First, we assessed how well ComBat preserves inter-subject biological variability (measured by the effect sizes of between-group FA differences) after harmonization. Second, we evaluated the effect of minor differences in pre-processing on ComBat’s performance. Effect sizes are largely preserved at some sites after harmonization, but they are not well preserved at other sites where non-linear scanner contributions exist. Further, even minor differences in pre-processing can yield unwanted effects during ComBat harmonization.
Thus, our findings suggest that, when using ComBat for data harmonization, careful attention should be paid to the data being harmonized and the same processing pipeline should be used throughout.
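As an illustration of the effect-size measure referred to above, the following minimal Python sketch computes Cohen's d for a between-group FA difference at a single site, before and after a simple per-site affine rescaling. It is only a sketch under stated assumptions: the FA values are synthetic, the function names `cohens_d` and `affine_site_adjust` are hypothetical, and the rescaling is a deliberately simplified stand-in that is not ComBat (it has no empirical Bayes step and no covariates). Its only purpose is to show how preservation of an effect size can be checked; a purely affine adjustment leaves within-site Cohen's d unchanged by construction, which says nothing about how ComBat behaves on real data.

```python
import random
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def affine_site_adjust(values, target_mean, target_sd):
    """Hypothetical stand-in for harmonization: rescale one site's values to a
    target mean and SD. This is NOT ComBat; it is a plain location/scale shift."""
    site_mean = statistics.mean(values)
    site_sd = statistics.stdev(values)
    return [(v - site_mean) / site_sd * target_sd + target_mean for v in values]


# Synthetic FA-like values for one site (assumed numbers, not study data).
rng = random.Random(0)
controls = [rng.gauss(0.45, 0.03) for _ in range(50)]
patients = [rng.gauss(0.43, 0.03) for _ in range(50)]

d_before = cohens_d(controls, patients)

# Adjust the whole site at once, then split back into the two groups.
adjusted = affine_site_adjust(controls + patients, target_mean=0.50, target_sd=0.04)
d_after = cohens_d(adjusted[:50], adjusted[50:])

print(f"Cohen's d before: {d_before:.3f}, after: {d_after:.3f}")
```

In this toy setting the two printed values agree up to floating-point error, because Cohen's d is invariant to a positive affine transformation applied to both groups. Comparing effect sizes before and after harmonization in this way is one way to check whether between-group differences have been preserved.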
bioRxiv Subject Collection: Neuroscience