kwneuro.harmonize

ComBat harmonization for multi-site scalar brain volumes.

Classes

CombatEstimates

ComBat parameter estimates from a harmonization run.

Functions

harmonize_volumes(...)

Harmonize 3D scalar brain volumes across scanner sites using ComBat.

Module Contents

class kwneuro.harmonize.CombatEstimates

ComBat parameter estimates from a harmonization run.

These estimates can be used to inspect the fitted model or, in the future, to harmonize new data using the same model via neuroCombatFromTraining.

estimates: dict[str, Any]

Empirical Bayes estimates of batch effects (gamma/delta parameters, pooled variance, etc.).

info: dict[str, Any]

Metadata about the harmonization (batch levels, sample counts, design matrix, etc.).
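In the ComBat model, gamma denotes the additive (location) batch effect and delta the multiplicative (scale) batch effect, both estimated per feature and per batch. The sketch below is a simplified illustration of how such parameters are used. It omits covariates and the Empirical Bayes shrinkage, so it is a plain location-scale adjustment and not what the library computes. The function name location_scale_adjust and its return layout are hypothetical.

```python
import numpy as np


def location_scale_adjust(data, batch, mean_only=False):
    """Simplified, non-EB location-scale adjustment (illustration only).

    data  : array of shape (n_features, n_samples)
    batch : length n_samples sequence of batch labels
    Assumes every batch has at least 2 samples and no feature is constant.
    """
    data = np.asarray(data, dtype=float)
    batch = np.asarray(batch)
    levels = np.unique(batch)

    # Per-feature grand mean across all samples.
    grand_mean = data.mean(axis=1, keepdims=True)

    # Pooled variance: spread that remains once each batch's mean is removed.
    resid = np.empty_like(data)
    for level in levels:
        cols = batch == level
        resid[:, cols] = data[:, cols] - data[:, cols].mean(axis=1, keepdims=True)
    var_pooled = (resid ** 2).mean(axis=1, keepdims=True)

    # Standardize, then estimate and remove each batch's shift and scale.
    z = (data - grand_mean) / np.sqrt(var_pooled)
    adjusted = np.empty_like(data)
    gamma, delta = {}, {}
    for level in levels:
        cols = batch == level
        key = level.item()  # plain Python label as dict key
        gamma[key] = z[:, cols].mean(axis=1, keepdims=True)
        if mean_only:
            # Leave batch variances untouched.
            delta[key] = np.ones_like(gamma[key])
        else:
            delta[key] = z[:, cols].var(axis=1, ddof=1, keepdims=True)
        adjusted[:, cols] = (z[:, cols] - gamma[key]) / np.sqrt(delta[key])

    # Map back to the original units.
    return adjusted * np.sqrt(var_pooled) + grand_mean, gamma, delta
```

After this adjustment every batch has the same per-feature mean. Unless mean_only is set, every batch also has the same per-feature variance.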

kwneuro.harmonize.harmonize_volumes(volumes: collections.abc.Sequence[kwneuro.resource.VolumeResource], covars: pandas.DataFrame, batch_col: str, mask: kwneuro.resource.VolumeResource, *, categorical_cols: list[str] | None = None, continuous_cols: list[str] | None = None, eb: bool = True, parametric: bool = True, mean_only: bool = False, ref_batch: str | None = None, preserve_out_of_mask: bool = False) → tuple[list[kwneuro.resource.InMemoryVolumeResource], CombatEstimates]

Harmonize 3D scalar brain volumes across scanner sites using ComBat.

All volumes must be in the same voxel space (same shape). A mask selects which voxels to harmonize; out-of-mask voxels are zeroed by default or preserved from the original volumes if preserve_out_of_mask is True.
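The masking behavior described above can be sketched with plain NumPy arrays standing in for volume data. This is an illustration of the idea, not the library's internal code. The helper names to_feature_matrix and from_feature_matrix are hypothetical, and the step that would harmonize the matrix is left out.

```python
import numpy as np


def to_feature_matrix(volumes, mask):
    """Stack in-mask voxels into a (n_voxels_in_mask, n_volumes) matrix."""
    inside = np.asarray(mask) > 0
    matrix = np.stack([np.asarray(v)[inside] for v in volumes], axis=1)
    return matrix, inside


def from_feature_matrix(matrix, inside, originals=None):
    """Write each matrix column back into a 3D volume.

    originals=None   -> out-of-mask voxels are zero
    originals=[...]  -> out-of-mask voxels keep their original values
    """
    volumes = []
    for j in range(matrix.shape[1]):
        if originals is None:
            vol = np.zeros(inside.shape, dtype=matrix.dtype)
        else:
            vol = np.array(originals[j], dtype=matrix.dtype)  # copy
        vol[inside] = matrix[:, j]
        volumes.append(vol)
    return volumes
```

Passing the original arrays corresponds to preserve_out_of_mask=True. Omitting them corresponds to the default zero fill.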

Parameters:
  • volumes – Sequence of 3D scalar VolumeResource objects, one per subject/scan. Must all have identical shape.

  • covars – A pandas DataFrame with one row per volume, in the same order as volumes (row i describes volumes[i]). Must contain at least the batch column.

  • batch_col – Column name in covars identifying the scanner/site.

  • mask – A 3D VolumeResource. Voxels where mask > 0 are included in harmonization. Must have the same shape as the input volumes.

  • categorical_cols – Column names in covars for categorical covariates to preserve (e.g. ["sex"]).

  • continuous_cols – Column names in covars for continuous covariates to preserve (e.g. ["age"]).

  • eb – Whether to use Empirical Bayes estimation. Default True.

  • parametric – Whether to use parametric adjustments. Default True.

  • mean_only – If True, adjust only the batch means and leave the batch variances unchanged. Default False.

  • ref_batch – Optional reference batch. Its data is preserved as-is and the other batches are adjusted toward it. Default None (no reference batch).

  • preserve_out_of_mask – If True, voxels outside the mask retain their original values. If False (default), they are set to zero.

Returns:

A tuple of (harmonized_volumes, combat_estimates). The first element is a list of harmonized InMemoryVolumeResources in the same order as the input. The second is a CombatEstimates object containing the fitted model parameters.
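A minimal usage sketch, based only on the signature documented here. The wrapper name harmonize_by_site and the column names "site", "sex" and "age" are assumptions for illustration. The caller is expected to supply already-constructed VolumeResource objects and a matching covariates table.

```python
def harmonize_by_site(volumes, covars, mask):
    """Call harmonize_volumes and unpack its two return values.

    Assumes covars has the (hypothetical) columns "site", "sex" and "age",
    with one row per entry of volumes, in the same order.
    """
    # Imported lazily so this sketch can be loaded without kwneuro installed.
    from kwneuro.harmonize import harmonize_volumes

    harmonized, combat = harmonize_volumes(
        volumes,
        covars,
        batch_col="site",
        mask=mask,
        categorical_cols=["sex"],
        continuous_cols=["age"],
        preserve_out_of_mask=True,  # keep original values outside the mask
    )

    # Output order matches input order, so results pair up by position.
    paired = list(zip(volumes, harmonized))

    # The fitted parameters and run metadata are plain dictionaries.
    return paired, combat.estimates, combat.info
```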

Raises:

ValueError – If inputs are invalid (shape mismatch, missing columns, fewer than 2 batches, empty mask, etc.).
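The failure cases listed above can be checked up front before a long run. The sketch below mirrors them on plain NumPy arrays. The function name check_inputs, the order of the checks and the error messages are assumptions, and the library's own validation may differ. The covars argument only needs to support `in`, indexing by column name and `len`, which holds for a pandas DataFrame as well as for a dict of lists.

```python
import numpy as np


def check_inputs(volumes, covars, batch_col, mask, covariate_cols=()):
    """Raise ValueError for the input problems named in the documentation."""
    mask = np.asarray(mask)
    if mask.ndim != 3:
        raise ValueError(f"mask must be 3D, got {mask.ndim} dimensions")

    # Every volume must share the mask's shape.
    for i, vol in enumerate(volumes):
        if np.shape(vol) != mask.shape:
            raise ValueError(
                f"volume {i} has shape {np.shape(vol)}, expected {mask.shape}"
            )

    # The mask must select at least one voxel.
    if not np.any(mask > 0):
        raise ValueError("mask selects no voxels")

    # The batch column and any covariate columns must be present.
    for col in (batch_col, *covariate_cols):
        if col not in covars:
            raise ValueError(f"column {col!r} not found in covars")

    # One covariate row per volume.
    if len(covars[batch_col]) != len(volumes):
        raise ValueError("covars must have exactly one row per volume")

    # Harmonization needs at least two distinct batches.
    if len(set(covars[batch_col])) < 2:
        raise ValueError("at least 2 batches are required")
```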