Semseg batchgen
This is a transcription of nnU-Net’s batch generator that has the same API as a dataloader.
Note
This code uses the batchgenerators library, which is not part of Biom3d’s dependencies. You will have to install it separately.
Dataloader based on batchgenerators. Follows the nnU-Net augmentation pipeline.
- class biom3d.datasets.semseg_batchgen.BatchGenDataLoader(*args: Any, **kwargs: Any)[source]¶
Similar to torchio.SubjectsDataset but can be used with an unlimited number of steps.
- Variables:
img_path (str) – Path to collection containing the images.
msk_path (str) – Path to collection containing the masks.
fg_path (str | None) – Path to collection containing foreground information.
batch_size (int) – Size of the batches.
nbof_steps (int) – Number of steps per epoch.
indices (numpy.ndarray) – An array of unsigned integers holding the possible image indices.
current_position (int) – Index of the current image.
was_initialized (bool) – Whether the batch generator has been initialized; used as a safeguard.
- __init__(img_path: str, msk_path: str, batch_size: int, nbof_steps: int, fg_path: str | None = None, folds_csv: str | None = None, fold: int = 0, val_split: float = 0.25, train: bool = True, load_data: bool = False, num_threads_in_mt=12)[source]¶
Similar to torchio.SubjectsDataset but can be used with an unlimited number of steps.
- Parameters:
img_path (str) – Path to collection containing the images.
msk_path (str) – Path to collection containing the masks.
batch_size (int) – Size of the batches.
nbof_steps (int) – Number of steps per epoch.
fg_path (str, optional) – Path to collection containing foreground information.
folds_csv (str, optional) – CSV file containing fold information for dataset splitting.
fold (int, optional) – Current fold number for training/validation splitting.
val_split (float, optional) – Proportion of data to be used for validation.
train (bool, optional) – If True, use the dataset for training; otherwise, use it for validation.
load_data (bool, optional) – If True, loads the whole dataset into memory (faster, but uses more memory). Only compatible with .npy preprocessed images.
num_threads_in_mt (int, optional) – Number of threads in multi-threaded augmentation.
- generate_train_batch() dict[source]¶
Generate a training batch from the dataset.
- Returns:
A dictionary with the following keys:
- ‘data’: List of input data arrays for the batch.
- ‘seg’: List of corresponding segmentation masks.
- ‘loc’: List of class location dictionaries (for foreground sampling, etc.).
- Return type:
dict
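The attributes indices and current_position suggest a cursor that walks an index array and wraps around, which is one way to serve any number of steps per epoch from a finite dataset. The sketch below illustrates that idea only. The class name CyclingIndexSampler and the reshuffle-on-wrap behavior are assumptions made for illustration, not Biom3d's actual implementation.

```python
import random


class CyclingIndexSampler:
    """Hypothetical helper: yields batches of image indices forever by wrapping around."""

    def __init__(self, n_images, batch_size, seed=0):
        self.indices = list(range(n_images))
        self.batch_size = batch_size
        self.current_position = 0
        self._rng = random.Random(seed)
        self._rng.shuffle(self.indices)

    def next_batch(self):
        batch = []
        for _ in range(self.batch_size):
            if self.current_position >= len(self.indices):
                # Assumption: reshuffle and restart once every image has been seen.
                self._rng.shuffle(self.indices)
                self.current_position = 0
            batch.append(self.indices[self.current_position])
            self.current_position += 1
        return batch


# 10 steps of batch size 2 drawn from only 5 images.
sampler = CyclingIndexSampler(n_images=5, batch_size=2)
batches = [sampler.next_batch() for _ in range(10)]
```

With this scheme the number of steps per epoch is independent of the dataset size, which matches the role of nbof_steps described above.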
- class biom3d.datasets.semseg_batchgen.Convert2DTo3DTransform(*args: Any, **kwargs: Any)[source]¶
Reverts Convert3DTo2DTransform by transforming a 4D array (b, c * x, y, z) back to 5D (b, c, x, y, z).
- Variables:
apply_to_keys (list[str] | tuple[str]) – Keys of the data dictionary to convert, default=(‘data’, ‘seg’).
- class biom3d.datasets.semseg_batchgen.Convert3DTo2DTransform(*args: Any, **kwargs: Any)[source]¶
Transforms a 5D array (b, c, x, y, z) to a 4D array (b, c * x, y, z) by overloading the color channel.
- Variables:
apply_to_keys (list[str] | tuple[str]) – Keys of the data dictionary to convert, default=(‘data’, ‘seg’).
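The two conversion transforms amount to a reshape that merges the channel axis with the first spatial axis and then splits them again. A minimal NumPy sketch of that round trip follows. The array names are illustrative, and the real transforms operate on the entries of a data dictionary rather than on bare arrays.

```python
import numpy as np

# Dummy batch with b=2, c=3, x=4, y=5, z=6.
batch_5d = np.arange(2 * 3 * 4 * 5 * 6, dtype=np.float32).reshape(2, 3, 4, 5, 6)
b, c, x, y, z = batch_5d.shape

# 3D -> "2D": fold the x axis into the channel axis.
batch_4d = batch_5d.reshape(b, c * x, y, z)

# "2D" -> 3D: the original channel count must be known to undo the merge.
restored = batch_4d.reshape(b, c, x, y, z)
```

Because the reshape does not reorder memory, the round trip is lossless as long as the original channel count is kept somewhere.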
- class biom3d.datasets.semseg_batchgen.DataReader(*args: Any, **kwargs: Any)[source]¶
Reads the data and adds it to the dictionary.
- Variables:
data_key (str) – Key used to access data in dictionary.
label_key (str) – Key used to access label in dictionary.
loc_key (str) – Key used to access foreground in dictionary.
is3d (bool) – Whether the images are 3D; not used yet.
handler (DataHandler) – DataHandler used to read data.
- __init__(handler: DataHandler, is3d: bool = True, data_key: str = 'data', label_key: str = 'seg', loc_key: str = 'loc')[source]¶
Reads the data and adds it to the dictionary.
- Parameters:
handler (DataHandler) – DataHandler used to read data
is3d (bool) – Whether the images are 3D; not used yet.
data_key (str) – Key used to access data in dictionary.
label_key (str) – Key used to access label in dictionary.
loc_key (str) – Key used to access foreground in dictionary.
- class biom3d.datasets.semseg_batchgen.DictToTuple(*args: Any, **kwargs: Any)[source]¶
Returns data and seg as a tuple instead of a dictionary.
- Variables:
data_key (str) – Key for the input data in the dictionary, default=”data”
label_key (str) – Key for the label/segmentation in the dictionary, default=”seg”
- __init__(data_key: str = 'data', label_key: str = 'seg')[source]¶
Transform that extracts data and seg from a dictionary and returns them as a tuple.
- Parameters:
data_key (str, default="data") – Key for the input data in the dictionary.
label_key (str, default="seg") – Key for the label/segmentation in the dictionary.
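The effect of this transform can be pictured as a plain dictionary lookup that turns a batchgenerators-style batch into an (input, target) pair. The function below is a hypothetical stand-in written for illustration, not the class itself.

```python
def dict_to_tuple(data_dict, data_key="data", label_key="seg"):
    """Hypothetical stand-in: takes a batch dictionary, returns (data, seg)."""
    return data_dict[data_key], data_dict[label_key]


# Toy batch; any extra keys such as 'loc' are simply dropped.
batch = {"data": [[0.1, 0.2]], "seg": [[0, 1]], "loc": [{}]}
data, seg = dict_to_tuple(batch)
```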
- class biom3d.datasets.semseg_batchgen.DownsampleSegForDSTransform2(*args: Any, **kwargs: Any)[source]¶
Transform that generates downsampled versions of a segmentation map for deep supervision.
This transform stores the results in data_dict[output_key] as a list of segmentations, each scaled according to a corresponding entry in ds_scales.
- Variables:
ds_scales (tuple | List) – Scaling factors per deep supervision level. Each entry can be a float (same scaling for all axes) or a tuple of floats (individual scaling per axis).
order (int) – Interpolation order to use for resizing (0 = nearest neighbor).
input_key (str) – Key to access the input segmentation in data_dict.
output_key (str) – Key under which to store the output list of downsampled segmentations.
axes (tuple[int]) – Axes along which to apply the downsampling. If None, assumes axes are (2, 3, 4), i.e., skips batch and channel.
- __init__(ds_scales: list | tuple, order: int = 0, input_key: str = 'seg', output_key: str = 'seg', axes: tuple[int] | None = None)[source]¶
Transform that generates downsampled versions of a segmentation map for deep supervision.
This transform stores the results in data_dict[output_key] as a list of segmentations, each scaled according to a corresponding entry in ds_scales.
Each entry in ds_scales specifies one deep supervision output and its resolution relative to the original data; for example, 0.25 specifies 1/4 of the original shape. ds_scales can also be a tuple of tuples, for example ((1, 1, 1), (0.5, 0.5, 0.5)), to specify the downsampling for each axis independently.
- Parameters:
ds_scales (list or tuple) – Scaling factors per deep supervision level. Each entry can be a float (same scaling for all axes) or a tuple of floats (individual scaling per axis).
order (int, default=0) – Interpolation order to use for resizing (0 = nearest neighbor).
input_key (str, default="seg") – Key to access the input segmentation in data_dict.
output_key (str, default="seg") – Key under which to store the output list of downsampled segmentations.
axes (tuple of int, optional) – Axes along which to apply the downsampling. If None, assumes axes are (2, 3, 4), i.e., skips batch and channel.
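To make the ds_scales convention concrete, here is a simplified sketch that builds the list of downsampled segmentations with nearest-neighbor index selection in NumPy. The helper downsample_nearest is hypothetical, and the actual transform may rely on a dedicated resize routine, so treat this as an illustration of the input and output shapes rather than of the implementation.

```python
import numpy as np


def downsample_nearest(seg, scale, axes=(2, 3, 4)):
    """Nearest-neighbor downsampling of `seg` along `axes` (simplified sketch)."""
    scales = (scale,) * len(axes) if np.isscalar(scale) else tuple(scale)
    out = seg
    for axis, s in zip(axes, scales):
        old = out.shape[axis]
        new = max(1, int(round(old * s)))
        # For each target voxel, pick the source index under its centre.
        idx = np.minimum(((np.arange(new) + 0.5) * old / new).astype(int), old - 1)
        out = np.take(out, idx, axis=axis)
    return out


rng = np.random.default_rng(0)
seg = rng.integers(0, 3, size=(2, 1, 8, 16, 16))  # (b, c, x, y, z)

# One float per level, or one tuple of per-axis factors.
ds_scales = (1, 0.5, (0.5, 0.25, 0.25))
seg_pyramid = [downsample_nearest(seg, s) for s in ds_scales]
shapes = [level.shape for level in seg_pyramid]
```

Selecting indices rather than interpolating keeps label values intact, which is the reason order 0 is the default for segmentations.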
- class biom3d.datasets.semseg_batchgen.MTBatchGenDataLoader(*args: Any, **kwargs: Any)[source]¶
Multi-threaded data loader for efficient data augmentation and loading.
- Variables:
length (int) – Number of batches.
- __init__(img_path: str, msk_path: str, patch_size: Iterable[int], batch_size: int, nbof_steps: int, fg_path: str | None = None, folds_csv: str | None = None, fold: int = 0, val_split: float = 0.25, train: bool = True, load_data: bool = False, fg_rate: float = 0.33, num_threads_in_mt: int = 12, **kwargs)[source]¶
Multi-threaded data loader for efficient data augmentation and loading.
- Parameters:
img_path (str) – Path to a collection containing the images.
msk_path (str) – Path to a collection containing the masks.
patch_size (iterable of int) – The size of the patches to be extracted.
batch_size (int) – Size of the batches.
nbof_steps (int) – Number of steps per epoch.
fg_path (str, optional) – Path to a collection containing foreground information. Despite the optional type, it is currently required: passing None raises a ValueError (to be fixed).
folds_csv (str, optional) – CSV file containing fold information for dataset splitting.
fold (int, default=0) – Current fold number for training/validation splitting.
val_split (float, default=0.25) – Proportion of data to be used for validation.
train (bool, default=True) – If True, use the dataset for training; otherwise, use it for validation.
load_data (bool, default=False) – If True, loads the entire dataset into computer memory.
fg_rate (float, default=0.33) – Foreground rate for cropping.
num_threads_in_mt (int, default=12) – Number of threads in multi-threaded augmentation.
**kwargs – Catch-all for additional parameters.
- Raises:
ValueError: – If fg_path is None
- class biom3d.datasets.semseg_batchgen.RandomCropAndPadTransform(*args: Any, **kwargs: Any)[source]¶
BatchGenerator transform for random cropping and padding.
- Variables:
data_key (str) – Key used to access data in dictionary.
label_key (str) – Key used to access label in dictionary.
fg_rate (float) – Foreground rate, probability of focusing crop on foreground.
crop_size (Iterable[int]) – Size of the crop.
- __init__(crop_size: Iterable[int], fg_rate: float = 0.33, data_key: str = 'data', label_key: str = 'seg')[source]¶
Batch generator transform for random cropping and padding.
- Parameters:
crop_size (iterable of int) – Size of the crop.
fg_rate (float, default=0.33) – Probability of focusing the crop on the foreground.
data_key (str, default="data") – Key for the data in the data dictionary.
label_key (str, default="seg") – Key for the label in the data dictionary.
- biom3d.datasets.semseg_batchgen.centered_crop(img: ndarray, msk: ndarray, center: Iterable[int], crop_shape: Iterable[int], margin: Iterable[float] = array([0., 0., 0.])) tuple[ndarray, ndarray][source]¶
Do a crop, forcing the location voxel to be located in the center of the crop.
- Parameters:
img (numpy.ndarray) – Image data.
msk (numpy.ndarray) – Mask data.
center (iterable of int) – Center voxel location for cropping.
crop_shape (iterable of int) – Shape of the crop.
margin (iterable of float, default=np.zeros(3)) – Margin around the center location.
- Returns:
crop_img (numpy.ndarray) – The cropped image, centered around center.
crop_msk (numpy.ndarray) – The cropped mask, centered around center.
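A centered crop can be sketched in a few lines of NumPy. The version below assumes channel-first arrays of shape (C, X, Y, Z), ignores the margin parameter, and shifts the window back inside the volume near the borders. These choices are assumptions for illustration and may differ from what centered_crop actually does.

```python
import numpy as np


def centered_crop_sketch(img, msk, center, crop_shape):
    """Crop `crop_shape` around `center`; assumes channel-first arrays (C, X, Y, Z)."""
    center = np.asarray(center)
    crop_shape = np.asarray(crop_shape)
    spatial = np.asarray(img.shape[1:])
    # Assumption: near a border the window is shifted so it stays inside the volume.
    start = np.clip(center - crop_shape // 2, 0, np.maximum(spatial - crop_shape, 0))
    window = (slice(None),) + tuple(
        slice(int(s), int(s + c)) for s, c in zip(start, crop_shape)
    )
    return img[window], msk[window]


img = np.zeros((1, 32, 32, 32), dtype=np.float32)
msk = np.zeros((1, 32, 32, 32), dtype=np.uint8)
img[0, 16, 16, 16] = 1.0  # mark the voxel that should end up in the middle

crop_img, crop_msk = centered_crop_sketch(img, msk, center=(16, 16, 16), crop_shape=(8, 8, 8))
```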
- biom3d.datasets.semseg_batchgen.centered_pad(img: ndarray, final_size: ndarray, msk: ndarray | None = None) ndarray | tuple[ndarray, ndarray][source]¶
Center-pad img (and msk, if provided) to fit final_size.
- Parameters:
img (numpy.ndarray) – Image data.
final_size (array_like) – Final size after padding.
msk (numpy.ndarray, optional) – Mask data.
- Returns:
pad_img (numpy.ndarray) – Padded image.
pad_mask (numpy.ndarray, optional) – Padded mask, returned only if msk is provided.
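Centered padding splits the missing voxels of each axis between both sides. A minimal sketch with numpy.pad follows. It assumes channel-first arrays and zero padding, and it handles the image only; both the fill value and the name centered_pad_sketch are assumptions, not details taken from Biom3d.

```python
import numpy as np


def centered_pad_sketch(img, final_size):
    """Zero-pad the spatial axes of a channel-first array up to `final_size`."""
    final_size = np.asarray(final_size)
    spatial = np.asarray(img.shape[1:])
    missing = np.maximum(final_size - spatial, 0)  # axes already large enough get 0
    before = missing // 2
    after = missing - before
    # The channel axis is never padded.
    pad_width = [(0, 0)] + [(int(b), int(a)) for b, a in zip(before, after)]
    return np.pad(img, pad_width, mode="constant", constant_values=0)


img = np.ones((1, 5, 6, 7), dtype=np.float32)
pad_img = centered_pad_sketch(img, final_size=(8, 8, 8))
```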
- biom3d.datasets.semseg_batchgen.configure_rotation_dummy_da_mirroring_and_inital_patch_size(patch_size: Iterable[int]) tuple[dict[str, tuple[float, float]], bool, ndarray, tuple[int, ...]][source]¶
Configure rotation parameters, dummy 2D data augmentation, mirroring axes, and compute the initial patch size.
This function is stupid and certainly one of the weakest spots of this implementation. Not entirely sure how we can fix it.
- Parameters:
patch_size (iterable of int) – Patch size as a tuple, array, list, etc.
- Raises:
RuntimeError: – If patch_size does not have 2 or 3 dimensions.
- Returns:
rotation_for_DA (dict of str to tuple of float) – A rotation for data augmentation.
do_dummy_2d_data_aug (bool) – Whether dummy 2D data augmentation should be used.
initial_patch_size (numpy.ndarray) – Initial patch size to extract before augmentation.
mirror_axes (tuple of int) – Axes along which mirroring is applied.
- biom3d.datasets.semseg_batchgen.foreground_crop(img: ndarray, msk: ndarray, final_size: Iterable[int], fg_margin: Iterable[float], fg: dict[int, ndarray] | None = None, use_softmax: bool = True) tuple[ndarray, ndarray][source]¶
Do a foreground crop.
- Parameters:
img (numpy.ndarray) – Image data.
msk (numpy.ndarray) – Mask data.
final_size (iterable of int) – Final size of the cropped image and mask.
fg_margin (iterable of float) – Margin around the foreground location.
fg (dict of int to numpy.ndarray, optional) – Foreground information.
use_softmax (bool, default=True) – If True, assumes softmax activation.
- Returns:
img (numpy.ndarray) – Cropped image data, focused on the foreground region.
msk (numpy.ndarray) – Cropped mask data, corresponding to the cropped image region.
- biom3d.datasets.semseg_batchgen.get_bbox(patch_size: Iterable[int], final_patch_size: Iterable[int], annotated_classes_key: Hashable, data_shape: ndarray, force_fg: bool, class_locations: dict | None, overwrite_class: int | tuple[int, ...] | None = None, verbose: bool = False) tuple[list[int], list[int]][source]¶
Compute bounding box coordinates for cropping a patch from the data, optionally focusing on foreground regions.
- Parameters:
patch_size (iterable of int) – Desired patch size to crop (dimensions).
final_patch_size (iterable of int) – Current size of the patch after any previous cropping or resizing.
annotated_classes_key (hashable) – Key identifying the annotated class in class_locations.
data_shape (numpy.ndarray) – Shape of the full data volume or image from which the patch is cropped.
force_fg (bool) – If True, ensures the patch contains at least one voxel of foreground classes.
class_locations (dict or None) – Dictionary mapping class labels (int or tuple) to lists/arrays of voxel coordinates for that class. Required if force_fg is True.
overwrite_class (int or tuple of int, optional) – If set, forces the patch to focus on this class instead of randomly selected foreground class.
verbose (bool, default=False) – If True, prints diagnostic messages.
- Raises:
AssertionError: – If class_locations is None while force_fg is True, or if overwrite_class is not in class_locations.
- Returns:
bbox_lbs (list of int) – Lower bounds (start indices) of the bounding box along each dimension.
bbox_ubs (list of int) – Upper bounds (end indices) of the bounding box along each dimension.
Notes
The function calculates how much padding is needed if final_patch_size is smaller than patch_size.
If force_fg is True, it attempts to center the bounding box on a randomly selected voxel of a foreground class.
If no foreground voxel is found, it falls back to random cropping.
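The foreground-forcing behavior described in the notes can be sketched as follows. This simplified version leaves out padding, annotated_classes_key and overwrite_class, and it falls back to a random box instead of asserting when class_locations is missing. The function name and the explicit rng argument are assumptions made for illustration.

```python
import numpy as np


def get_bbox_sketch(patch_size, data_shape, force_fg, class_locations, rng):
    """Return (lower bounds, upper bounds) of a patch, optionally centered on foreground."""
    patch_size = np.asarray(patch_size)
    data_shape = np.asarray(data_shape)
    # Highest valid start index per axis (0 if the data is smaller than the patch).
    max_start = np.maximum(data_shape - patch_size, 0)
    non_empty = [k for k, v in (class_locations or {}).items() if len(v) > 0]
    if force_fg and non_empty:
        # Pick a class, then one of its voxels, and center the box on that voxel.
        cls = non_empty[rng.integers(len(non_empty))]
        voxels = class_locations[cls]
        voxel = np.asarray(voxels[rng.integers(len(voxels))])
        lbs = np.clip(voxel - patch_size // 2, 0, max_start)
    else:
        # No usable foreground: fall back to a uniformly random box.
        lbs = np.array([rng.integers(0, m + 1) for m in max_start])
    ubs = lbs + patch_size
    return [int(v) for v in lbs], [int(v) for v in ubs]


rng = np.random.default_rng(0)
class_locations = {1: np.array([[20, 20, 20]])}  # a single foreground voxel
lbs, ubs = get_bbox_sketch((8, 8, 8), (32, 32, 32), True, class_locations, rng)
```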
- biom3d.datasets.semseg_batchgen.get_patch_size(final_patch_size: list[int] | tuple[int] | ndarray, rot_x: float | tuple[float] | list[float], rot_y: float | tuple[float] | list[float], rot_z: float | tuple[float] | list[float], scale_range: tuple[float] | list[float]) ndarray[source]¶
Compute the required patch size to accommodate rotation and scaling augmentations.
This function determines the maximum patch size needed after applying possible rotations and scaling to ensure that the original patch fits entirely within the transformed space (i.e., no cropping due to rotation).
- Parameters:
final_patch_size (list/tuple/ndarray of int) – The desired final patch size before any augmentations. Should be 2D (for 2D images) or 3D (for volumetric data).
rot_x (float or tuple/list of float) – Rotation angle(s) in radians around the x-axis. If a tuple or list, the maximum absolute value is used.
rot_y (float or tuple/list of float) – Rotation angle(s) in radians around the y-axis. Ignored if input is 2D.
rot_z (float or tuple/list of float) – Rotation angle(s) in radians around the z-axis. Ignored if input is 2D.
scale_range (tuple or list of float) – Range of possible scaling factors applied during augmentation. The minimum value is used to compute the worst-case required patch size.
- Returns:
final_shape – The adjusted patch size that ensures the transformed patch still contains the original field of view, accounting for rotation and scaling.
- Return type:
numpy.ndarray of int
Notes
The maximum allowed rotation is clipped to 90° (π/2 radians) for numerical stability.
The patch size is increased to accommodate potential rotation “corners” that extend beyond the original bounds.
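The enlargement can be illustrated in 2D with the standard library alone. The sketch below assumes that the patch-extent vector is rotated by the worst-case angle, that the larger extent is kept per axis, and that the result is divided by the smallest scale factor. This is a reading of the description above, not a confirmed reproduction of get_patch_size, and the function name is hypothetical.

```python
import math


def get_patch_size_2d_sketch(final_patch_size, rot, scale_range):
    """Simplified 2D version: enlarge a patch so rotation and zoom-out still fit."""
    if isinstance(rot, (tuple, list)):
        rot = max(abs(r) for r in rot)  # worst-case angle of the range
    rot = min(math.pi / 2, rot)  # clip to 90 degrees
    h, w = final_patch_size
    # Rotate the patch-extent vector (h, w) by the worst-case angle ...
    rot_h = abs(h * math.cos(rot) - w * math.sin(rot))
    rot_w = abs(h * math.sin(rot) + w * math.cos(rot))
    # ... keep the larger extent per axis, then compensate for the smallest scale.
    min_scale = min(scale_range)
    return (int(max(h, rot_h) / min_scale), int(max(w, rot_w) / min_scale))


larger = get_patch_size_2d_sketch(
    (64, 64), rot=(-math.pi / 4, math.pi / 4), scale_range=(0.7, 1.4)
)
unchanged = get_patch_size_2d_sketch((64, 64), rot=0.0, scale_range=(1.0, 1.0))
```

Without rotation and with a scale of 1 the patch size is returned unchanged; any rotation or scale factor below 1 can only enlarge it.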
- biom3d.datasets.semseg_batchgen.get_training_transforms(aug_patch_size: ndarray | tuple[int], patch_size: ndarray | tuple[int], fg_rate: float, rotation_for_DA: dict, deep_supervision_scales: list | tuple | None, mirror_axes: tuple[int, ...], handler: DataHandler, do_dummy_2d_data_aug: bool, order_resampling_data: int = 3, order_resampling_seg: int = 1, border_val_seg: int = -1, use_data_reader: bool = True) batchgenerators.transforms.abstract_transforms.AbstractTransform[source]¶
Create a composed transform pipeline for training data augmentation, following the nnU-Net conventions.
- Parameters:
aug_patch_size (numpy.ndarray or tuple of int) – Size of the patch used during augmentation (may be larger than patch_size).
patch_size (numpy.ndarray or tuple of int) – Final cropped patch size used for training.
fg_rate (float) – Probability of cropping patches that contain foreground voxels.
rotation_for_DA (dict) – Dictionary specifying rotation angles for data augmentation. Should contain keys ‘x’, ‘y’, and ‘z’.
deep_supervision_scales (list, tuple or None) – List of scales for deep supervision. Used to downsample segmentation masks accordingly.
mirror_axes (tuple[int, ...]) – Axes along which to apply mirroring (e.g., (0, 1, 2)).
handler (DataHandler) – DataHandler used to load images. Used only if use_data_reader is True
do_dummy_2d_data_aug (bool) – If True, applies dummy 2D data augmentation (by slicing 3D volumes).
order_resampling_data (int, default=3) – Interpolation order used for resampling image data.
order_resampling_seg (int, default=1) – Interpolation order used for resampling segmentation masks.
border_val_seg (int, default=-1) – Border value used for segmentation padding.
use_data_reader (bool, default=True) – If True, includes the DataReader transform in the pipeline.
- Returns:
A composed transformation pipeline to be applied to training data.
- Return type:
AbstractTransform
- biom3d.datasets.semseg_batchgen.get_validation_transforms(patch_size: ndarray | tuple[int], fg_rate: float, handler: DataHandler, deep_supervision_scales: list | tuple | None = None, use_data_reader: bool = True) batchgenerators.transforms.abstract_transforms.AbstractTransform[source]¶
Create a composed transformation pipeline for validation data, following the nnU-Net conventions.
- Parameters:
patch_size (numpy.ndarray or tuple of int) – Size of the patch used for cropping and padding.
fg_rate (float) – Probability of focusing on foreground regions when cropping.
handler (DataHandler) – DataHandler used to load images. Used only if use_data_reader is True
deep_supervision_scales (list, tuple or None, optional) – List of scales for deep supervision. If provided, segmentation masks will be downsampled accordingly.
use_data_reader (bool, default=True) – If True, includes the DataReader transform to load data from disk.
- Returns:
A composed transform pipeline to be applied during validation.
- Return type:
AbstractTransform
- biom3d.datasets.semseg_batchgen.imread(handler: DataHandler, img: str, msk: str, loc: str | None = None, is3d: bool = True) tuple[ndarray, ndarray, ndarray | None][source]¶
Read all data with the provided DataHandler.
- Parameters:
handler (DataHandler) – The DataHandler used to read data.
img (str) – The path to the image.
msk (str) – The path to the mask.
loc (str, optional) – The path to the foreground. If None, no foreground will be returned.
is3d (bool, default=True) – If image is in 3D
- Returns:
img (numpy.ndarray) – The image.
msk (numpy.ndarray) – The mask.
fg (numpy.ndarray, optional) – The foreground, or None.
- biom3d.datasets.semseg_batchgen.located_crop(img: ndarray, msk: ndarray, location: Iterable[int], crop_shape: Iterable[int], margin: Iterable[float] = array([0., 0., 0.])) tuple[ndarray, ndarray][source]¶
Do a crop, forcing the location voxel to be located in the crop.
- Parameters:
img (numpy.ndarray) – Image data.
msk (numpy.ndarray) – Mask data.
location (iterable of int) – Specific voxel location to include in the crop.
crop_shape (iterable of int) – Shape of the crop.
margin (iterable of float, default=np.zeros(3)) – Margin around the location.
- Returns:
crop_img (numpy.ndarray) – Cropped image data, containing the specified location voxel within the crop.
crop_msk (numpy.ndarray) – Cropped mask data, corresponding to the cropped image region.
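Unlike a centered crop, a located crop only requires the voxel to fall somewhere inside the window. One way to obtain this, sketched below, is to draw the window start uniformly from the range of starts that still contain the voxel. The sketch assumes channel-first arrays, ignores margin, and takes an explicit rng; all three are assumptions for illustration.

```python
import numpy as np


def located_crop_sketch(img, msk, location, crop_shape, rng):
    """Random crop of `crop_shape` that is guaranteed to contain `location`."""
    location = np.asarray(location)
    crop_shape = np.asarray(crop_shape)
    spatial = np.asarray(img.shape[1:])
    # Valid starts satisfy: start <= location < start + crop_shape, inside the volume.
    lower = np.maximum(location - crop_shape + 1, 0)
    upper = np.minimum(location, np.maximum(spatial - crop_shape, 0))
    start = np.array([rng.integers(lo, hi + 1) for lo, hi in zip(lower, upper)])
    window = (slice(None),) + tuple(
        slice(int(s), int(s + c)) for s, c in zip(start, crop_shape)
    )
    return img[window], msk[window]


rng = np.random.default_rng(0)
img = np.zeros((1, 20, 20, 20), dtype=np.float32)
msk = np.zeros((1, 20, 20, 20), dtype=np.uint8)
img[0, 3, 10, 18] = 1.0  # the voxel that must stay inside the crop

crop_img, crop_msk = located_crop_sketch(img, msk, (3, 10, 18), (8, 8, 8), rng)
```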
- class biom3d.datasets.semseg_batchgen.nnUNetRandomCropAndPadTransform(*args: Any, **kwargs: Any)[source]¶
Random cropping and padding transform for nnU-Net-style data augmentation.
Applies random crop centered around a foreground voxel with a certain probability (fg_rate), and pads the data and label to the desired augmented crop size.
- Variables:
aug_crop_size (Iterable[int]) – Final shape after cropping and padding (target shape).
crop_size (Iterable[int]) – Crop size for network input (may differ from aug_crop_size).
fg_rate (float) – Probability of forcing the crop to focus on the foreground class.
data_key (str) – Key for the input data in the data dictionary.
label_key (str) – Key for the segmentation labels in the data dictionary.
class_loc_key (str) – Key for the precomputed voxel locations per class in the data dictionary.
- __init__(aug_crop_size: Iterable[int], crop_size: Iterable[int], fg_rate: float = 0.33, data_key: str = 'data', label_key: str = 'seg', class_loc_key: str = 'loc')[source]¶
Random cropping and padding transform for nnU-Net-style data augmentation.
- Parameters:
aug_crop_size (iterable of int) – Final shape after cropping and padding (target shape).
crop_size (iterable of int) – Crop size for network input (may differ from aug_crop_size).
fg_rate (float, default=0.33) – Probability of forcing the crop to focus on the foreground class.
data_key (str, default="data") – Key for the input data in the data dictionary.
label_key (str, default="seg") – Key for the segmentation labels in the data dictionary.
class_loc_key (str, default="loc") – Key for the precomputed voxel locations per class in the data dictionary.
- biom3d.datasets.semseg_batchgen.random_crop(img: ndarray, msk: ndarray, crop_shape: Iterable[int]) tuple[ndarray, ndarray][source]¶
Randomly crop a region of shape crop_shape from the image and the mask.
- Parameters:
img (numpy.ndarray) – Image data.
msk (numpy.ndarray) – Mask data.
crop_shape (array_like) – Shape of the crop.
- Raises:
AssertionError: – If img and crop_shape do not have the same number of dimensions.
- Returns:
crop_img (numpy.ndarray) – Cropped image data.
crop_msk (numpy.ndarray) – Cropped mask data.
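A random crop that keeps image and mask aligned can be sketched as below. The sketch assumes channel-first arrays, so the dimension check compares crop_shape against the spatial axes only, and it takes an explicit rng. Both points are assumptions, as is the function name.

```python
import numpy as np


def random_crop_sketch(img, msk, crop_shape, rng):
    """Apply the same random spatial window to a channel-first image and mask."""
    spatial = np.asarray(img.shape[1:])
    crop_shape = np.asarray(crop_shape)
    assert len(spatial) == len(crop_shape), (
        "img and crop_shape must have the same number of spatial dimensions"
    )
    # rng.integers excludes the upper bound, hence the +1.
    start = rng.integers(0, np.maximum(spatial - crop_shape, 0) + 1)
    window = (slice(None),) + tuple(
        slice(int(s), int(s + c)) for s, c in zip(start, crop_shape)
    )
    return img[window], msk[window]


rng = np.random.default_rng(0)
img = np.arange(16 ** 3).reshape(1, 16, 16, 16)
msk = img.copy()  # identical content makes the alignment easy to verify

crop_img, crop_msk = random_crop_sketch(img, msk, (8, 8, 8), rng)
```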
- biom3d.datasets.semseg_batchgen.random_crop_pad(img: ndarray, msk: ndarray, final_size: Iterable[int], fg_rate: float = 0.33, fg_margin: Iterable[float] = array([0., 0., 0.]), fg: dict[str, ndarray] | None = None, use_softmax: bool = True) tuple[ndarray, ndarray][source]¶
Random crop and pad if needed.
- Parameters:
img (numpy.ndarray) – Image data.
msk (numpy.ndarray) – Mask data.
final_size (iterable of int) – Final size after cropping and padding.
fg_rate (float, default=0.33) – Probability of focusing the crop on the foreground.
fg_margin (iterable of float, optional) – Margin around the foreground location.
fg (dict of int to numpy.ndarray, optional) – Foreground information.
use_softmax (bool, default=True) – If True, assumes softmax activation; otherwise sigmoid is used.
- Returns:
img (numpy.ndarray) – Cropped and padded image data.
msk (numpy.ndarray) – Cropped and padded mask data.
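The role of fg_rate can be summarized as a coin flip between two cropping strategies. The helper below is hypothetical and shows only that decision, under the assumption that a foreground crop is attempted with probability fg_rate whenever foreground information is available and that a plain random crop is used otherwise.

```python
import random


def choose_crop_mode(fg_rate, has_foreground, rng):
    """Return 'foreground' with probability fg_rate if foreground exists, else 'random'."""
    if has_foreground and rng.random() < fg_rate:
        return "foreground"
    return "random"


rng = random.Random(0)
modes = [choose_crop_mode(0.33, True, rng) for _ in range(1000)]
fg_fraction = modes.count("foreground") / len(modes)  # close to 0.33 on average
```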