Torchio

TorchIO dataloader, aimed principally at data augmentation.

Dataset primitives for 3D segmentation datasets. The approach is patch-based, can load the whole dataset into memory, and is built on TorchIO. It is the fastest data-loading method so far.

class biom3d.datasets.semseg_torchio.LabelToBool(label_name: str)[source]

Transform to convert label data to bool type.

Variables:

label_name (str) – Name of the label to be transformed.

__init__(label_name: str)[source]

Transform to convert label data to bool type.

Parameters:

label_name (str) – Name of the label to be transformed.

class biom3d.datasets.semseg_torchio.LabelToFloat(label_name: str)[source]

Transform to convert label data to float type.

Variables:

label_name (str) – Name of the label to be transformed.

__init__(label_name: str)[source]

Transform to convert label data to float type.

Parameters:

label_name (str) – Name of the label to be transformed.

class biom3d.datasets.semseg_torchio.LabelToLong(label_name: str)[source]

Transform to convert label data to long type.

Variables:

label_name (str) – Name of the label to be transformed.

__init__(label_name: str)[source]

Transform to convert label data to long type.

Parameters:

label_name (str) – Name of the label to be transformed.
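The three label transforms above differ only in the target data type. As a rough illustration, the sketch below applies the equivalent conversions to a NumPy array that stands in for the label tensor stored under label_name in a tio.Subject. It does not use the biom3d classes themselves, and the array values are made up for the example.

```python
import numpy as np

# Stand-in for the label data stored under `label_name` in a subject.
label = np.array([[0, 1], [2, 0]], dtype=np.uint8)

as_bool = label.astype(bool)         # LabelToBool analogue: non-zero becomes True
as_float = label.astype(np.float32)  # LabelToFloat analogue
as_long = label.astype(np.int64)     # LabelToLong analogue: 64-bit integers

print(as_bool.dtype, as_float.dtype, as_long.dtype)  # bool float32 int64
```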

class biom3d.datasets.semseg_torchio.RandomCropOrPad(*args: Any, **kwargs: Any)[source]

Randomly crop a subject, and pad it if needed.

Variables:
  • patch_size (numpy.ndarray[np.uint16]) – Size of the patch to extract.

  • fg_rate (float) – Foreground rate, if > 0, force the use of foreground.

  • label_name (str) – Name of the label image in the tio.Subject.

  • start_fg_idx (int) – Starting index in foreground. Determined by softmax use.

__init__(patch_shape: ndarray, fg_rate: float = 0, label_name: str = None, use_softmax: bool = True, **kwargs)[source]

Randomly crop a subject, and pad it if needed.

Adapted from tio.data.sampler.PatchSampler.

Parameters:
  • patch_shape (numpy.ndarray) – Size of a patch. Stored as the patch_size attribute.

  • fg_rate (float, default=0) – Foreground rate. If > 0, foreground sampling is enabled and label_name must be specified.

  • label_name (str, default=None) – Used with the foreground rate. Name of the label image in the tio.Subject.

  • use_softmax (bool, default=True) – Used with the foreground rate to determine whether the background should be excluded.

  • **kwargs (dict) – Additional keyword arguments.

Raises:

ValueError – If any dimension of patch_size is < 1 or is not an int (or np.integer).
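The validation rule described above can be sketched as follows. This is a minimal illustration, not the library's code: the helper name check_patch_size and the error message are hypothetical.

```python
import numpy as np

def check_patch_size(patch_size):
    """Return patch_size as a uint16 array, or raise ValueError.

    Hypothetical helper mirroring the documented rule: every dimension
    must be an int (or np.integer) and at least 1.
    """
    for n in patch_size:
        if not isinstance(n, (int, np.integer)) or n < 1:
            raise ValueError(
                f"Patch dimensions must be positive integers, not {patch_size}"
            )
    return np.asarray(patch_size, dtype=np.uint16)

print(check_patch_size((64, 64, 32)))
```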

apply_transform(subject: torchio.Subject) torchio.Subject[source]

Apply patch sampling to the subject, with optional foreground enforcement.

A patch is randomly sampled from the subject. If fg_rate > 0, a random foreground voxel may be used to center the patch, based on the label map. Otherwise, a random valid location is used. If the patch is smaller than patch_size, symmetric padding is applied.

Adapted from tio.data.sampler.UniformSampler

Parameters:

subject (Subject) – The subject to transform.

Returns:

transformed – The subject containing the sampled and padded patch.

Return type:

Subject
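The sampling logic described for apply_transform can be illustrated on a plain 3D NumPy array. The sketch below is an approximation written for this page, not the biom3d implementation: the function name, the fg_voxels argument (an array of foreground coordinates) and the zero-valued padding are all assumptions.

```python
import numpy as np

def random_crop_or_pad(img, patch_size, fg_voxels=None, fg_rate=0.0, rng=None):
    """Sample a patch of shape `patch_size` from a 3D array `img`.

    fg_voxels is an optional (N, 3) integer array of foreground
    coordinates (hypothetical input, e.g. from np.argwhere(mask > 0)).
    """
    rng = np.random.default_rng() if rng is None else rng
    patch_size = np.asarray(patch_size)
    shape = np.asarray(img.shape)
    max_start = np.maximum(shape - patch_size, 0)

    use_fg = fg_voxels is not None and len(fg_voxels) > 0 and rng.random() < fg_rate
    if use_fg:
        # Center the patch on a randomly chosen foreground voxel.
        center = fg_voxels[rng.integers(len(fg_voxels))]
        start = center - patch_size // 2
    else:
        # Otherwise pick any valid start index.
        start = rng.integers(0, max_start + 1)
    # Keep the patch inside the image bounds.
    start = np.clip(start, 0, max_start)

    stop = np.minimum(start + patch_size, shape)
    patch = img[tuple(slice(a, b) for a, b in zip(start, stop))]

    # Symmetric padding along axes where the image is smaller than the patch.
    missing = patch_size - np.asarray(patch.shape)
    pad = [(int(m) // 2, int(m) - int(m) // 2) for m in missing]
    return np.pad(patch, pad, mode="constant")

img = np.arange(4 * 5 * 6).reshape(4, 5, 6)
patch = random_crop_or_pad(img, (8, 3, 3), rng=np.random.default_rng(0))
print(patch.shape)  # (8, 3, 3)
```

With fg_rate=1.0 and a non-empty fg_voxels array, every sampled patch in this sketch contains the chosen foreground voxel, provided the patch fits inside the image.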

crop(subject: torchio.Subject, index_ini: torchio.types.TypeTripletInt, patch_size: torchio.types.TypeTripletInt) torchio.Subject[source]

Crop a patch from the subject at a given position and size.

Copied from tio.data.sampler.PatchSampler.

Parameters:
  • subject (Subject) – The subject to crop.

  • index_ini (TypeTripletInt) – The starting index (x, y, z) of the crop.

  • patch_size (TypeTripletInt) – The size of the patch to extract (dx, dy, dz).

Returns:

cropped_subject – The cropped subject with the patch and an updated LOCATION attribute.

Return type:

Subject
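On a raw array, the crop amounts to slicing from index_ini to index_ini + patch_size along each spatial axis. The sketch below shows this on a NumPy array with a leading channel axis, matching TorchIO's (C, W, H, D) layout. It is a simplified stand-in that ignores the Subject wrapper and the LOCATION attribute; the function name is hypothetical.

```python
import numpy as np

def crop_array(volume, index_ini, patch_size):
    """Return the sub-volume starting at index_ini with extent patch_size.

    `volume` has shape (C, W, H, D); the channel axis is left untouched.
    """
    i0, j0, k0 = (int(v) for v in index_ini)
    di, dj, dk = (int(v) for v in patch_size)
    return volume[:, i0:i0 + di, j0:j0 + dj, k0:k0 + dk]

volume = np.zeros((1, 32, 32, 16), dtype=np.float32)
patch = crop_array(volume, index_ini=(4, 8, 2), patch_size=(16, 16, 8))
print(patch.shape)  # (1, 16, 16, 8)
```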

extract_patch(subject: torchio.Subject, index_ini: torchio.types.TypeTripletInt) torchio.Subject[source]

Extract a patch from the given subject starting at a specified index.

Parameters:
  • subject (Subject) – The subject to extract the patch from.

  • index_ini (TypeTripletInt) – The starting index (x, y, z) of the patch.

Returns:

cropped_subject – The extracted patch as a new subject.

Return type:

Subject

class biom3d.datasets.semseg_torchio.TorchIOReaderWrapper(handler: DataHandler)[source]

A wrapper class so TorchIO can use a DataHandler.

Variables:

handler (DataHandler) – DataHandler used to read data.

__init__(handler: DataHandler)[source]

Initialize the wrapper.

Parameters:

handler (DataHandler) – DataHandler used to read data.
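TorchIO image classes accept a reader callable that takes a path and returns the image data together with a 4x4 affine matrix. A wrapper of this kind can be sketched as below. Everything here is an assumption made for illustration: DummyHandler and its load method are hypothetical stand-ins for a DataHandler, the identity affine is a placeholder, and the real wrapper may behave differently.

```python
import numpy as np

class DummyHandler:
    """Hypothetical stand-in for a DataHandler; `load` is an assumed method name."""

    def load(self, path):
        # Pretend to read a 3D volume and its metadata from `path`.
        return np.zeros((16, 16, 8), dtype=np.float32), {}

class ReaderWrapper:
    """Callable with the (path) -> (data, affine) shape expected of a reader."""

    def __init__(self, handler):
        self.handler = handler

    def __call__(self, path):
        data, _meta = self.handler.load(path)
        if data.ndim == 3:
            data = data[np.newaxis]  # add the channel axis: (C, W, H, D)
        affine = np.eye(4)           # placeholder affine (assumption)
        return data, affine

reader = ReaderWrapper(DummyHandler())
data, affine = reader("image_0001")
print(data.shape, affine.shape)  # (1, 16, 16, 8) (4, 4)
```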

class biom3d.datasets.semseg_torchio.TorchioDataset(*args: Any, **kwargs: Any)[source]

Custom dataset similar to torchio.SubjectsDataset but supports an unlimited number of steps (batches) per epoch.

Handles loading of images, masks, and foreground data, train/validation splitting, optional in-memory data loading, and specific data augmentations.

Variables:
  • img_path (str) – Path to the collection containing image files.

  • msk_path (str) – Path to the collection containing mask files.

  • fg_path (Optional[str]) – Path to the collection containing foreground data (optional).

  • batch_size (int) – Batch size for sampling.

  • patch_size (numpy.ndarray) – Size of the patches to extract.

  • aug_patch_size (Optional[numpy.ndarray]) – Size of the patches used for augmentation (optional). Can be larger than patch_size.

  • nbof_steps (int) – Number of steps (batches) per epoch.

  • load_data (bool) – Whether to load all data into memory.

  • handler (DataHandler) – Data handler for loading images and masks.

  • train (bool) – Indicates if the dataset is used for training (True) or validation (False).

  • fnames (list[str]) – List of filenames used depending on training or validation mode.

  • subjects_list (list[Subject]) – List of TorchIO Subjects created from the files.

  • use_aug (bool) – Whether data augmentations are enabled.

  • fg_rate (float) – Foreground inclusion rate to force foreground sampling in patches.

  • use_softmax (bool) – Whether to use softmax activation; if False, sigmoid is used.

  • batch_idx (int) – Current batch index for internal tracking.
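One way to decouple the number of steps per epoch from the number of files is to define the dataset length from nbof_steps and batch_size and to cycle through the file list with a modulo. The sketch below illustrates that idea in plain Python. It is an assumption about the general technique, not a copy of TorchioDataset, and the class name is hypothetical.

```python
class CyclingDataset:
    """Toy dataset whose length does not depend on the number of files."""

    def __init__(self, fnames, batch_size, nbof_steps):
        self.fnames = list(fnames)
        self.batch_size = batch_size
        self.nbof_steps = nbof_steps

    def __len__(self):
        # One epoch yields nbof_steps batches of batch_size samples.
        return self.nbof_steps * self.batch_size

    def __getitem__(self, idx):
        # Wrap around the file list so any index is valid.
        return self.fnames[idx % len(self.fnames)]

ds = CyclingDataset(["a.tif", "b.tif", "c.tif"], batch_size=2, nbof_steps=6)
print(len(ds), ds[0], ds[4])  # 12 a.tif b.tif
```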

__init__(img_path: str, msk_path: str, batch_size: int, patch_size: ndarray, nbof_steps: int, fg_path: str | None = None, folds_csv: str | None = None, fold: int = 0, val_split: float = 0.25, train: bool = True, use_aug: bool = True, aug_patch_size: ndarray | None = None, fg_rate: float = 0.33, load_data: bool = False, use_softmax: bool = True)[source]

Similar to torchio.SubjectsDataset, but can be used with an unlimited number of steps.

Parameters:
  • img_path (str) – Path to collection containing the image files.

  • msk_path (str) – Path to collection containing the mask files.

  • batch_size (int) – Batch size for dataset sampling.

  • patch_size (numpy.ndarray) – Size of the patches to be used.

  • nbof_steps (int) – Number of steps (batches) per epoch.

  • fg_path (str, optional) – Path to collection containing foreground information.

  • folds_csv (str, optional) – CSV file containing fold information for dataset splitting.

  • fold (int, default=0) – The current fold number for training/validation splitting.

  • val_split (float, default=0.25) – Proportion of data to be used for validation.

  • train (bool, default=True) – If True, use the dataset for training; otherwise, use it for validation.

  • use_aug (bool, default=True) – If True, apply data augmentation.

  • aug_patch_size (numpy.ndarray, optional) – Patch size to use for augmented patches.

  • fg_rate (float, default=0.33) – Foreground rate, used to force foreground inclusion in patches. If > 0, foreground sampling is enabled, which requires some pre-computations (note: prefer the foreground scheduler).

  • load_data (bool, default=False) – If True, loads the whole dataset into memory (faster, but uses more memory).

  • use_softmax (bool, default=True) – If True, use softmax activation; otherwise, sigmoid is used.
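To make the interaction between val_split and train concrete, the sketch below splits a list of filenames into training and validation subsets. It is only an illustration of a proportional split: the function name, the seeded shuffle and the minimum of one validation file are assumptions, and the folds_csv path is not covered.

```python
import random

def split_train_val(fnames, val_split=0.25, train=True, seed=0):
    """Return the training or validation subset of `fnames` (hypothetical helper)."""
    fnames = sorted(fnames)
    random.Random(seed).shuffle(fnames)  # deterministic shuffle
    n_val = max(1, int(len(fnames) * val_split))
    return fnames[n_val:] if train else fnames[:n_val]

files = [f"img_{i}.tif" for i in range(8)]
train_files = split_train_val(files, val_split=0.25, train=True)
val_files = split_train_val(files, val_split=0.25, train=False)
print(len(train_files), len(val_files))  # 6 2
```

Because the shuffle is seeded, the training and validation instances see complementary subsets as long as they use the same seed and val_split.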