Torchio
Torchio dataloader, aimed principally at data augmentation.
Dataset primitives for 3D segmentation datasets. The approach is patch-based, with the whole dataset loaded into memory. Built on Torchio, it is the fastest data-loading method so far.
- class biom3d.datasets.semseg_torchio.LabelToBool(label_name: str)
Transform to convert label data to bool type.
- Variables:
label_name (str) – Name of the label to be transformed.
- class biom3d.datasets.semseg_torchio.LabelToFloat(label_name: str)
Transform to convert label data to float type.
- Variables:
label_name (str) – Name of the label to be transformed.
- class biom3d.datasets.semseg_torchio.LabelToLong(label_name: str)
Transform to convert label data to long type.
- Variables:
label_name (str) – Name of the label to be transformed.
- class biom3d.datasets.semseg_torchio.RandomCropOrPad(*args: Any, **kwargs: Any)
Randomly crop a subject, and pad it if needed.
- Variables:
patch_size (numpy.ndarray[np.uint16]) – Size of a patch.
fg_rate (float) – Foreground rate, if > 0, force the use of foreground.
label_name (str) – Name of the label image in the tio.Subject.
start_fg_idx (int) – Starting index in foreground. Determined by softmax use.
- __init__(patch_shape: ndarray, fg_rate: float = 0, label_name: str = None, use_softmax: bool = True, **kwargs)
Randomly crop a subject, and pad it if needed.
Adapted from tio.data.sampler.PatchSampler.
- Parameters:
patch_shape (numpy.ndarray) – Size of a patch.
fg_rate (float, default=0) – Foreground rate. If > 0, forces the use of foreground; label_name must then be specified.
label_name (str, default=None) – Used with the foreground rate. Name of the label image in the tio.Subject.
use_softmax (bool, default=True) – Used with the foreground rate to determine whether the background should be removed.
**kwargs (dict) – Additional keyword arguments.
- Raises:
ValueError – If any dimension of the patch size is less than 1 or is not an int (or np.integer).
- apply_transform(subject: torchio.Subject) torchio.Subject
Apply patch sampling to the subject, with optional foreground enforcement.
A patch is randomly sampled from the subject. If fg_rate > 0, a random foreground voxel may be used to center the patch, based on the label map. Otherwise, a random valid location is used. If the patch is smaller than patch_size, symmetric padding is applied.
Adapted from tio.data.sampler.UniformSampler.
- Parameters:
subject (Subject) – The subject to transform.
- Returns:
transformed – The subject containing the sampled and padded patch.
- Return type:
Subject
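The sampling logic described for apply_transform can be sketched in plain Python. This is an illustration under stated assumptions, not the biom3d implementation: the helper names `sample_crop_start` and `symmetric_padding`, the `fg_voxels` list of foreground coordinates, and the choice to put the extra voxel of an odd padding on the trailing side are all hypothetical.

```python
import random


def sample_crop_start(image_shape, patch_size, fg_voxels=None, fg_rate=0.0, rng=random):
    """Pick the start index of a patch (hypothetical helper, not biom3d API)."""
    if fg_voxels and rng.random() < fg_rate:
        # Foreground branch: center the patch on a random foreground voxel.
        center = rng.choice(fg_voxels)
        start = [c - p // 2 for c, p in zip(center, patch_size)]
    else:
        # Uniform branch: any start index that keeps the patch inside the image.
        start = [rng.randint(0, max(s - p, 0)) for s, p in zip(image_shape, patch_size)]
    # Clamp so the patch never starts outside the valid range.
    return tuple(
        min(max(i, 0), max(s - p, 0))
        for i, s, p in zip(start, image_shape, patch_size)
    )


def symmetric_padding(cropped_shape, patch_size):
    """Return (before, after) padding per axis needed to reach patch_size."""
    pads = []
    for c, p in zip(cropped_shape, patch_size):
        missing = max(p - c, 0)
        # Assumption: an odd remainder goes to the trailing side.
        pads.append((missing // 2, missing - missing // 2))
    return pads


print(symmetric_padding((10, 5, 8), (10, 8, 8)))  # [(0, 0), (1, 2), (0, 0)]
```

With `fg_rate=1.0` and a single foreground voxel at (50, 50, 50), a 32-voxel patch in a 100-voxel image would start at (34, 34, 34), so the voxel sits at the patch center.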
- crop(subject: torchio.Subject, index_ini: torchio.types.TypeTripletInt, patch_size: torchio.types.TypeTripletInt) torchio.Subject
Crop a patch from the subject at a given position and size.
Copied from tio.data.sampler.PatchSampler.
- Parameters:
subject (Subject) – The subject to crop.
index_ini (TypeTripletInt) – The starting index (x, y, z) of the crop.
patch_size (TypeTripletInt) – The size of the patch to extract (dx, dy, dz).
- Returns:
cropped_subject – The cropped subject with the patch and an updated LOCATION attribute.
- Return type:
Subject
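As a rough illustration of the index arithmetic behind crop, the sketch below derives the per-axis slices and a location record from index_ini and patch_size. The helper name `crop_bounds` is hypothetical, and the layout of the location (start indices followed by end indices) is an assumption about how the LOCATION attribute is filled.

```python
def crop_bounds(index_ini, patch_size):
    """Return per-axis slices and a 6-value location for a patch (hypothetical helper)."""
    # The end index is the start index plus the patch size along each axis.
    index_fin = tuple(i + p for i, p in zip(index_ini, patch_size))
    slices = tuple(slice(i, f) for i, f in zip(index_ini, index_fin))
    # Assumed layout: (x0, y0, z0, x1, y1, z1).
    location = tuple(index_ini) + index_fin
    return slices, location


slices, location = crop_bounds((4, 0, 10), (32, 32, 16))
print(location)  # (4, 0, 10, 36, 32, 26)
```

The returned slices can index any array-like volume that supports tuple slicing, for example `volume[slices]` on a NumPy array.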
- extract_patch(subject: torchio.Subject, index_ini: torchio.types.TypeTripletInt) torchio.Subject
Extract a patch from the given subject starting at a specified index.
- Parameters:
subject (Subject) – The subject to extract the patch from.
index_ini (TypeTripletInt) – The starting index (x, y, z) of the patch.
- Returns:
cropped_subject – The extracted patch as a new subject.
- Return type:
Subject
- class biom3d.datasets.semseg_torchio.TorchIOReaderWrapper(handler: DataHandler)
A wrapper class so TorchIO can use a DataHandler.
- Variables:
handler (DataHandler) – DataHandler used to read data.
- class biom3d.datasets.semseg_torchio.TorchioDataset(*args: Any, **kwargs: Any)
Custom dataset similar to torchio.SubjectsDataset but supports an unlimited number of steps (batches) per epoch.
Handles loading of images, masks, and foreground data, train/validation splitting, optional in-memory data loading, and specific data augmentations.
- Variables:
img_path (str) – Path to the collection containing image files.
msk_path (str) – Path to the collection containing mask files.
fg_path (Optional[str]) – Path to the collection containing foreground data (optional).
batch_size (int) – Batch size for sampling.
patch_size (numpy.ndarray) – Size of the patches to extract.
aug_patch_size (Optional[numpy.ndarray]) – Size of patches used for augmentation (optional). Can be larger than patch_size.
nbof_steps (int) – Number of steps (batches) per epoch.
load_data (bool) – Whether to load all data into memory.
handler (DataHandler) – Data handler for loading images and masks.
train (bool) – Indicates if the dataset is used for training (True) or validation (False).
fnames (list[str]) – List of filenames used depending on training or validation mode.
subjects_list (list[Subject]) – List of TorchIO Subjects created from the files.
use_aug (bool) – Whether data augmentations are enabled.
fg_rate (float) – Foreground inclusion rate to force foreground sampling in patches.
use_softmax (bool) – Whether to use softmax activation; if False, sigmoid is used.
batch_idx (int) – Current batch index for internal tracking.
- __init__(img_path: str, msk_path: str, batch_size: int, patch_size: ndarray, nbof_steps: int, fg_path: str | None = None, folds_csv: str | None = None, fold: int = 0, val_split: float = 0.25, train: bool = True, use_aug: bool = True, aug_patch_size: ndarray | None = None, fg_rate: float = 0.33, load_data: bool = False, use_softmax: bool = True)
Similar to torchio.SubjectsDataset but can be used with an unlimited number of steps.
- Parameters:
img_path (str) – Path to collection containing the image files.
msk_path (str) – Path to collection containing the mask files.
batch_size (int) – Batch size for dataset sampling.
patch_size (numpy.ndarray) – Size of the patches to be used.
nbof_steps (int) – Number of steps (batches) per epoch.
fg_path (str, optional) – Path to collection containing foreground information.
folds_csv (str, optional) – CSV file containing fold information for dataset splitting.
fold (int, default=0) – The current fold number for training/validation splitting.
val_split (float, default=0.25) – Proportion of data to be used for validation.
train (bool, default=True) – If True, use the dataset for training; otherwise, use it for validation.
use_aug (bool, default=True) – If True, apply data augmentation.
aug_patch_size (numpy.ndarray, optional) – Patch size to use for augmented patches.
fg_rate (float, default=0.33) – Foreground rate, used to force foreground inclusion in patches. If > 0, foreground sampling is enforced, which requires some pre-computation (note: prefer using the foreground scheduler).
load_data (bool, default=False) – If True, loads the whole dataset into memory (faster but uses more memory).
use_softmax (bool, default=True) – If True, use softmax activation; otherwise, sigmoid is used.
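The sketch below illustrates how val_split, batch_size and nbof_steps could interact: the file list is split once, and the dataset length is decoupled from the number of files. It is a minimal sketch under assumptions, not the TorchioDataset code: `split_filenames`, `EndlessIndexDataset`, the seeded shuffle, and the modulo wrap-around of indices are all hypothetical.

```python
import random


def split_filenames(fnames, val_split=0.25, train=True, seed=0):
    """Hypothetical split: deterministic shuffle, then hold out val_split of the files."""
    names = sorted(fnames)
    random.Random(seed).shuffle(names)
    n_val = max(int(round(len(names) * val_split)), 1)
    return names[n_val:] if train else names[:n_val]


class EndlessIndexDataset:
    """Reports nbof_steps * batch_size samples, whatever the number of files."""

    def __init__(self, fnames, batch_size, nbof_steps):
        self.fnames = list(fnames)
        self.batch_size = batch_size
        self.nbof_steps = nbof_steps

    def __len__(self):
        # One epoch is nbof_steps batches of batch_size samples.
        return self.nbof_steps * self.batch_size

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Assumption: indices beyond the file count wrap around.
        return self.fnames[idx % len(self.fnames)]


fnames = [f"img_{i}.tif" for i in range(8)]
train_names = split_filenames(fnames, val_split=0.25, train=True)
dataset = EndlessIndexDataset(train_names, batch_size=2, nbof_steps=250)
print(len(train_names), len(dataset))  # 6 500
```

In a real dataset, `__getitem__` would load the file and sample a random patch from it, so revisiting the same file within an epoch still yields different training samples.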