topobench.data.preprocessor package
Submodules
topobench.data.preprocessor.preprocessor module
Preprocessor for datasets.
- class topobench.data.preprocessor.preprocessor.PreProcessor(dataset, data_dir, transforms_config=None, **kwargs)
Bases: InMemoryDataset
Preprocessor for datasets.
- Parameters:
- dataset : list
List of data objects.
- data_dir : str
Path to the directory containing the data.
- transforms_config : DictConfig, optional
Configuration parameters for the transforms (default: None).
- **kwargs : optional
Optional additional arguments.
- instantiate_pre_transform(data_dir, transforms_config) -> Compose
Instantiate the pre-transforms.
- Parameters:
- data_dir : str
Path to the directory containing the data.
- transforms_config : DictConfig
Configuration parameters for the transforms.
- Returns:
- torch_geometric.transforms.Compose
Pre-transform object.
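As a rough illustration of what instantiating pre-transforms from a configuration can look like, the sketch below builds one composed callable from a mapping of transform settings. It is not TopoBench's implementation: `TRANSFORM_REGISTRY`, `AddConstant`, `Scale`, `build_pre_transform`, and the `transform_name` key are hypothetical, and a small plain-Python `Compose` stands in for `torch_geometric.transforms.Compose` so the example runs without extra dependencies.

```python
from typing import Any, Callable


class Compose:
    """Stand-in for torch_geometric.transforms.Compose: applies transforms in order."""

    def __init__(self, transforms: list[Callable[[Any], Any]]) -> None:
        self.transforms = transforms

    def __call__(self, data: Any) -> Any:
        for transform in self.transforms:
            data = transform(data)
        return data


class AddConstant:
    """Hypothetical transform: add a constant to every value."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __call__(self, data: list[float]) -> list[float]:
        return [x + self.value for x in data]


class Scale:
    """Hypothetical transform: multiply every value by a factor."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def __call__(self, data: list[float]) -> list[float]:
        return [x * self.factor for x in data]


# Hypothetical registry mapping configured names to transform classes.
TRANSFORM_REGISTRY: dict[str, type] = {"AddConstant": AddConstant, "Scale": Scale}


def build_pre_transform(transforms_config: dict[str, dict[str, Any]]) -> Compose:
    """Instantiate each configured transform and chain them in config order."""
    transforms = []
    for params in transforms_config.values():
        kwargs = dict(params)  # copy so the caller's config is not mutated
        cls = TRANSFORM_REGISTRY[kwargs.pop("transform_name")]
        transforms.append(cls(**kwargs))
    return Compose(transforms)


config = {
    "shift": {"transform_name": "AddConstant", "value": 1.0},
    "scale": {"transform_name": "Scale", "factor": 2.0},
}
pre_transform = build_pre_transform(config)
print(pre_transform([1.0, 2.0]))  # prints [4.0, 6.0]
```

Transforms run in the order they appear in the configuration, so reordering the entries can change the result.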
- load(path: str) -> None
Load the dataset from the given file path.
- Parameters:
- path : str
The path to the processed data.
- load_dataset_splits(split_params) -> tuple[DataloadDataset, DataloadDataset | None, DataloadDataset | None]
Load the dataset splits.
- Parameters:
- split_params : dict
Parameters for loading the dataset splits.
- Returns:
- tuple
A tuple containing the train, validation, and test datasets.
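The return type above allows the validation and test entries to be None. The sketch below shows one way such a tuple could be produced from a dictionary of split parameters. It is only an illustration under stated assumptions: `random_split` and the keys `data_seed`, `train_prop`, and `val_prop` are hypothetical and are not taken from the TopoBench source, and plain lists stand in for `DataloadDataset`.

```python
import random
from typing import Optional, Sequence


def random_split(
    dataset: Sequence, split_params: dict
) -> tuple[list, Optional[list], Optional[list]]:
    """Shuffle indices with a fixed seed and cut them by proportion."""
    n = len(dataset)
    indices = list(range(n))
    random.Random(split_params["data_seed"]).shuffle(indices)

    n_train = int(split_params["train_prop"] * n)
    n_val = int(split_params.get("val_prop", 0.0) * n)

    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]  # whatever remains goes to test

    def subset(idx: list[int]) -> Optional[list]:
        # An empty split is reported as None, mirroring the optional entries.
        return [dataset[i] for i in idx] if idx else None

    train = [dataset[i] for i in train_idx]  # the train split is always returned
    return train, subset(val_idx), subset(test_idx)


data = list(range(10))
train, val, test = random_split(
    data, {"data_seed": 0, "train_prop": 0.6, "val_prop": 0.2}
)
print(len(train), len(val), len(test))  # prints 6 2 2
```

Fixing the seed makes the split reproducible across runs, which matters when processed data is cached on disk.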
- property processed_dir: str
Return the path to the processed directory.
- Returns:
- str
Path to the processed directory.
- property processed_file_names: str
Return the name of the processed file.
- Returns:
- str
Name of the processed file.
- set_processed_data_dir(pre_transforms_dict, data_dir, transforms_config) -> None
Set the processed data directory.
- Parameters:
- pre_transforms_dict : dict
Dictionary containing the pre-transforms.
- data_dir : str
Path to the directory containing the data.
- transforms_config : DictConfig
Configuration parameters for the transforms.
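A common reason to derive the processed directory from the transform configuration is to keep data processed under different settings in separate locations. The sketch below illustrates that idea with a hash of the configuration. It assumes, rather than documents, how the directory is chosen: `processed_dir_for`, the directory layout, and the hashing scheme are hypothetical and may differ from what `set_processed_data_dir` actually does.

```python
import hashlib
import json
import os


def processed_dir_for(data_dir: str, transforms_config: dict) -> str:
    """Build a directory path that changes whenever the transform settings change."""
    # Readable prefix: the configured transform names joined together.
    group_name = "_".join(transforms_config.keys())
    # Stable hash: sort keys so dictionary ordering does not affect the digest.
    payload = json.dumps(transforms_config, sort_keys=True).encode("utf-8")
    params_hash = hashlib.sha256(payload).hexdigest()[:16]
    return os.path.join(data_dir, group_name, params_hash)


config_a = {"lifting": {"complex_dim": 2}}
config_b = {"lifting": {"complex_dim": 3}}

print(processed_dir_for("datasets/example", config_a))
print(processed_dir_for("datasets/example", config_b))
```

With this kind of scheme, changing any transform parameter yields a new directory, so previously processed data is neither overwritten nor silently reused.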
Module contents
Init file for Preprocessor module.
- class topobench.data.preprocessor.PreProcessor(dataset, data_dir, transforms_config=None, **kwargs)
Bases: InMemoryDataset
Preprocessor for datasets. This is the same class as topobench.data.preprocessor.preprocessor.PreProcessor, exposed at package level; its parameters, methods, and properties are identical to the entry above.