topobench.data.preprocessor package

Init file for Preprocessor module.

class PreProcessor(dataset, data_dir, transforms_config=None, **kwargs)

Bases: InMemoryDataset

Preprocessor for datasets.

Parameters:
dataset : list

List of data objects.

data_dir : str

Path to the directory containing the data.

transforms_config : DictConfig, optional

Configuration parameters for the transforms (default: None).

**kwargs : optional

Optional additional arguments.

__init__(dataset, data_dir, transforms_config=None, **kwargs)
instantiate_pre_transform(data_dir, transforms_config)

Instantiate the pre-transforms.

Parameters:
data_dir : str

Path to the directory containing the data.

transforms_config : DictConfig

Configuration parameters for the transforms.

Returns:
torch_geometric.transforms.Compose

Pre-transform object.
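
The idea of turning a transforms configuration into a single composed pre-transform can be illustrated with a small, library-free sketch. It is not TopoBench's implementation: `ComposeSketch` only mimics the apply-in-order behaviour of `torch_geometric.transforms.Compose`, and the factory registry, the `scale` and `shift` transforms, and the plain-dict configuration are all hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict, List

Data = Dict[str, Any]
Transform = Callable[[Data], Data]


class ComposeSketch:
    """Stand-in for a composed transform: applies each transform in order."""

    def __init__(self, transforms: List[Transform]) -> None:
        self.transforms = list(transforms)

    def __call__(self, data: Data) -> Data:
        for transform in self.transforms:
            data = transform(data)
        return data


def make_scale(factor: float) -> Transform:
    """Hypothetical transform factory: multiply every feature by `factor`."""
    def scale(data: Data) -> Data:
        return {**data, "x": [v * factor for v in data["x"]]}
    return scale


def make_shift(offset: float) -> Transform:
    """Hypothetical transform factory: add `offset` to every feature."""
    def shift(data: Data) -> Data:
        return {**data, "x": [v + offset for v in data["x"]]}
    return shift


# Hypothetical registry mapping config keys to transform factories.
TRANSFORM_FACTORIES = {"scale": make_scale, "shift": make_shift}


def instantiate_pre_transform_sketch(
    transforms_config: Dict[str, Dict[str, Any]]
) -> ComposeSketch:
    """Build one composed transform from a name -> parameters mapping."""
    transforms = [
        TRANSFORM_FACTORIES[name](**params)
        for name, params in transforms_config.items()
    ]
    return ComposeSketch(transforms)


config = {"scale": {"factor": 2.0}, "shift": {"offset": 1.0}}
pre_transform = instantiate_pre_transform_sketch(config)
result = pre_transform({"x": [1.0, 2.0]})
```

Because the transforms run in configuration order, reordering the keys of `config` changes the result, which is one reason the configuration is worth recording alongside the processed data.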

load(path)

Load the dataset from the file at the given path.

Parameters:
path : str

The path to the processed data.

load_dataset_splits(split_params)

Load the dataset splits.

Parameters:
split_params : dict

Parameters for loading the dataset splits.

Returns:
tuple

A tuple containing the train, validation, and test datasets.
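
The shape of the return value, a (train, validation, test) tuple, can be sketched without any dependencies. This is only an illustration of a seeded random split: the keys `train_prop`, `val_prop`, and `seed` are assumptions made for the example, and this page does not document which keys `split_params` actually expects.

```python
import random
from typing import Any, Dict, List, Sequence, Tuple


def split_dataset_sketch(
    dataset: Sequence[Any], split_params: Dict[str, Any]
) -> Tuple[List[Any], List[Any], List[Any]]:
    """Return (train, validation, test) subsets from a seeded shuffle."""
    # Hypothetical keys; the real split parameters may be named differently.
    train_prop = split_params.get("train_prop", 0.6)
    val_prop = split_params.get("val_prop", 0.2)
    seed = split_params.get("seed", 0)

    # Shuffle indices with a local generator so the split is reproducible.
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)

    n_train = int(train_prop * len(dataset))
    n_val = int(val_prop * len(dataset))

    train_idx = indices[:n_train]
    val_idx = indices[n_train:n_train + n_val]
    test_idx = indices[n_train + n_val:]  # whatever remains

    def pick(idx: List[int]) -> List[Any]:
        return [dataset[i] for i in idx]

    return pick(train_idx), pick(val_idx), pick(test_idx)


samples = [f"graph_{i}" for i in range(10)]
params = {"train_prop": 0.6, "val_prop": 0.2, "seed": 42}
train_set, val_set, test_set = split_dataset_sketch(samples, params)
```

Fixing the seed makes the three subsets identical across runs, so results computed on them remain comparable.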

process()

Process the data.

save_transform_parameters()

Save the transform parameters.

set_processed_data_dir(pre_transforms_dict, data_dir, transforms_config)

Set the processed data directory.

Parameters:
pre_transforms_dict : dict

Dictionary containing the pre-transforms.

data_dir : str

Path to the directory containing the data.

transforms_config : DictConfig

Configuration parameters for the transforms.
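
One common way to choose a processed-data directory is to derive it from the transform names plus a hash of their parameters, so that each distinct configuration gets its own cache location. The sketch below illustrates that idea under stated assumptions: the directory layout, the use of SHA-256, and the function name `processed_dir_sketch` are hypothetical and are not taken from this page.

```python
import hashlib
import json
import os
from typing import Any, Dict


def processed_dir_sketch(
    data_dir: str, transforms_config: Dict[str, Dict[str, Any]]
) -> str:
    """Build a directory path that is unique to a transform configuration."""
    # Readable component: transform names joined in configuration order.
    names = "_".join(transforms_config.keys())

    # Stable component: hash of a canonical JSON dump of the parameters.
    canonical = json.dumps(transforms_config, sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

    return os.path.join(data_dir, names, digest)


config_a = {"scale": {"factor": 2.0}, "shift": {"offset": 1.0}}
config_b = {"scale": {"factor": 3.0}, "shift": {"offset": 1.0}}

dir_a = processed_dir_sketch("data", config_a)
dir_a_again = processed_dir_sketch("data", config_a)
dir_b = processed_dir_sketch("data", config_b)
```

With this layout, rerunning with the same configuration resolves to the same directory, while changing any parameter resolves to a different one, so processed data from different configurations cannot overwrite each other.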

property processed_dir: str

Return the path to the processed directory.

Returns:
str

Path to the processed directory.

property processed_file_names: str

Return the name of the processed file.

Returns:
str

Name of the processed file.

Submodules