topobench.data.preprocessor package

Submodules

topobench.data.preprocessor.preprocessor module

Preprocessor for datasets.

class topobench.data.preprocessor.preprocessor.PreProcessor(dataset, data_dir, transforms_config=None, **kwargs)

Bases: InMemoryDataset

Preprocessor for datasets.

Parameters:
dataset : list

List of data objects.

data_dir : str

Path to the directory containing the data.

transforms_config : DictConfig, optional

Configuration parameters for the transforms (default: None).

**kwargs : optional

Optional additional arguments.

instantiate_pre_transform(data_dir, transforms_config) -> Compose

Instantiate the pre-transforms.

Parameters:
data_dir : str

Path to the directory containing the data.

transforms_config : DictConfig

Configuration parameters for the transforms.

Returns:
torch_geometric.transforms.Compose

Pre-transform object.
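As a rough illustration of what building a composed pre-transform from a configuration can look like, the sketch below uses only the standard library. It does not call TopoBench or torch_geometric: `Compose` here is a hypothetical stand-in, `TRANSFORM_REGISTRY`, `scale`, `shift`, `build_pre_transform`, and the `transform_name` key are invented for the example, and a plain dict plays the role of the DictConfig.

```python
from typing import Any, Callable

Transform = Callable[[Any], Any]


class Compose:
    """Hypothetical stand-in for a compose object: applies transforms in order."""

    def __init__(self, transforms: list[Transform]) -> None:
        self.transforms = transforms

    def __call__(self, data: Any) -> Any:
        for transform in self.transforms:
            data = transform(data)
        return data


def scale(factor: float) -> Transform:
    """Toy transform factory: multiply every value by `factor`."""
    return lambda values: [v * factor for v in values]


def shift(offset: float) -> Transform:
    """Toy transform factory: add `offset` to every value."""
    return lambda values: [v + offset for v in values]


# Hypothetical registry mapping config names to transform factories.
TRANSFORM_REGISTRY: dict[str, Callable[..., Transform]] = {
    "scale": scale,
    "shift": shift,
}


def build_pre_transform(transforms_config: dict[str, dict[str, Any]]) -> Compose:
    """Instantiate one transform per config entry, preserving config order."""
    transforms = []
    for params in transforms_config.values():
        params = dict(params)  # copy so the caller's config is not mutated
        name = params.pop("transform_name")
        transforms.append(TRANSFORM_REGISTRY[name](**params))
    return Compose(transforms)


config = {
    "first": {"transform_name": "scale", "factor": 2.0},
    "second": {"transform_name": "shift", "offset": 1.0},
}
pre_transform = build_pre_transform(config)
result = pre_transform([1.0, 2.0])  # scale first, then shift
```

The order of the configuration entries determines the order of application, which is why the sketch iterates over the dict as given instead of sorting it.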

load(path: str) -> None

Load the dataset from the file at path.

Parameters:
path : str

The path to the processed data.

load_dataset_splits(split_params) -> tuple[DataloadDataset, DataloadDataset | None, DataloadDataset | None]

Load the dataset splits.

Parameters:
split_params : dict

Parameters for loading the dataset splits.

Returns:
tuple

The train, validation, and test datasets, in that order. Per the signature, the validation and test entries may be None.
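Because the validation and test entries of the returned tuple may be None, callers need to check them before use. The sketch below shows that pattern with plain lists. `split_by_indices` and the `train` / `valid` / `test` keys are assumptions made for the example, not the structure of `split_params` that TopoBench expects.

```python
from typing import Optional, Sequence


def split_by_indices(
    data: Sequence, split_params: dict
) -> tuple[list, Optional[list], Optional[list]]:
    """Return (train, validation, test); missing splits are returned as None."""

    def pick(name: str) -> Optional[list]:
        indices = split_params.get(name)
        if indices is None:
            return None
        return [data[i] for i in indices]

    train = pick("train")
    if train is None:
        raise ValueError("split_params must contain 'train' indices")
    return train, pick("valid"), pick("test")


samples = ["a", "b", "c", "d", "e"]
train, val, test = split_by_indices(samples, {"train": [0, 1, 2], "test": [4]})

# Handle the optional splits explicitly instead of assuming they exist.
num_val = len(val) if val is not None else 0
```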

process() -> None

Process the data.

property processed_dir: str

Return the path to the processed directory.

Returns:
str

Path to the processed directory.

property processed_file_names: str

Return the name of the processed file.

Returns:
str

Name of the processed file.

save_transform_parameters() -> None

Save the transform parameters.
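One reason to store transform parameters next to processed data is to detect a later run that reuses the same directory with a different configuration. The sketch below illustrates that idea with JSON storage. The function name, the file name, and the choice to raise `ValueError` on a mismatch are all hypothetical and are not taken from the TopoBench source.

```python
import json
import tempfile
from pathlib import Path


def save_or_check_parameters(processed_dir: Path, params: dict) -> bool:
    """Write params on first use; on later calls verify they are unchanged.

    Returns True if the file was written, False if it already matched.
    """
    processed_dir.mkdir(parents=True, exist_ok=True)
    path = processed_dir / "transform_parameters.json"  # hypothetical file name
    if not path.exists():
        path.write_text(json.dumps(params, sort_keys=True, indent=2))
        return True
    saved = json.loads(path.read_text())
    if saved != params:
        raise ValueError(f"Parameters in {path} differ from the current configuration")
    return False


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "processed"
    params = {"lifting": {"complex_dim": 2}}
    wrote_first = save_or_check_parameters(target, params)   # file is created
    wrote_second = save_or_check_parameters(target, params)  # file already matches
    try:
        save_or_check_parameters(target, {"lifting": {"complex_dim": 3}})
        mismatch_detected = False
    except ValueError:
        mismatch_detected = True
```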

set_processed_data_dir(pre_transforms_dict, data_dir, transforms_config) -> None

Set the processed data directory.

Parameters:
pre_transforms_dict : dict

Dictionary containing the pre-transforms.

data_dir : str

Path to the directory containing the data.

transforms_config : DictConfig

Configuration parameters for the transforms.
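A common way to keep processed data from different transform configurations apart is to derive the directory name from the configuration itself. The sketch below works under that assumption. `config_hash`, `processed_dir_for`, and the directory layout are hypothetical, and this is not presented as the scheme TopoBench actually uses.

```python
import hashlib
import json
import os


def config_hash(transforms_config: dict) -> str:
    """Stable short hash of a JSON-serialisable config (key order ignored)."""
    canonical = json.dumps(transforms_config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def processed_dir_for(data_dir: str, transforms_config: dict) -> str:
    """Build <data_dir>/<transform names joined by '_'>/<config hash>."""
    names = "_".join(transforms_config.keys())
    return os.path.join(data_dir, names, config_hash(transforms_config))


cfg_a = {"lifting": {"complex_dim": 2, "signed": False}}
cfg_b = {"lifting": {"signed": False, "complex_dim": 2}}  # same content, reordered
cfg_c = {"lifting": {"complex_dim": 3, "signed": False}}  # different parameter value

dir_a = processed_dir_for("datasets/example", cfg_a)
dir_b = processed_dir_for("datasets/example", cfg_b)
dir_c = processed_dir_for("datasets/example", cfg_c)
```

Serialising with sorted keys before hashing means two configurations with the same content map to the same directory, while any change in a parameter value yields a new one.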

Module contents

Init file for Preprocessor module.

class topobench.data.preprocessor.PreProcessor(dataset, data_dir, transforms_config=None, **kwargs)

Same class as topobench.data.preprocessor.preprocessor.PreProcessor, exposed at package level. Its parameters, methods, and properties are documented under the preprocessor module above.