topobench.data.loaders package
Init file for load module.
- class ADMEDatasetLoader(parameters)
Bases: AbstractLoader
Load TDC ADME datasets with SMILES-to-graph conversion using OGB featurization.
This loader:
1. Loads ADME datasets from TDC (Therapeutics Data Commons).
2. Converts SMILES strings to PyG graphs using OGB’s standard featurization.
3. Uses fixed scaffold splits from TDC.
4. Returns graphs compatible with OGB molecular property prediction.
- Node features (9-dimensional):
Atomic number
Chirality
Degree
Formal charge
Number of hydrogens
Number of radical electrons
Hybridization
Is aromatic
Is in ring
- Edge features (3-dimensional):
Bond type
Bond stereochemistry
Is conjugated
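The two feature layouts above can be written down as name lists, which makes it easy to label an encoded atom or bond when debugging. This is only a sketch: the snake_case names, the `describe` helper, and the sample values are hypothetical, and the index order is assumed to follow the order of the lists above.

```python
# Feature names taken from the lists above; the index order is an assumption.
NODE_FEATURES = [
    "atomic_number", "chirality", "degree", "formal_charge", "num_hydrogens",
    "num_radical_electrons", "hybridization", "is_aromatic", "is_in_ring",
]
EDGE_FEATURES = ["bond_type", "bond_stereochemistry", "is_conjugated"]


def describe(vector, names):
    """Pair each feature value with its name; reject vectors of the wrong width."""
    if len(vector) != len(names):
        raise ValueError(f"expected {len(names)} features, got {len(vector)}")
    return dict(zip(names, vector))


# Made-up feature vectors, for illustration only.
atom = describe([5, 0, 4, 5, 3, 0, 2, 0, 0], NODE_FEATURES)
bond = describe([0, 0, 0], EDGE_FEATURES)
```

A vector of the wrong width raises `ValueError`, which catches mismatches between the featurizer and the expected 9- and 3-dimensional layouts early.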
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the ADME dataset
data_type: Type of the dataset (e.g., “ADME”)
- __init__(parameters)
- get_data_dir()
Get the data directory.
- Returns:
- Path
The path to the dataset directory. Format: {root_data_dir}/{dataset_name}/. Example: data/graph/ADME/BBB_Martins/.
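The documented format `{root_data_dir}/{dataset_name}/` can be reproduced with `pathlib`. The function below is a hypothetical stand-in, not the loader's actual `get_data_dir` implementation; it only shows how the example path is composed.

```python
from pathlib import Path


def dataset_dir(root_data_dir, dataset_name):
    """Join the root directory and dataset name, following the documented format."""
    return Path(root_data_dir) / dataset_name


# Values taken from the example in the description above.
path = dataset_dir("data/graph/ADME", "BBB_Martins")
print(path.as_posix())
```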
- load_dataset()
Load the ADME dataset with predefined scaffold splits.
- Returns:
- InMemoryDataset
The dataset with converted graphs and predefined splits.
- Raises:
- RuntimeError
If dataset loading or SMILES conversion fails.
- ValueError
If invalid SMILES strings are encountered.
- ImportError
If PyTDC or rdkit (via ogb) are not installed.
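One common way to surface the `ImportError` case early is to check for optional dependencies before loading anything. The helper below is a hypothetical sketch, not part of TopoBench, and the import names in the usage comment (`tdc`, `ogb`, `rdkit`) are assumptions about how those packages are imported.

```python
import importlib.util


def require_packages(*names):
    """Raise ImportError naming every top-level package that cannot be found."""
    missing = [n for n in names if importlib.util.find_spec(n) is None]
    if missing:
        raise ImportError("Missing optional dependencies: " + ", ".join(missing))


# Assumed usage before loading an ADME dataset:
# require_packages("tdc", "ogb", "rdkit")
```

Checking all packages at once gives a single error message listing everything that needs to be installed, instead of failing one import at a time.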
- class AbstractLoader(parameters)
Bases: ABC
Abstract class that provides an interface to load data.
- Parameters:
- parameters : DictConfig
Configuration parameters.
- __init__(parameters)
- get_data_dir()
Get the data directory.
- Returns:
- Path
The path to the dataset directory.
- load(**kwargs)
Load data.
- Parameters:
- **kwargs : dict
Additional keyword arguments.
- Returns:
- tuple[torch_geometric.data.Data, str]
Tuple containing the loaded data and the data directory.
- abstractmethod load_dataset()
Load data into a dataset.
- Returns:
- Union[torch_geometric.data.Dataset, torch.utils.data.Dataset]
The loaded dataset, which could be a PyG or PyTorch dataset.
- Raises:
- NotImplementedError
If the method is not implemented.
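The interface described above (an abstract `load_dataset`, a `get_data_dir` helper, and a `load` method returning the data together with the data directory) can be sketched with the standard library alone. `SketchLoader` and `ToyLoader` are hypothetical names, a plain dict stands in for `DictConfig`, and the method bodies are assumptions rather than TopoBench's implementation.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class SketchLoader(ABC):
    """Illustrative stand-in for the documented interface, not TopoBench's code."""

    def __init__(self, parameters):
        self.parameters = parameters  # a plain dict stands in for DictConfig

    def get_data_dir(self):
        """Return the dataset directory as a Path."""
        return Path(self.parameters["data_dir"])

    def load(self, **kwargs):
        """Return the loaded dataset and the data directory as a string."""
        dataset = self.load_dataset()
        return dataset, str(self.get_data_dir())

    @abstractmethod
    def load_dataset(self):
        """Subclasses must build and return the dataset."""
        raise NotImplementedError


class ToyLoader(SketchLoader):
    """Hypothetical subclass returning a tiny in-memory 'dataset'."""

    def load_dataset(self):
        return [{"x": [0.0, 1.0]}]


dataset, data_dir = ToyLoader({"data_dir": "data/toy"}).load()
```

Because `load_dataset` is abstract, instantiating `SketchLoader` directly raises `TypeError`; only subclasses that implement it can be created.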
- class CitationHypergraphDatasetLoader(parameters)
Bases: AbstractLoader
Load Citation Hypergraph dataset with configurable parameters.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
other relevant parameters
- __init__(parameters)
- load_dataset()
Load the Citation Hypergraph dataset.
- Returns:
- CitationHypergraphDataset
The loaded Citation Hypergraph dataset with the appropriate data_dir.
- Raises:
- RuntimeError
If dataset loading fails.
- class GeometricShapesDatasetLoader(parameters)
Bases: AbstractLoader
Load GeometricShapes dataset.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
- __init__(parameters)
- load_dataset()
Load GeometricShapes dataset.
- Returns:
- Dataset
The loaded GeometricShapes dataset.
- Raises:
- RuntimeError
If dataset loading fails.
- class GraphUniverseDatasetLoader(parameters)
Bases: AbstractLoader
Load Graph Universe datasets.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
data_type: Type of the dataset (e.g., “graph_classification”)
- __init__(parameters)
- load(**kwargs)
Load data.
- Parameters:
- **kwargs : dict
Additional keyword arguments.
- Returns:
- tuple[torch_geometric.data.Data, str]
Tuple containing the loaded data and the data directory.
- load_dataset()
Load Graph Universe dataset.
- Returns:
- Dataset
The loaded Graph Universe dataset.
- Raises:
- RuntimeError
If dataset loading fails.
- class HeterophilousGraphDatasetLoader(parameters)
Bases: AbstractLoader
Load Heterophilous Graph datasets.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
data_type: Type of the dataset (e.g., “heterophilous”)
- __init__(parameters)
- load_dataset()
Load Heterophilous Graph dataset.
- Returns:
- Dataset
The loaded Heterophilous Graph dataset.
- Raises:
- RuntimeError
If dataset loading fails.
- class HypergraphDatasetLoader(parameters)
Bases: AbstractLoader
Load Citation Hypergraph dataset with configurable parameters.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
other relevant parameters
- __init__(parameters)
- load_dataset()
Load the Citation Hypergraph dataset.
- Returns:
- HypergraphDataset
The loaded Citation Hypergraph dataset with the appropriate data_dir.
- Raises:
- RuntimeError
If dataset loading fails.
- class MantraSimplicialDatasetLoader(parameters, **kwargs)
Bases: AbstractLoader
Load Mantra dataset with configurable parameters.
Note: for simplicial datasets, the class name must include DatasetLoader.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
other relevant parameters
- **kwargs : dict
Additional keyword arguments.
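The note about including DatasetLoader in the class name does not state why the convention exists. One plausible reason, and it is only an assumption here, is that loader classes are discovered by name. The sketch below shows what such name-based discovery could look like; `discover_loaders`, `HelperUtility`, and the stand-in class are hypothetical and do not reproduce TopoBench's discovery code.

```python
import inspect


class MantraSimplicialDatasetLoader:
    """Stand-in class whose name follows the convention."""


class HelperUtility:
    """Stand-in class that does not follow the convention."""


def discover_loaders(namespace):
    """Keep only classes whose name contains 'DatasetLoader' (assumed rule)."""
    return {
        name: obj
        for name, obj in namespace.items()
        if inspect.isclass(obj) and "DatasetLoader" in name
    }


found = discover_loaders(
    {
        "MantraSimplicialDatasetLoader": MantraSimplicialDatasetLoader,
        "HelperUtility": HelperUtility,
        "not_a_class": 42,
    }
)
```

Under this assumed rule, a loader class named without the `DatasetLoader` substring would be skipped without any error.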
- __init__(parameters, **kwargs)
- load(**kwargs)
Load the Mantra dataset.
- Parameters:
- **kwargs : dict
Additional keyword arguments for dataset initialization.
- Returns:
- MantraDataset
The loaded Mantra dataset with the appropriate data_dir.
- Raises:
- RuntimeError
If dataset loading fails.
- load_dataset(**kwargs)
Initialize the Mantra dataset.
- Parameters:
- **kwargs : dict
Additional keyword arguments for dataset initialization.
- Returns:
- MantraDataset
The initialized dataset instance.
- class ManualGraphDatasetLoader(parameters)
Bases: AbstractLoader
Load manually provided graph datasets.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_name: Name of the dataset
data_dir: Root directory for data
- __init__(parameters)
- get_data_dir()
Get the data directory.
- Returns:
- Path
The path to the dataset directory.
- load_dataset()
Load the manual graph dataset.
- Returns:
- DataloadDataset
The dataset object containing the manually loaded graph.
- class MoleculeDatasetLoader(parameters)
Bases: AbstractLoader
Load molecule datasets (ZINC and AQSOL) with predefined splits, or QM9.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
data_type: Type of the dataset (e.g., “molecule”)
qm9_target_index: (QM9 only) Which of the 19 regression targets to use (default 0).
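To illustrate what `qm9_target_index` selects, the sketch below picks one column out of rows of 19 regression targets. Plain Python lists stand in for tensors, and `select_target` is a hypothetical helper; how the loader itself applies the index is not specified above.

```python
NUM_QM9_TARGETS = 19  # number of regression targets, from the description above


def select_target(targets, target_index=0):
    """Return column `target_index` from rows of 19 regression targets."""
    if not 0 <= target_index < NUM_QM9_TARGETS:
        raise IndexError(
            f"qm9_target_index must be in [0, {NUM_QM9_TARGETS - 1}], got {target_index}"
        )
    return [row[target_index] for row in targets]


# Two made-up molecules: row 0 holds 0..18, row 1 holds 19..37.
rows = [[float(i + NUM_QM9_TARGETS * r) for i in range(NUM_QM9_TARGETS)] for r in range(2)]
default_column = select_target(rows)
fourth_column = select_target(rows, target_index=3)
```

With the default index 0 each molecule keeps its first target; an index outside 0 to 18 raises `IndexError` rather than silently wrapping around.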
- __init__(parameters)
- get_data_dir()
Get the data directory.
- Returns:
- Path
The path to the dataset directory.
- load_dataset()
Load the molecule dataset with predefined splits.
- Returns:
- Dataset
The combined dataset with predefined splits.
- Raises:
- RuntimeError
If dataset loading fails.
- class OGBGDatasetLoader(parameters)
Bases: AbstractLoader
Load molecule datasets (molhiv, molpcba, ppa) with predefined splits.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
data_type: Type of the dataset (e.g., “molecule”)
- __init__(parameters)
- get_data_dir()
Get the data directory.
- Returns:
- Path
The path to the dataset directory.
- load_dataset()
Load the molecule dataset with predefined splits.
- Returns:
- Dataset
The combined dataset with predefined splits.
- Raises:
- RuntimeError
If dataset loading fails.
- class PlanetoidDatasetLoader(parameters)
Bases: AbstractLoader
Load PLANETOID datasets.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
data_type: Type of the dataset (e.g., “cocitation”)
- __init__(parameters)
- load_dataset()
Load Planetoid dataset.
- Returns:
- Dataset
The loaded Planetoid dataset.
- Raises:
- RuntimeError
If dataset loading fails.
- class TUDatasetLoader(parameters)
Bases: AbstractLoader
Load TU datasets.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
data_type: Type of the dataset (e.g., “graph_classification”)
- __init__(parameters)
- load_dataset()
Load TU dataset.
- Returns:
- Dataset
The loaded TU dataset.
- Raises:
- RuntimeError
If dataset loading fails.
- class USCountyDemosDatasetLoader(parameters)
Bases: AbstractLoader
Load US County Demos dataset with configurable year and task variable.
- Parameters:
- parameters : DictConfig
- Configuration parameters containing:
data_dir: Root directory for data
data_name: Name of the dataset
year: Year of the dataset (if applicable)
task_variable: Task variable for the dataset
- __init__(parameters)
- load_dataset()
Load the US County Demos dataset.
- Returns:
- USCountyDemosDataset
The loaded US County Demos dataset with the appropriate data_dir.
- Raises:
- RuntimeError
If dataset loading fails.
Subpackages
- topobench.data.loaders.graph package
- ADMEDatasetLoader
- GraphUniverseDatasetLoader
- HeterophilousGraphDatasetLoader
- MantraSimplicialDatasetLoader
- ManualGraphDatasetLoader
- MoleculeDatasetLoader
- OGBGDatasetLoader
- PlanetoidDatasetLoader
- TUDatasetLoader
- USCountyDemosDatasetLoader
- Submodules
- topobench.data.loaders.graph.adme_datasets module
- topobench.data.loaders.graph.graph_universe_loader module
- topobench.data.loaders.graph.hetero_datasets module
- topobench.data.loaders.graph.mantra_dataset module
- topobench.data.loaders.graph.manual_graph_dataset_loader module
- topobench.data.loaders.graph.molecule_datasets module
- topobench.data.loaders.graph.ogbg_datasets module
- topobench.data.loaders.graph.planetoid_datasets module
- topobench.data.loaders.graph.tu_datasets module
- topobench.data.loaders.graph.us_county_demos_dataset_loader module
- topobench.data.loaders.hypergraph package
- topobench.data.loaders.pointcloud package
- topobench.data.loaders.simplicial package
Submodules
- topobench.data.loaders.base module
- ABC
- AbstractLoader
- DictConfig
- Path
- abstractmethod()