topobench.data.datasets package
Dataset module with automated exports.
- class CitationHypergraphDataset(root, name, parameters)
Bases: InMemoryDataset
Dataset class for citation hypergraph datasets (co-authorship and co-citation).
- Parameters:
- root (str)
Root directory where the dataset will be saved.
- name (str)
Name of the dataset.
- parameters (DictConfig)
Configuration parameters for the dataset.
- Attributes:
- URLS (dict): Dictionary containing the URLs for downloading the dataset.
- FILE_FORMAT (dict): Dictionary containing the file formats for the dataset.
- RAW_FILE_NAMES (dict): Dictionary containing the raw file names for the dataset.
- __init__(root, name, parameters)
- download()
Download the dataset from a URL and save it to the raw directory.
- Raises:
FileNotFoundError – If the dataset URL is not found.
- process()
Handle the data for the dataset.
This method loads the citation hypergraph data, applies any preprocessing transformations if specified, and saves the processed data to the appropriate location.
- FILE_FORMAT: ClassVar = {'coauthorship_cora': 'zip', 'coauthorship_dblp': 'zip', 'cocitation_citeseer': 'zip', 'cocitation_cora': 'zip', 'cocitation_pubmed': 'zip'}
- URLS: ClassVar = {'coauthorship_cora': 'https://drive.google.com/file/d/1J5fLPABWrM9SH_7m85n7--oHDVmwJeib/view?usp=sharing', 'coauthorship_dblp': 'https://drive.google.com/file/d/16ryf4Ve-t0_nAla0VfjtSxSAG8Sye8TZ/view?usp=sharing', 'cocitation_citeseer': 'https://drive.google.com/file/d/1XWfu1jtijsmHmfCP6UQxyLsuPM8GBNJb/view?usp=sharing', 'cocitation_cora': 'https://drive.google.com/file/d/1WVRx5yDxSdZpvL6FK5Ji8H3lOnyYlraN/view?usp=sharing', 'cocitation_pubmed': 'https://drive.google.com/file/d/1XbqDJnHnV0HYvie3fcM8rquamnQsLTpK/view?usp=sharing'}
- property processed_dir: str
Return the path to the processed directory of the dataset.
- Returns:
- str
Path to the processed directory.
- property processed_file_names: str
Return the processed file name for the dataset.
- Returns:
- str
Processed file name.
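The URLS entries above are Google Drive sharing links of the form `https://drive.google.com/file/d/<FILE_ID>/view?...`. How `download_file_from_drive()` resolves them is not shown on this page; the following is only a minimal sketch of pulling the file ID out of such a link. The helper name `drive_file_id` is hypothetical and not part of topobench.

```python
import re

# Matches sharing links such as
# https://drive.google.com/file/d/<FILE_ID>/view?usp=sharing
_DRIVE_FILE_RE = re.compile(r"drive\.google\.com/file/d/([A-Za-z0-9_-]+)")


def drive_file_id(url: str) -> str:
    """Return the file ID embedded in a Drive sharing URL (hypothetical helper)."""
    match = _DRIVE_FILE_RE.search(url)
    if match is None:
        raise ValueError(f"Not a Google Drive file URL: {url!r}")
    return match.group(1)


# URL copied from the 'cocitation_cora' entry of URLS.
url = "https://drive.google.com/file/d/1WVRx5yDxSdZpvL6FK5Ji8H3lOnyYlraN/view?usp=sharing"
print(drive_file_id(url))
```

The ID is the stable part of the link; the `usp=...` query string only records how the link was shared.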
- class HypergraphDataset(root, name, parameters)
Bases: InMemoryDataset
Dataset class for hypergraph datasets.
- Parameters:
- root (str)
Root directory where the dataset will be saved.
- name (str)
Name of the dataset.
- parameters (DictConfig)
Configuration parameters for the dataset.
- Attributes:
- URLS (dict): Dictionary containing the URLs for downloading the dataset.
- FILE_FORMAT (dict): Dictionary containing the file formats for the dataset.
- RAW_FILE_NAMES (dict): Dictionary containing the raw file names for the dataset.
- __init__(root, name, parameters)
- download()
Download the dataset from a URL and save it to the raw directory.
- Raises:
FileNotFoundError – If the dataset URL is not found.
- process()
Handle the data for the dataset.
This method loads the hypergraph data, applies any preprocessing transformations if specified, and saves the processed data to the appropriate location.
- FILE_FORMAT: ClassVar = {'20newsW100': 'zip', 'ModelNet40': 'zip', 'Mushroom': 'zip', 'NTU2012': 'zip', 'zoo': 'zip'}
- URLS: ClassVar = {'20newsW100': 'https://drive.google.com/file/d/1D1NtyS4g9LZJPlnxOOySGlRR2km1wGMm/view?usp=drive_link', 'ModelNet40': 'https://drive.google.com/file/d/1u3-SFCjOIh1G0U8pVclfGIlDCceJ0qxr/view?usp=drive_link', 'Mushroom': 'https://drive.google.com/file/d/1iad2l9w58UJvMMXOz6PtrbZkvGyFjWK6/view?usp=drive_link', 'NTU2012': 'https://drive.google.com/file/d/1g9P-uEVSATg6B_JRnyey78YbliIfst3Z/view?usp=drive_link', 'zoo': 'https://drive.google.com/file/d/18TuuGv3qiBfU-wqB3USB3HiiI9G-8X71/view?usp=drive_link'}
- property processed_dir: str
Return the path to the processed directory of the dataset.
- Returns:
- str
Path to the processed directory.
- property processed_file_names: str
Return the processed file name for the dataset.
- Returns:
- str
Processed file name.
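The `URLS` and `FILE_FORMAT` class attributes are both keyed by dataset name, and `download()` is documented to raise `FileNotFoundError` when no URL is found. The sketch below illustrates that kind of lookup under stated assumptions: the function `resolve_download` and the `<name>.<format>` archive naming are hypothetical, and the dictionaries are abbreviated copies of the ones listed above.

```python
# Abbreviated copies of the class attributes listed above.
URLS = {
    "Mushroom": "https://drive.google.com/file/d/1iad2l9w58UJvMMXOz6PtrbZkvGyFjWK6/view?usp=drive_link",
    "zoo": "https://drive.google.com/file/d/18TuuGv3qiBfU-wqB3USB3HiiI9G-8X71/view?usp=drive_link",
}
FILE_FORMAT = {"Mushroom": "zip", "zoo": "zip"}


def resolve_download(name: str) -> tuple[str, str]:
    """Return (url, archive file name) for a dataset name (hypothetical helper).

    The "<name>.<format>" archive name is an assumption made for illustration.
    """
    if name not in URLS:
        raise FileNotFoundError(f"Dataset URL not found for {name!r}")
    return URLS[name], f"{name}.{FILE_FORMAT[name]}"


url, archive_name = resolve_download("zoo")
print(archive_name)
```

Checking the name before any network call keeps a misspelled dataset name from surfacing later as a less specific download error.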
- class MantraDataset(root, name, parameters, **kwargs)
Bases: InMemoryDataset
Dataset class for MANTRA manifold dataset.
- Parameters:
- root (str)
Root directory where the dataset will be saved.
- name (str)
Name of the dataset.
- parameters (DictConfig)
Configuration parameters for the dataset.
- **kwargs (dict)
Additional keyword arguments.
- Attributes:
- URLS (dict): Dictionary containing the URLs for downloading the dataset.
- FILE_FORMAT (dict): Dictionary containing the file formats for the dataset.
- RAW_FILE_NAMES (dict): Dictionary containing the raw file names for the dataset.
- __init__(root, name, parameters, **kwargs)
- download()
Download the dataset from a URL and save it to the raw directory.
- Raises:
FileNotFoundError – If the dataset URL is not found.
- process()
Handle the data for the dataset.
This method loads the MANTRA JSON file for the specified manifold dimension, applies the respective preprocessing if specified, and saves the preprocessed data to the appropriate location.
- URLS: ClassVar = {'2_manifolds': 'https://github.com/aidos-lab/mantra/releases/download/{version}/2_manifolds.json.gz', '3_manifolds': 'https://github.com/aidos-lab/mantra/releases/download/{version}/3_manifolds.json.gz'}
- property processed_dir: str
Return the path to the processed directory of the dataset.
- Returns:
- str
Path to the processed directory.
- property processed_file_names: str
Return the processed file name for the dataset.
- Returns:
- str
Processed file name.
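The MANTRA URLs above contain a `{version}` placeholder and point to gzip-compressed JSON files. As a rough sketch of those two steps, using only the standard library, the snippet below fills the placeholder and reads a `.json.gz` file. The helpers `build_url` and `read_json_gz`, the tag `v1.2.3`, and the record fields are all made up for illustration; they are not the package's `download_file_from_link()`, `extract_gz()`, or `read_ndim_manifolds()`.

```python
import gzip
import json
import tempfile
from pathlib import Path

# Template copied from the '2_manifolds' entry of URLS.
URL_TEMPLATE = (
    "https://github.com/aidos-lab/mantra/releases/download/"
    "{version}/2_manifolds.json.gz"
)


def build_url(template: str, version: str) -> str:
    """Fill the {version} placeholder (hypothetical helper)."""
    return template.format(version=version)


def read_json_gz(path: Path):
    """Decompress a .json.gz file and parse it as JSON (hypothetical helper)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


# "v1.2.3" is a made-up tag used only to show the substitution.
print(build_url(URL_TEMPLATE, "v1.2.3"))

# Round-trip a tiny, made-up payload instead of downloading the real file.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "2_manifolds.json.gz"
    with gzip.open(archive, "wt", encoding="utf-8") as handle:
        json.dump([{"id": "example", "dimension": 2}], handle)
    records = read_json_gz(archive)

print(len(records))
```

Opening the archive in text mode (`"rt"`) lets `json.load` read straight from the compressed stream without writing an uncompressed copy to disk.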
- class USCountyDemosDataset(root, name, parameters)
Bases: InMemoryDataset
Dataset class for US County Demographics dataset.
- Parameters:
- root (str)
Root directory where the dataset will be saved.
- name (str)
Name of the dataset.
- parameters (DictConfig)
Configuration parameters for the dataset.
- Attributes:
- URLS (dict): Dictionary containing the URLs for downloading the dataset.
- FILE_FORMAT (dict): Dictionary containing the file formats for the dataset.
- RAW_FILE_NAMES (dict): Dictionary containing the raw file names for the dataset.
- __init__(root, name, parameters)
- download()
Download the dataset from a URL and save it to the raw directory.
- Raises:
FileNotFoundError – If the dataset URL is not found.
- process()
Handle the data for the dataset.
This method loads the US county demographics data, applies any preprocessing transformations if specified, and saves the processed data to the appropriate location.
- URLS: ClassVar = {'US-county-demos': 'https://drive.google.com/file/d/1FNF_LbByhYNICPNdT6tMaJI9FxuSvvLK/view?usp=sharing'}
- property processed_dir: str
Return the path to the processed directory of the dataset.
- Returns:
- str
Path to the processed directory.
- property processed_file_names: str
Return the processed file name for the dataset.
- Returns:
- str
Processed file name.
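Each dataset class above exposes `processed_dir` and `processed_file_names` as read-only properties returning strings. The sketch below shows that property pattern on a plain class. The directory layout (`<root>/<name>/processed`) and the file name `data.pt` are assumptions chosen for illustration; the actual values returned by the topobench classes are not documented on this page.

```python
from pathlib import Path


class DatasetPaths:
    """Stand-in class illustrating the two path properties (hypothetical)."""

    def __init__(self, root: str, name: str) -> None:
        self.root = root
        self.name = name

    @property
    def processed_dir(self) -> str:
        # Assumed layout: one "processed" folder per dataset name.
        return str(Path(self.root) / self.name / "processed")

    @property
    def processed_file_names(self) -> str:
        # Assumed single-file output name.
        return "data.pt"


paths = DatasetPaths(root="datasets", name="US-county-demos")
print(Path(paths.processed_dir) / paths.processed_file_names)
```

Exposing the paths as properties means they are recomputed from `root` and `name` on each access, so they stay consistent if either attribute changes.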
Submodules
- topobench.data.datasets.citation_hypergraph_dataset module
CitationHypergraphDataset: __init__(), download(), process(), FILE_FORMAT, RAW_FILE_NAMES, URLS, processed_dir, processed_file_names, raw_dir, raw_file_names
Data: __init__(), connected_components(), debug(), edge_subgraph(), from_dict(), get_all_edge_attrs(), get_all_tensor_attrs(), is_edge_attr(), is_node_attr(), stores_as(), subgraph(), to_dict(), to_heterogeneous(), to_namedtuple(), update(), validate(), batch, edge_attr, edge_index, edge_stores, edge_weight, face, node_stores, num_edge_features, num_edge_types, num_faces, num_features, num_node_features, num_node_types, num_nodes, pos, stores, time, x, y
DictConfig
InMemoryDataset: __init__(), collate(), copy(), cpu(), cuda(), get(), len(), load(), save(), to(), to_on_disk_dataset(), data, num_classes, processed_file_names, raw_file_names
download_file_from_drive()
extract_zip()
load_hypergraph_pickle_dataset()
- topobench.data.datasets.hypergraph_datasets module
Data: __init__(), connected_components(), debug(), edge_subgraph(), from_dict(), get_all_edge_attrs(), get_all_tensor_attrs(), is_edge_attr(), is_node_attr(), stores_as(), subgraph(), to_dict(), to_heterogeneous(), to_namedtuple(), update(), validate(), batch, edge_attr, edge_index, edge_stores, edge_weight, face, node_stores, num_edge_features, num_edge_types, num_faces, num_features, num_node_features, num_node_types, num_nodes, pos, stores, time, x, y
DictConfig
HypergraphDataset: __init__(), download(), process(), FILE_FORMAT, RAW_FILE_NAMES, URLS, processed_dir, processed_file_names, raw_dir, raw_file_names
InMemoryDataset: __init__(), collate(), copy(), cpu(), cuda(), get(), len(), load(), save(), to(), to_on_disk_dataset(), data, num_classes, processed_file_names, raw_file_names
download_file_from_drive()
extract_zip()
load_hypergraph_content_dataset()
- topobench.data.datasets.mantra_dataset module
Data: __init__(), connected_components(), debug(), edge_subgraph(), from_dict(), get_all_edge_attrs(), get_all_tensor_attrs(), is_edge_attr(), is_node_attr(), stores_as(), subgraph(), to_dict(), to_heterogeneous(), to_namedtuple(), update(), validate(), batch, edge_attr, edge_index, edge_stores, edge_weight, face, node_stores, num_edge_features, num_edge_types, num_faces, num_features, num_node_features, num_node_types, num_nodes, pos, stores, time, x, y
DictConfig
InMemoryDataset: __init__(), collate(), copy(), cpu(), cuda(), get(), len(), load(), save(), to(), to_on_disk_dataset(), data, num_classes, processed_file_names, raw_file_names
MantraDataset
download_file_from_link()
extract_gz()
read_ndim_manifolds()
- topobench.data.datasets.us_county_demos_dataset module
Data: __init__(), connected_components(), debug(), edge_subgraph(), from_dict(), get_all_edge_attrs(), get_all_tensor_attrs(), is_edge_attr(), is_node_attr(), stores_as(), subgraph(), to_dict(), to_heterogeneous(), to_namedtuple(), update(), validate(), batch, edge_attr, edge_index, edge_stores, edge_weight, face, node_stores, num_edge_features, num_edge_types, num_faces, num_features, num_node_features, num_node_types, num_nodes, pos, stores, time, x, y
DictConfig
InMemoryDataset: __init__(), collate(), copy(), cpu(), cuda(), get(), len(), load(), save(), to(), to_on_disk_dataset(), data, num_classes, processed_file_names, raw_file_names
USCountyDemosDataset: __init__(), download(), process(), FILE_FORMAT, RAW_FILE_NAMES, URLS, processed_dir, processed_file_names, raw_dir, raw_file_names
download_file_from_drive()
extract_zip()
read_us_county_demos()