topobench.transforms.data_manipulations.group_homophily module
A transform that calculates the group combinatorial homophily of the input hypergraph.
- class GroupCombinatorialHomophily(**kwargs)
Bases: BaseTransform
Calculates group combinatorial homophily of the input hypergraph.
This transformation implements the methodology from the paper: “Combinatorial Characterizations and Impossibilities for Higher-order Homophily”. It computes homophily metrics for hypergraphs by analyzing the relationship between node labels within hyperedges.
- Parameters:
- **kwargs : dict, optional
Additional parameters for the transform.
- top_k : int, default=3
Number of top hyperedge cardinalities to analyze.
- Attributes:
- type : str
Identifier for the transform type.
- top_k : int
Number of top hyperedge cardinalities to analyze.
- __init__(**kwargs)
- calculate_D_matrix(H, labels, he_cardinalities, unique_labels, class_node_idxs)
Calculate the degree matrices D and D_t for the hypergraph.
- Parameters:
- H : torch.Tensor
Dense incidence matrix of the hypergraph.
- labels : torch.Tensor
Node labels.
- he_cardinalities : torch.Tensor
Cardinality of each hyperedge.
- unique_labels : dict
Dictionary mapping labels to their counts.
- class_node_idxs : dict
Dictionary mapping labels to node indices.
- Returns:
- tuple[torch.Tensor, torch.Tensor]
D_t_class : Type-t degree distribution matrix for each class
D : Degree matrix counting same-label nodes in hyperedges
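As an illustration of the kind of counting that underlies these matrices, the sketch below tallies, for each class and hyperedge size, how many node-hyperedge incidences have exactly t same-label nodes in the hyperedge. It works on plain Python lists instead of the dense incidence matrix H, and the function name `type_t_degree_counts` is hypothetical; it is not the library's implementation.

```python
from collections import defaultdict


def type_t_degree_counts(hyperedges, labels):
    """Hypothetical helper: counts[c][k][t] is the number of
    (node, hyperedge) incidences where the node has label c, the
    hyperedge has k nodes, and exactly t of them carry label c."""
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for edge in hyperedges:
        k = len(edge)
        for v in edge:
            c = labels[v]
            # Number of nodes in this hyperedge sharing v's label (v included).
            t = sum(1 for u in edge if labels[u] == c)
            counts[c][k][t] += 1
    return counts


# Toy hypergraph: 5 nodes, 3 hyperedges of size 3, two classes.
hyperedges = [(0, 1, 2), (0, 1, 3), (2, 3, 4)]
labels = [0, 0, 1, 1, 1]
counts = type_t_degree_counts(hyperedges, labels)

# Share of class-1 incidences that lie in fully class-1 hyperedges.
class1_total = sum(counts[1][3].values())
class1_type3_share = counts[1][3][3] / class1_total
```

Dividing each count by the class total, as in the last two lines, gives an observed per-class distribution over t that can then be compared against a baseline.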
- calculate_affinity_score(n_nodes, X_mod, t, k)
Calculate affinity score.
- Parameters:
- n_nodes : int
Total number of nodes.
- X_mod : int
Number of nodes in the class.
- t : int
Type-t degree.
- k : int
Maximum hyperedge cardinality.
- Returns:
- torch.Tensor
The affinity matrix.
- calculate_baseline_matrix(he_cardinalities, unique_labels, class_node_idxs, count_labels, n_nodes)
Calculate the baseline affinity matrix for comparison.
- Parameters:
- he_cardinalities : torch.Tensor
Cardinality of each hyperedge.
- unique_labels : dict
Dictionary mapping labels to their counts.
- class_node_idxs : dict
Dictionary mapping labels to node indices.
- count_labels : torch.Tensor
Count of nodes for each label.
- n_nodes : int
Total number of nodes in the hypergraph.
- Returns:
- torch.Tensor
Baseline matrix containing expected affinity scores for each class and degree type.
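To make "expected affinity score" concrete, here is a minimal sketch of one natural baseline: the probability that a hyperedge of size k containing a given class node has exactly t class members, when its other k - 1 nodes are drawn uniformly at random. This hypergeometric form is an assumption about the null model and the function name `baseline_score` is hypothetical; the method's actual computation may differ.

```python
from math import comb


def baseline_score(n_nodes: int, class_size: int, t: int, k: int) -> float:
    """Hypothetical helper: chance that a size-k hyperedge around a fixed
    class node contains exactly t class nodes (the fixed node included),
    with the remaining k - 1 nodes chosen uniformly at random."""
    same = comb(class_size - 1, t - 1)         # pick the other t - 1 class nodes
    other = comb(n_nodes - class_size, k - t)  # pick the k - t non-class nodes
    total = comb(n_nodes - 1, k - 1)           # all ways to pick k - 1 companions
    return same * other / total


# 10 nodes, a class of 4, hyperedges of size 3: one score per t = 1..3.
scores = [baseline_score(10, 4, t, 3) for t in range(1, 4)]
```

Under this assumption the scores for a fixed class and hyperedge size sum to 1 over t, so they form a distribution against which observed type-t shares can be compared.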
- forward(data)
Apply the transform to the input data.
- Parameters:
- data : torch_geometric.data.Data
The input data.
- Returns:
- torch_geometric.data.Data
The transformed data.
- comb(N, k, *, exact=False, repetition=False)
The number of combinations of N things taken k at a time.
This is often expressed as “N choose k”.
- Parameters:
- N : int, ndarray
Number of things.
- k : int, ndarray
Number of elements taken.
- exact : bool, optional
For integers, if exact is False, then floating point precision is used, otherwise the result is computed exactly.
- repetition : bool, optional
If repetition is True, then the number of combinations with repetition is computed.
- Returns:
- val : int, float, ndarray
The total number of combinations.
See also
binom : Binomial coefficient considered as a function of two real variables.
Notes
Array arguments accepted only for exact=False case.
If N < 0, or k < 0, then 0 is returned.
If k > N and repetition=False, then 0 is returned.
Examples
>>> import numpy as np
>>> from scipy.special import comb
>>> k = np.array([3, 4])
>>> n = np.array([10, 10])
>>> comb(n, k, exact=False)
array([ 120., 210.])
>>> comb(10, 3, exact=True)
120
>>> comb(10, 3, exact=True, repetition=True)
220
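The repetition option and the k > N note can be cross-checked against the standard library. The sketch below uses `math.comb` rather than `scipy.special.comb`, and relies on the identity that the number of combinations with repetition of N things taken k at a time is C(N + k - 1, k). The helper name `comb_with_repetition` is hypothetical.

```python
from math import comb


def comb_with_repetition(n: int, k: int) -> int:
    """Hypothetical helper: number of size-k multisets drawn from n items,
    via the identity C(n + k - 1, k)."""
    return comb(n + k - 1, k)


plain = comb(10, 3)                      # ordinary "10 choose 3"
repeated = comb_with_repetition(10, 3)   # same inputs, repetition allowed
too_many = comb(3, 5)                    # k > N without repetition

# Unlike the Notes above, math.comb raises ValueError for negative
# arguments instead of returning 0.
```

The values agree with the exact=True results shown in the Examples above.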