synthval.metrics.PRScores
- class synthval.metrics.PRScores(row_batch_size=25000, col_batch_size=50000, num_nearest_n=3)
Bases:
SimilarityMetric
A similarity metric class that computes the Precision and Recall scores between two distributions (real_dist and synth_dist).
- Parameters:
row_batch_size (int)
col_batch_size (int)
num_nearest_n (int)
- row_batch_size
Size of the row batches used when computing pairwise distances. This provides a trade-off between memory usage and performance (default: 25000).
- Type:
int, optional
- col_batch_size
Size of the column batches used for computing pairwise distances (default: 50000).
- Type:
int, optional
- num_nearest_n
Number of nearest neighbors used to estimate the manifold of each distribution. The estimated manifolds are used for computing precision and recall (default: 3).
- Type:
int, optional
References
Tuomas Kynkäänniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, Timo Aila. Improved Precision and Recall Metric for Assessing Generative Models. Annual Conference on Neural Information Processing Systems, 2019.
- calculate(real_dist_df, synth_dist_df)
Compute Precision and Recall metrics between two distributions.
- Parameters:
real_dist_df (pandas.DataFrame) – DataFrame containing samples from distribution real_dist.
synth_dist_df (pandas.DataFrame) – DataFrame containing samples from distribution synth_dist.
- Returns:
A NumPy array containing the Precision and Recall scores between the distributions real_dist and synth_dist.
- Return type:
numpy.ndarray
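Example
The following is a minimal sketch of the k-nearest-neighbor manifold idea from the referenced paper, not the code used by PRScores. It is a brute-force NumPy version without batching, and the helper names (knn_radii, coverage, precision_recall) are hypothetical. It assumes the inputs are plain 2-D arrays, such as those obtained with DataFrame.to_numpy().

```python
import numpy as np


def knn_radii(features, k):
    """Distance from each point to its k-th nearest neighbor in the same set."""
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # After sorting, column 0 holds each point's zero distance to itself,
    # so column k is the k-th nearest *other* point.
    return np.sort(dists, axis=1)[:, k]


def coverage(manifold_pts, query_pts, k):
    """Fraction of query points inside the estimated manifold of manifold_pts.

    The manifold is approximated by the union of hyperspheres centered on
    each manifold point, with radius equal to its k-th neighbor distance.
    """
    radii = knn_radii(manifold_pts, k)
    diffs = query_pts[:, None, :] - manifold_pts[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    inside_any = np.any(dists <= radii[None, :], axis=1)
    return float(np.mean(inside_any))


def precision_recall(real, synth, k=3):
    """Illustrative stand-in (assumption): returns [precision, recall]."""
    precision = coverage(real, synth, k)  # synthetic samples on the real manifold
    recall = coverage(synth, real, k)     # real samples on the synthetic manifold
    return np.array([precision, recall])


rng = np.random.default_rng(0)
real = rng.normal(size=(50, 2))
identical = precision_recall(real, real.copy())
disjoint = precision_recall(real, real + 1000.0)
```

In this sketch, identical sample sets score 1.0 on both metrics and widely separated sets score 0.0. The quadratic memory cost of the full distance matrix is what the row and column batch sizes of the class are meant to control.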
- static __compute_distances(row_features, col_features, col_batch_size)
Compute pairwise distances between row and column features, processing the column features in batches to limit memory usage.
- Parameters:
row_features (torch.Tensor) – Tensor of feature vectors representing the rows.
col_features (torch.Tensor) – Tensor of feature vectors representing the columns.
col_batch_size (int) – Size of the column batches for computing distances.
- Returns:
dist_batches – Tensor of pairwise distances.
- Return type:
torch.Tensor
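Example
The column batching described above can be sketched as follows. This is an illustration under stated assumptions, not the private method's actual code: it uses NumPy arrays instead of torch tensors, assumes Euclidean distance, and the function name batched_distances is hypothetical.

```python
import numpy as np


def batched_distances(row_features, col_features, col_batch_size):
    """Pairwise Euclidean distances, computed one column batch at a time."""
    blocks = []
    for start in range(0, col_features.shape[0], col_batch_size):
        col_batch = col_features[start:start + col_batch_size]
        # Only a (rows x batch x dims) difference array exists at any moment,
        # rather than one covering every column feature at once.
        diffs = row_features[:, None, :] - col_batch[None, :, :]
        blocks.append(np.linalg.norm(diffs, axis=-1))
    # Stitch the per-batch blocks back into a (rows x cols) distance matrix.
    return np.concatenate(blocks, axis=1)


rng = np.random.default_rng(1)
rows = rng.normal(size=(7, 3))
cols = rng.normal(size=(10, 3))
dist_batches = batched_distances(rows, cols, col_batch_size=4)
```

A smaller col_batch_size lowers the size of the intermediate difference array at the cost of more loop iterations. The result is the same as computing all distances in one pass.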