synthval.metrics.PRScores

class synthval.metrics.PRScores(row_batch_size=25000, col_batch_size=50000, num_nearest_n=3)

Bases: SimilarityMetric

A Similarity Metric class that computes the Precision and Recall scores between two distributions (real_dist and synth_dist).

Parameters:
  • row_batch_size (int)

  • col_batch_size (int)

  • num_nearest_n (int)

row_batch_size

Number of rows processed per batch when computing pairwise distances. Smaller batches lower peak memory usage at the cost of speed; larger batches do the opposite (default: 25000).

Type:

int, optional

col_batch_size

Number of columns processed per batch when computing pairwise distances (default: 50000).

Type:

int, optional

num_nearest_n

Number of nearest neighbors used to estimate the manifold of each distribution. Precision and recall are computed from these manifolds (default: 3).

Type:

int, optional
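
To illustrate how the neighbor count shapes the manifold estimate, the following is a minimal sketch of the radius step described by Kynkäänniemi et al.: each sample gets a ball whose radius is the distance to its k-th nearest neighbor. It uses NumPy rather than torch, assumes Euclidean distance, and the helper name `kth_neighbor_radii` is hypothetical; none of this is taken from the synthval source.

```python
import numpy as np


def kth_neighbor_radii(points, k):
    """Distance from each point to its k-th nearest neighbor (self excluded).

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point's zero distance to itself,
    # so column k holds the distance to the k-th nearest neighbor.
    return np.sort(dists, axis=1)[:, k]


# Four one-dimensional samples, spaced unevenly.
points = np.array([[0.0], [1.0], [3.0], [7.0]])
radii_k1 = kth_neighbor_radii(points, k=1)  # [1, 1, 2, 4]
radii_k2 = kth_neighbor_radii(points, k=2)  # [3, 2, 3, 6]
```

A larger neighbor count gives larger balls, so the estimated manifold covers more space and both scores tend to rise.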

References

Tuomas Kynkäänniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, Timo Aila - Improved Precision and Recall Metric for Assessing Generative Models - Annual Conference on Neural Information Processing Systems, 2019.

calculate(real_dist_df, synth_dist_df)

Compute Precision and Recall metrics between two distributions.

Parameters:
  • real_dist_df (pandas.DataFrame) – DataFrame containing samples from distribution real_dist.

  • synth_dist_df (pandas.DataFrame) – DataFrame containing samples from distribution synth_dist.

Returns:

A NumPy array containing the Precision and Recall scores between real_dist and synth_dist.

Return type:

numpy.ndarray
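
The computation behind this method can be sketched as follows, assuming it follows the k-nearest-neighbor formulation of the referenced paper. Precision is the fraction of synthetic samples that fall inside the real manifold, and recall is the fraction of real samples that fall inside the synthetic manifold. The function names, the use of NumPy arrays instead of DataFrames, the Euclidean distance, and the `[precision, recall]` ordering are all assumptions made for illustration, not the library's implementation.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distance between every row of a and every row of b."""
    diffs = a[:, None, :] - b[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1))


def coverage(reference, candidates, k):
    """Fraction of candidates lying inside the k-NN manifold of reference."""
    # Radius of each reference ball: distance to its k-th nearest neighbor.
    radii = np.sort(pairwise_dist(reference, reference), axis=1)[:, k]
    dists = pairwise_dist(candidates, reference)
    # A candidate is covered if it lies inside at least one reference ball.
    inside = (dists <= radii[None, :]).any(axis=1)
    return inside.mean()


def precision_recall(real, synth, k=3):
    """Hypothetical stand-in returning [precision, recall]."""
    precision = coverage(real, synth, k)
    recall = coverage(synth, real, k)
    return np.array([precision, recall])


real = np.array([[0.0], [1.0], [2.0], [3.0]])
synth = np.array([[0.5], [1.5], [10.0]])
scores = precision_recall(real, synth, k=1)
```

With these toy inputs the outlier at 10.0 lowers precision to 2/3, yet recall is 1.0 because the outlier's own ball is large enough to cover the remaining real sample. With DataFrames, the arrays could be obtained with `DataFrame.to_numpy()`.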

static __compute_distances(row_features, col_features, col_batch_size)

Compute pairwise distances between row and column features using batches to optimize memory usage.

Parameters:
  • row_features (torch.Tensor) – Tensor of feature vectors representing the rows.

  • col_features (torch.Tensor) – Tensor of feature vectors representing the columns.

  • col_batch_size (int) – Size of the column batches for computing distances.

Returns:

dist_batches – Tensor of pairwise distances.

Return type:

torch.Tensor
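
As a rough illustration of column batching, the sketch below computes the distance matrix one slice of columns at a time, so the intermediate difference tensor never holds more than `col_batch_size` columns. It is written with NumPy instead of torch and assumes Euclidean distance; the function `compute_distances_batched` is a hypothetical stand-in, not the private method itself.

```python
import numpy as np


def compute_distances_batched(row_features, col_features, col_batch_size):
    """Pairwise Euclidean distances, computed one column batch at a time."""
    batches = []
    for start in range(0, col_features.shape[0], col_batch_size):
        col_batch = col_features[start:start + col_batch_size]
        # Shape (n_rows, batch, n_features): bounded by the batch size.
        diffs = row_features[:, None, :] - col_batch[None, :, :]
        batches.append(np.sqrt((diffs ** 2).sum(axis=-1)))
    # Stitch the per-batch blocks back together along the column axis.
    return np.concatenate(batches, axis=1)


rng = np.random.default_rng(0)
rows = rng.normal(size=(5, 3))
cols = rng.normal(size=(7, 3))
batched = compute_distances_batched(rows, cols, col_batch_size=3)

# Reference result computed in a single pass, for comparison.
full = np.sqrt(((rows[:, None, :] - cols[None, :, :]) ** 2).sum(axis=-1))
```

The batched result matches the single-pass result; only the peak size of the intermediate array changes.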