synthval.metrics.KLDivergenceEstimation#
- class synthval.metrics.KLDivergenceEstimation(drop_duplicates=True)#
Bases:
SimilarityMetric
Similarity metric computing an estimate of the Kullback-Leibler divergence, following the methodology proposed in the referenced paper. Note that the underlying algorithm can raise a division-by-zero error if duplicates are present in the distributions being compared.
- Parameters:
drop_duplicates (bool)
- drop_duplicates#
Flag controlling whether duplicates in the distributions are dropped automatically (default: True).
- Type:
bool, optional
References
Pérez-Cruz, F., "Kullback-Leibler divergence estimation of continuous distributions," IEEE International Symposium on Information Theory, 2008.
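For intuition, below is a minimal sketch of the 1-nearest-neighbour estimator described in the reference; `kl_divergence_knn` is a hypothetical illustration, not the library's internal implementation. It also shows where the division-by-zero risk comes from: a duplicated sample has a nearest-neighbour distance of zero, which appears in the denominator of the distance ratio.

```python
import numpy as np
from scipy.spatial import cKDTree

def kl_divergence_knn(real: np.ndarray, synth: np.ndarray) -> float:
    """Hypothetical sketch of a 1-NN KL divergence estimator (Perez-Cruz, 2008)."""
    n, d = real.shape
    m, _ = synth.shape
    # Distance from each real sample to its nearest *other* real sample
    # (k=2 because the closest point in the same set is the sample itself).
    rho = cKDTree(real).query(real, k=2)[0][:, 1]
    # Distance from each real sample to its nearest synthetic sample.
    nu = cKDTree(synth).query(real, k=1)[0]
    # If `real` contains duplicates, rho is zero for those rows and the
    # ratio nu / rho divides by zero -- hence the drop_duplicates flag.
    return (d / n) * np.sum(np.log(nu / rho)) + np.log(m / (n - 1))
```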
- calculate(real_dist_df, synth_dist_df)#
Compute an estimate of the Kullback-Leibler divergence between two sets of samples originating from two multivariate distributions, real_dist and synth_dist.
- Parameters:
real_dist_df (pandas.DataFrame) – Set of samples representing distribution real_dist.
synth_dist_df (pandas.DataFrame) – Set of samples representing distribution synth_dist.
- Returns:
A numpy array containing the estimated value of the Kullback-Leibler divergence.
- Return type:
numpy.ndarray
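A brief usage sketch, using the documented constructor and calculate signature; the two DataFrames here are hypothetical toy samples standing in for the real and synthetic datasets under comparison.

```python
import pandas as pd
from synthval.metrics import KLDivergenceEstimation

# Hypothetical toy samples from two bivariate distributions.
real_df = pd.DataFrame({"a": [0.1, 0.4, 0.9, 1.3], "b": [2.0, 1.7, 0.5, 0.8]})
synth_df = pd.DataFrame({"a": [0.2, 0.5, 1.0, 1.2], "b": [1.9, 1.6, 0.6, 0.9]})

# drop_duplicates=True (the default) guards against the
# division-by-zero error caused by duplicate samples.
metric = KLDivergenceEstimation(drop_duplicates=True)
kl_estimate = metric.calculate(real_df, synth_df)
print(kl_estimate)  # numpy.ndarray holding the estimated divergence
```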