PyTorch API
geomloss: Geometric Loss functions, with full support of PyTorch’s autograd engine:

class geomloss.SamplesLoss(loss='sinkhorn', p=2, blur=0.05, reach=None, diameter=None, scaling=0.5, truncate=5, cost=None, kernel=None, cluster_scale=None, debias=True, potentials=False, verbose=False, backend='auto')
Creates a criterion that computes distances between sampled measures on a vector space.
Warning
If loss is "sinkhorn" and reach is None (balanced Optimal Transport), the resulting routine expects measures whose total masses are equal to each other.

Parameters
loss (string, default = "sinkhorn") – The loss function to compute. The supported values are:
  "sinkhorn": (Unbiased) Sinkhorn divergence, which interpolates between Wasserstein (blur=0) and kernel (blur= \(+\infty\)) distances.
  "hausdorff": Weighted Hausdorff distance, which interpolates between the ICP loss (blur=0) and a kernel distance (blur= \(+\infty\)).
  "energy": Energy Distance MMD, computed using the kernel \(k(x,y) = -\|x-y\|_2\).
  "gaussian": Gaussian MMD, computed using the kernel \(k(x,y) = \exp \big( -\|x-y\|_2^2 \,/\, 2\sigma^2 \big)\) of standard deviation \(\sigma\) = blur.
  "laplacian": Laplacian MMD, computed using the kernel \(k(x,y) = \exp \big( -\|x-y\|_2 \,/\, \sigma \big)\) of standard deviation \(\sigma\) = blur.
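To make the three kernel formulas above concrete, here is a minimal pure-Python sketch that evaluates them on a single pair of points. The function names (`energy_kernel`, `gaussian_kernel`, `laplacian_kernel`) are illustrative only and are not part of geomloss, which evaluates these kernels on whole point clouds.

```python
import math

def energy_kernel(x, y):
    """k(x, y) = -||x - y||_2 (no bandwidth parameter)."""
    return -math.dist(x, y)

def gaussian_kernel(x, y, sigma):
    """k(x, y) = exp(-||x - y||_2^2 / (2 * sigma^2))."""
    return math.exp(-math.dist(x, y) ** 2 / (2 * sigma ** 2))

def laplacian_kernel(x, y, sigma):
    """k(x, y) = exp(-||x - y||_2 / sigma)."""
    return math.exp(-math.dist(x, y) / sigma)

# Two points at Euclidean distance 0.5, with sigma playing the role of blur.
x, y = (0.0, 0.0), (0.3, 0.4)
sigma = 0.5

print(energy_kernel(x, y))
print(gaussian_kernel(x, y, sigma))
print(laplacian_kernel(x, y, sigma))
```

The Gaussian and Laplacian kernels both equal 1 at zero distance and decay at a rate set by sigma, whereas the energy kernel has no bandwidth at all.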
p (int, default=2) – If loss is "sinkhorn" or "hausdorff", specifies the ground cost function between points. The supported values are:
  p = 1: \(~~C(x,y) ~=~ \|x-y\|_2\).
  p = 2: \(~~C(x,y) ~=~ \tfrac{1}{2}\|x-y\|_2^2\).
blur (float, default=.05) – The finest level of detail that should be handled by the loss function, in order to prevent overfitting on the samples’ locations.
  If loss is "gaussian" or "laplacian", it is the standard deviation \(\sigma\) of the convolution kernel.
  If loss is "sinkhorn" or "hausdorff", it is the typical scale \(\sigma\) associated to the temperature \(\varepsilon = \sigma^p\). The default value of .05 is sensible for input measures that lie in the unit square/cube.
Note that the Energy Distance is scale-equivariant, and won’t be affected by this parameter.
reach (float, default=None= \(+\infty\)) – If loss is "sinkhorn" or "hausdorff", specifies the typical scale \(\tau\) associated to the constraint strength \(\rho = \tau^p\).
diameter (float, default=None) – A rough indication of the maximum distance between points, which is used to tune the \(\varepsilon\)-scaling descent and provide a default heuristic for clustering multiscale schemes. If None, a conservative estimate will be computed on-the-fly.
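One way to picture the descent that diameter helps tune is as a geometric sequence of scales \(\sigma\), starting at the diameter and shrinking by a constant ratio (the scaling argument, 0.5 in the signature above) until it reaches blur, with temperature \(\varepsilon = \sigma^p\) at each step. The helper below, `sigma_schedule`, is a hypothetical illustration written for this page under that assumption; it is not the routine that geomloss runs internally.

```python
def sigma_schedule(diameter, blur, scaling):
    """Hypothetical helper: geometric sequence of scales from diameter down to blur.

    This is a sketch of the idea only, not the geomloss implementation.
    """
    if not 0.0 < scaling < 1.0:
        raise ValueError("scaling must lie strictly between 0 and 1")
    if blur <= 0.0:
        raise ValueError("blur must be positive")
    sigmas = []
    sigma = diameter
    while sigma > blur:
        sigmas.append(sigma)
        sigma *= scaling      # ratio between successive scales
    sigmas.append(blur)       # always finish exactly at the target scale
    return sigmas

p = 2
sigmas = sigma_schedule(diameter=1.0, blur=0.05, scaling=0.5)
epsilons = [s ** p for s in sigmas]   # temperature at each step: epsilon = sigma^p

print(sigmas)
print(epsilons)
```

Under this sketch, a ratio closer to 1 yields many more steps (slower, more accurate) while a small ratio yields only a handful (faster, coarser).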
scaling (float, default=.5) – If loss is "sinkhorn", specifies the ratio between successive values of \(\sigma=\varepsilon^{1/p}\) in the \(\varepsilon\)-scaling descent. This parameter allows you to specify the trade-off between speed (scaling < .4) and accuracy (scaling > .9).
truncate (float, default=5; None = \(+\infty\)) – If backend is "multiscale", specifies the effective support of a Gaussian/Laplacian kernel as a multiple of its standard deviation. If truncate is not None, kernel truncation steps will assume that \(\exp(-x/\sigma)\) or \(\exp(-x^2/2\sigma^2)\) are zero when \(\|x\| \,>\, \text{truncate}\cdot \sigma\).
cost (function or string, default=None) – If loss is "sinkhorn" or "hausdorff", specifies the cost function that should be used instead of \(\tfrac{1}{p}\|x-y\|^p\):
  If backend is "tensorized", cost should be a Python function that takes as input a (B,N,D) torch Tensor x and a (B,M,D) torch Tensor y, and returns a batched cost matrix as a (B,N,M) Tensor.
  Otherwise, if backend is "online" or "multiscale", cost should be a KeOps formula, given as a string, with variables X and Y. The default values are "Norm2(X-Y)" (for p = 1) and "(SqDist(X,Y) / IntCst(2))" (for p = 2).
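As an illustration of the function signature described above for the "tensorized" backend, the sketch below builds (B,N,M) cost matrices from (B,N,D) and (B,M,D) inputs using plain PyTorch broadcasting. The names `half_squared_cost` and `manhattan_cost` are hypothetical examples, not geomloss functions; the first reproduces the p = 2 formula, and the second is an arbitrary alternative chosen for illustration.

```python
import torch

def half_squared_cost(x, y):
    """C(x_i, y_j) = 0.5 * ||x_i - y_j||_2^2, the p = 2 ground cost.

    x: (B, N, D) tensor, y: (B, M, D) tensor -> (B, N, M) tensor.
    """
    # (B, N, 1, D) - (B, 1, M, D) broadcasts to (B, N, M, D)
    diff = x.unsqueeze(2) - y.unsqueeze(1)
    return 0.5 * (diff ** 2).sum(dim=-1)

def manhattan_cost(x, y):
    """Hypothetical alternative cost: C(x_i, y_j) = ||x_i - y_j||_1."""
    return (x.unsqueeze(2) - y.unsqueeze(1)).abs().sum(dim=-1)

x = torch.rand(4, 10, 3)   # B = 4 batches of N = 10 points in dimension D = 3
y = torch.rand(4, 7, 3)    # B = 4 batches of M = 7 points in dimension D = 3
C = half_squared_cost(x, y)
print(C.shape)  # torch.Size([4, 10, 7])
```

Going by the parameter description, a function of this shape would be supplied as the cost argument together with backend="tensorized". Note that the broadcasted difference materializes a (B,N,M,D) tensor, so this style only suits small to moderate sample counts.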
cluster_scale (float, default=None) – If backend is "multiscale", specifies the coarse scale at which cluster centroids will be computed. If None, a conservative estimate will be computed from diameter and the ambient space’s dimension, making sure that memory overflows won’t take place.
debias (bool, default=True) – If loss is "sinkhorn", specifies if we should compute the unbiased Sinkhorn divergence instead of the classic, entropy-regularized “SoftAssign” loss.
potentials (bool, default=False) – When this parameter is set to True, the SamplesLoss layer returns a pair of optimal dual potentials \(F\) and \(G\), sampled on the input measures, instead of a differentiable scalar value. These dual vectors \((F(x_i))\) and \((G(y_j))\) are encoded as Torch tensors, with the same shape as the input weights \((\alpha_i)\) and \((\beta_j)\).
verbose (bool, default=False) – If backend is "multiscale", specifies whether information on the clustering and \(\varepsilon\)-scaling descent should be displayed in the standard output.
backend (string, default = "auto") – The implementation that will be used in the background; this choice has a major impact on performance. The supported values are:
  "auto": Choose automatically, using a simple heuristic based on the inputs’ shapes.
  "tensorized": Relies on a full cost/kernel matrix, computed once and for all and stored on the device memory. This method is fast, but has a quadratic memory footprint and does not scale beyond ~5,000 samples per measure.
  "online": Computes cost/kernel values on-the-fly, leveraging online map-reduce CUDA routines provided by the pykeops library.
  "multiscale": Fast implementation that scales to millions of samples in dimension 1-2-3, relying on the block-sparse reductions provided by the pykeops library.