Abstract
The $2$-Wasserstein distance is sensitive to minor geometric differences between distributions, making it a very powerful dissimilarity metric. However, due to this sensitivity, a small outlier mass can also cause a significant increase in the $2$-Wasserstein distance between two similar distributions. Similarly, sampling discrepancy can cause the empirical $2$-Wasserstein distance on $n$ samples in $\mathbb{R}^2$ to converge to the true distance at a rate of $n^{-1/4}$, which is significantly slower than the rate of $n^{-1/2}$ for $1$-Wasserstein distance. We introduce a new family of distances parameterized by $k \ge 0$, called $k$-RPW, that is based on computing the partial $2$-Wasserstein distance. We show that (1) $k$-RPW satisfies the metric properties, (2) $k$-RPW is robust to small outlier mass while retaining the sensitivity of $2$-Wasserstein distance to minor geometric differences, and (3) when $k$ is a constant, $k$-RPW distance between empirical distributions on $n$ samples in $\mathbb{R}^2$ converges to the true distance at a rate of $n^{-1/3}$, which is faster than the convergence rate of $n^{-1/4}$ for the $2$-Wasserstein distance. Using the partial $p$-Wasserstein distance, we extend our distance to any $p \in [1,\infty]$. By setting parameters $k$ or $p$ appropriately, we can reduce our distance to the total variation, $p$-Wasserstein, and the L\'evy-Prokhorov distances. Experiments show that our distance function achieves higher accuracy in comparison to the $1$-Wasserstein, $2$-Wasserstein, and TV distances for image retrieval tasks on noisy real-world data sets.