$c$-Concavity and Cyclical Monotonicity
In this article, we will introduce some results for $c$-concavity and cyclical monotonicity, which will be used to prove the duality result in the next article.
Proposition 1.3 Let $c:X \times Y \to \mathbb{R}$. For $\varphi:X \to \mathbb{R} \cup \{-\infty\}$, we have
$$
\varphi^{c\bar c} \ge \varphi.
$$
Moreover, the equality holds if and only if $\varphi$ is $c$-concave. In general, $\varphi^{c\bar c}$ is the smallest $c$-concave function that is larger than $\varphi$.
Proof : First, we show $\varphi^{c\bar c} \ge \varphi$. For any $x \in X$, $y \in Y$,
$$
\varphi^c(y)=\inf_{x\in X}[c(x,y)-\varphi(x)] \le c(x,y)-\varphi(x).
$$
Therefore,
$$
\varphi^{c\bar c}(x)=\inf_{y\in Y}[c(x,y)-\varphi^c(y)]
\ge \inf_{y\in Y}[c(x,y)-(c(x,y)-\varphi(x))]=\varphi(x).
$$
Then, let us prove $\varphi^{c\bar c}=\varphi$ if and only if $\varphi$ is $c$-concave. It is obvious that if $\varphi^{c\bar c}=\varphi$ then $\varphi=(\varphi^c)^{\bar c}$, which shows that $\varphi$ is $c$-concave. If $\varphi$ is $c$-concave, then there exists a function $\zeta:Y\to\mathbb{R}\cup\{-\infty\}$ such that
$$
\varphi=\zeta^{\bar c}.
$$
Hence
$$
\varphi^c=\zeta^{\bar c c}\ge \zeta,
$$
and then
$$
\varphi^{c\bar c}\le \zeta^{\bar c}=\varphi,
$$
which implies $\varphi^{c\bar c}=\varphi$.
Finally, if $\psi\ge \varphi$ and $\psi$ is $c$-concave, we need to show $\psi\ge \varphi^{c\bar c}$. Suppose $\psi=\eta^{\bar c}$, then
$$
\psi=\eta^{\bar c}\ge \varphi
\Rightarrow \eta^{\bar c c}\le \varphi^c
\Rightarrow \eta\le \varphi^c
\Rightarrow \psi=\eta^{\bar c}\ge \varphi^{c\bar c}.\quad \square
$$
Recall the sub-differential for convex functions
Definition 1.5 For every convex function $f:\mathbb{R}^d\to\mathbb{R}\cup\{+\infty\}$, we define its subdifferential at $x$ as the set
$$
\partial f(x):=\{p\in\mathbb{R}^d:\ f(y)\ge f(x)+p\cdot (y-x),\ \forall y\in\mathbb{R}^d\}.
$$
Proposition 1.4 For every convex function $f:\mathbb{R}^d\to\mathbb{R}$, we have
(1) $\partial f(x)\neq \emptyset$ if $x\in \operatorname{int}(\operatorname{dom} f)$.
(2) If $f$ is differentiable at $x$, then $\partial f(x)=\{\nabla f(x)\}$.
(3) $p\in \partial f(x)$ if and only if $x\in \partial f^\star(p)$ if and only if
$$
f(x)+f^\star(p)=x\cdot p.
$$
(4) If $p_1\in \partial f(x_1)$, $p_2\in \partial f(x_2)$, then
$$
(p_1-p_2)\cdot (x_1-x_2)\ge 0.
$$
Definition 1.6 Let us define the graph of the subdifferential of a convex function as
$$
\operatorname{Graph}(\partial f):=\{(x,p):\ p\in \partial f(x)\}
=\{(x,p):\ f(x)+f^\star(p)=x\cdot p\}.
$$
Definition 1.7 A set $A\subset \mathbb{R}^d\times \mathbb{R}^d$ is said to be cyclically monotone if for every $k\in \mathbb{N}$, every permutation $\sigma$ and every finite family of points $(x_1,p_1),\ldots,(x_k,p_k)\in A$, we have
$$
\sum_{i=1}^k x_i\cdot p_i \ge \sum_{i=1}^k x_i\cdot p_{\sigma(i)}.
$$
Note that if we take $k=2$, we get the usual definition of monotonicity
$$
x_1\cdot p_1+x_2\cdot p_2\ge x_1\cdot p_2+x_2\cdot p_1
\Leftrightarrow
(p_1-p_2)\cdot (x_1-x_2)\ge 0.
$$
By Rockafellar theorem, every cyclically monotone set is contained in the graph of the subdifferential of a convex function.
c-Cyclical Monotonicity
Definition 1.8 Once a function $c:X\times Y\to \mathbb{R}\cup\{+\infty\}$ is given, we say that a set $\Gamma\subset X\times Y$ is $c$-cyclically monotone (briefly $c$-CM) if for every $k\in\mathbb{N}$, every permutation $\sigma$ and every finite family of points $(x_1,y_1),\ldots,(x_k,y_k)\in \Gamma$, we have
$$
\sum_{i=1}^k c(x_i,y_i)\le \sum_{i=1}^k c(x_i,y_{\sigma(i)}).
$$
Remark : Take $c(x,y)=\frac12|x-y|^2$, we get $\Gamma$ is cyclically monotone.
Theorem 1.6 If $\Gamma\neq \emptyset$ is a $c$-CM set in $X\times Y$ and $c:X\times Y\to\mathbb{R}$ (note that $c$ is required not to take the value $+\infty$), then there exists a $c$-concave function $\varphi:X\to \mathbb{R}\cup\{-\infty\}$ (different from the constant $-\infty$ function) such that
$$
\Gamma\subset \{(x,y)\in X\times Y:\ \varphi(x)+\varphi^c(y)=c(x,y)\}.
$$
Proof : Let us fix a point $(x_0,y_0)\in \Gamma$. For $x\in X$ set
$$
\varphi(x)=\inf \Big\{
c(x,y_n)-c(x_n,y_n)+c(x_n,y_{n-1})-c(x_{n-1},y_{n-1})
+\cdots +c(x_1,y_0)-c(x_0,y_0): n\in \mathbb{N},\ (x_i,y_i)\in \Gamma \text{ for all } i=1,2,\ldots,n
\Big\}.
$$
Since $c$ is real valued and $\Gamma$ is nonempty, $\varphi$ never takes the value $+\infty$.
If we set for $y\in Y$
$$
-\psi(y)=\inf \Big\{
-c(x_n,y)+c(x_n,y_n)-c(x_{n-1},y_n)+c(x_{n-1},y_{n-1})-\cdots +c(x_1,y_0)-c(x_0,y_0): n\in \mathbb{N},\ (x_i,y_i)\in \Gamma \text{ for all } i=1,2,\ldots,n,\ \text{and } y_n=y
\Big\}.
$$
By definition
$$
\psi(y)>-\infty
\Leftrightarrow
y\in (\pi_Y)(\Gamma)=\{y\in Y:\ \exists x\in X \text{ s.t. } (x,y)\in \Gamma\}.
$$
Next, we prove that $\varphi=\psi^c$, which implies $\varphi$ is $c$-concave. Indeed, if $y\notin (\pi_Y)(\Gamma)$, $\psi(y)=-\infty$, then $c(x,y)-\psi(y)=+\infty$. Hence, we only need to consider $y\in (\pi_Y)(\Gamma)$. For $y\in (\pi_Y)(\Gamma)$,
$$
\begin{aligned}
c(x,y)-\psi(y)
&=
c(x,y)+\inf \Big\{
-c(x_n,y)+c(x_n,y_n)-c(x_{n-1},y_n)+\cdots +c(x_1,y_0)-c(x_0,y_0):(x_i,y_i)\in \Gamma \text{ for all } i\ge 1,\ldots,n,\ \text{and } y_n=y
\Big\}\\\
&=
\inf \Big\{
c(x,y_n)-c(x_n,y_n)+c(x_n,y_{n-1})-c(x_{n-1},y_{n-1})+\cdots +c(x_1,y_0)-c(x_0,y_0):(x_i,y_i)\in \Gamma \text{ for all } i=1,2,\ldots,n,\ \text{and } y_n=y
\Big\}.
\end{aligned}
$$
Take the infimum of $y=y_n$, we get
$$
\inf_y[c(x,y)-\psi(y)]=\varphi(x),
$$
that is
$$
\varphi=\psi^c.
$$
Then, we claim: $\varphi(x_0)\ge 0$ and hence $\varphi \not\equiv -\infty$. In fact, for all $n\in \mathbb{N}$ and $(x_i,y_i)\in \Gamma$, $i=1,2,\ldots,n$,
$$
c(x_0,y_n)-c(x_n,y_n)+c(x_n,y_{n-1})-c(x_{n-1},y_{n-1})+\cdots +c(x_1,y_0)-c(x_0,y_0)=
\sum_{i=0}^n c(x_{i+1},y_i)-\sum_{i=0}^n c(x_i,y_i)\ge 0.
$$
Hence $\varphi(x_0)\ge 0$.
Now, we need to show $\varphi(x)+\varphi^c(y)\ge c(x,y)$ for $(x,y)\in \Gamma$. Since $\varphi=\psi^c$ and we have $\varphi^c=\psi^{c\bar c}\ge \psi$, we only need to show
$$
\varphi(x)+\psi(y)\ge c(x,y)
\qquad \text{for } (x,y)\in \Gamma.
$$
Fix $(x,y)\in \Gamma$. Since
$$
\varphi(x)=\psi^c(x)=\inf_{y\in (\pi_Y)(\Gamma)}[c(x,y)-\psi(y)],
$$
for all $\varepsilon>0$, there exists $\bar y\in (\pi_Y)(\Gamma)$ such that
$$
c(x,\bar y)-\psi(\bar y)<\varphi(x)+\varepsilon. \tag{1}
$$
Claim :
$$
-\psi(y)\le -c(x,y)+c(x,\bar y)-\psi(\bar y).
$$
In fact, since $\bar y\in (\pi_Y)(\Gamma)$, there exists a chain
$$
(x_1,y_1),\ldots,(x_n,y_n)\in \Gamma,
\qquad \text{with } y_n=\bar y.
$$
Since $(x,y)\in \Gamma$, we can add it to the chain above, and hence
$$
-\psi(y)\le -c(x,y)+c(x,\bar y)-c(x_n,\bar y)+c(x_n,y_n)-c(x_{n-1},y_n)+c(x_{n-1},y_{n-1})+\cdots +c(x_1,y_0)-c(x_0,y_0).
$$
Take the infimum, we get
$$
-\psi(y)\le -c(x,y)+c(x,\bar y)-\psi(\bar y). \tag{2}
$$
Therefore, combined by (1) and (2), we have
$$
-\psi(y)<-c(x,y)+\varphi(x)+\varepsilon,
$$
by the arbitrarity of $\varepsilon>0$, we get
$$
\varphi(x)+\psi(y)\ge c(x,y).
$$
Since $\varphi^c\ge \psi$, which completes the prove. $\square$
Theorem 1.7 If $\gamma$ is an optimal transport plan for the cost $c$ and $c$ is continuous, then $\operatorname{Supp}(\gamma)$ is a $c$-CM set.
Proof : Suppose by contradiction that there exists $k$, $\sigma$ and $(x_1,y_1),\ldots,(x_k,y_k)\in \operatorname{Supp}(\gamma)$ such that
$$
\sum_{i=1}^k c(x_i,y_i)>\sum_{i=1}^k c(x_i,y_{\sigma(i)}).
$$
Take
$$
0<\varepsilon<\frac{1}{2k}\left(\sum_{i=1}^k c(x_i,y_i)-\sum_{i=1}^k c(x_i,y_{\sigma(i)})\right).
$$
By continuity of $c$, there exists $r>0$ such that for all $i=1,2,\ldots,k$, we have
$$
c(x,y)>c(x_i,y_i)-\varepsilon
\qquad \text{for all } (x,y)\in B(x_i,r)\times B(y_i,r),
$$
and
$$
c(x,y)<c(x_i,y_{\sigma(i)})+\varepsilon
\qquad \text{for all } (x,y)\in B(x_i,r)\times B(y_{\sigma(i)},r).
$$
Now, consider
$$
V_i:=B(x_i,r)\times B(y_i,r).
$$
Note that since $(x_i,y_i)\in \operatorname{supp}(\gamma)$, we have
$$
\gamma(V_i)>0
\qquad \text{for all } i=1,\ldots,k.
$$
Define measures
$$
\gamma_i:=\gamma \llcorner V_i/\gamma(V_i)
\quad \text{and} \quad
\mu_i=(\pi_x)_\sharp \gamma_i,\ \nu_i=(\pi_y)_\sharp \gamma_i.
$$
Take
$$
\varepsilon_0<\frac{1}{k}\min_i \gamma(V_i).
$$
For every $i$, build a measure $\widetilde\gamma_i\in \Pi(\mu_i,\nu_{\sigma(i)})$ and define
$$
\widetilde\gamma:=\gamma-\varepsilon_0\sum_{i=1}^k \gamma_i+\varepsilon_0\sum_{i=1}^k \widetilde\gamma_i.
$$
Claim : $\widetilde\gamma\in \Pi(\mu,\nu)$ and
$$
\int_{X\times Y} c\ d\widetilde\gamma<\int_{X\times Y} c\ d\gamma.
$$
First, we check $\widetilde\gamma$ is a positive measure. It is sufficient to prove that $\gamma-\varepsilon_0\sum_{i=1}^k \gamma_i$ is positive, and for that, the condition
$$
\varepsilon_0\gamma_i<\frac1k\gamma
$$
is enough.
Since
$$
\varepsilon_0\gamma_i=\frac{\varepsilon_0}{\gamma(V_i)}\ \gamma\llcorner V_i,
$$
and
$$
\frac{\varepsilon_0}{\gamma(V_i)}<\frac1k,
$$
we get the desired result.
Now, let us check the marginals of $\widetilde\gamma$. We have
$$
(\pi_x)_\sharp \widetilde \gamma = \mu-\varepsilon_0 \sum _{i=1} ^k \mu_i+\varepsilon_0\sum _{i=1} ^k \mu_i = \mu
$$
and
$$
(\pi_y)_\sharp \widetilde\gamma = \nu-\varepsilon _0 \sum _{i=1} ^k \nu_i + \varepsilon_0 \sum _{i=1} ^k \nu _{\sigma(i)} = \nu.
$$
Finally, let us estimate
$$
\int_{X\times Y} c\ d\gamma-\int_{X\times Y} c\ d\widetilde\gamma.
$$
We have
$$
\begin{aligned}
\int_{X\times Y} c\ d\gamma-\int_{X\times Y} c\ d\widetilde\gamma
&=
\varepsilon_0\sum_{i=1}^k \int_{X\times Y} c\ d\gamma_i
-\varepsilon_0\sum_{i=1}^k \int_{X\times Y} c\ d\widetilde\gamma_i\\\
&\ge
\varepsilon_0\sum_{i=1}^k (c(x_i,y_i)-\varepsilon)
-\varepsilon_0\sum_{i=1}^k (c(x_i,y_{\sigma(i)})+\varepsilon)\\\
&=
\varepsilon_0\left(
\sum_{i=1}^k c(x_i,y_i)-\sum_{i=1}^k c(x_i,y_{\sigma(i)})-2k\varepsilon
\right)>0,
\end{aligned}
$$
where we use the fact that $\gamma_i$ is concentrated on $B(x_i,r)\times B(y_i,r)$ and $\widetilde\gamma_i$ is concentrated on $B(x_i,r)\times B(y_{\sigma(i)},r)$. $\square$
Reference
Santambrogio, Filippo. Optimal transport for applied mathematicians. Birkäuser, NY 55.58-63 (2015): 94.
The cover image of this article was taken in Cromwell, New Zealand.
$c$-Concavity and Cyclical Monotonicity
