Pinsker's inequality for adapted total variation
Pinsker's classical inequality asserts that the total variation $TV(\mu, \nu)$ between two probability measures is bounded by $\sqrt{2H(\mu|\nu)}$, where $H$ denotes the relative entropy (or Kullback-Leibler divergence). With respect to the discrete metric, $TV$ can be seen as a Wasserstein distance and as such possesses an adapted variant $ATV$. Adapted Wasserstein distances have distinct advantages over their classical counterparts when $\mu, \nu$ are the laws of stochastic processes $(X_k)_{k=1}^n, (Y_k)_{k=1}^n$, and they have numerous applications ranging from stochastic control to machine learning. In this note we observe that the adapted total variation distance $ATV$ satisfies the Pinsker-type inequality $$ ATV(\mu, \nu)\leq \sqrt{n} \sqrt{2 H(\mu|\nu)}.$$
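
As a small numerical illustration of the classical bound, the following sketch checks Pinsker's inequality on a pair of finite distributions and evaluates the right-hand side of the adapted bound for a given number of time steps $n$. It is a sanity check under stated assumptions, not a computation of $ATV$ itself: the total variation is taken here as the $\ell^1$ distance $\sum_i |p_i - q_i|$ (an assumption about normalization; with the alternative convention of half this sum the bound holds a fortiori), the entropy is in nats, and the function names and example distributions are hypothetical.

```python
import math


def relative_entropy(p, q):
    """Relative entropy H(p|q) in nats for finite distributions.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def l1_distance(p, q):
    """Sum of absolute differences, used here as the total variation (assumed convention)."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


def adapted_pinsker_bound(n, h):
    """Right-hand side sqrt(n) * sqrt(2 h) of the adapted inequality for n time steps."""
    return math.sqrt(n) * math.sqrt(2.0 * h)


# Illustrative distributions on a three-point space (made up for this sketch).
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

h = relative_entropy(p, q)
tv = l1_distance(p, q)
classical_bound = math.sqrt(2.0 * h)

print(f"TV = {tv:.4f}, sqrt(2H) = {classical_bound:.4f}")
print(f"adapted bound for n = 4: {adapted_pinsker_bound(4, h):.4f}")
```

For $n = 1$ the adapted bound reduces to the classical one, and for larger $n$ it grows by the factor $\sqrt{n}$.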