A roadmap to understanding the convergence of value functions

The Why?

Why do we need the Banach fixed point theorem (BFPT) to prove convergence of the value function? A corollary of the BFPT ensures that the Bellman backup equation for value functions has a unique solution, and that repeatedly applying the backup operator from an arbitrary initialization in the vector space under consideration converges to that solution.

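Note that what makes the Bellman backup a contraction is primarily the discount factor, together with the rewards and the transition dynamics: in the usual setting of bounded rewards and a discount factor strictly below 1, the backup is a contraction in the sup norm.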
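To see where the discount factor enters, here is a sketch for policy evaluation, assuming a finite state space, a reward vector \(r\), transition probabilities \(P(s' \mid s)\) under the fixed policy, a discount factor \(\gamma \in [0, 1)\), and the sup norm \(\|\cdot\|_\infty\) (this notation is introduced here for illustration). With \((Tv)(s) = r(s) + \gamma \sum_{s'} P(s' \mid s)\, v(s')\), for any \(u, v\) and any state \(s\):

\[
\begin{aligned}
\left| (Tu)(s) - (Tv)(s) \right|
  &= \gamma \left| \sum_{s'} P(s' \mid s) \left( u(s') - v(s') \right) \right| \\
  &\le \gamma \sum_{s'} P(s' \mid s) \left| u(s') - v(s') \right| \\
  &\le \gamma \, \| u - v \|_\infty .
\end{aligned}
\]

Taking the maximum over \(s\) gives \(\|Tu - Tv\|_\infty \le \gamma \|u - v\|_\infty\), so under these assumptions \(T\) is a contraction with modulus \(\gamma\). The rewards cancel in the difference; they matter only for keeping \(Tv\) bounded, and the last step uses the fact that the transition probabilities sum to 1.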

Let \(\mathcal{U}\) be a Banach space and \(T : \mathcal{U}\rightarrow\mathcal{U}\) be a contraction mapping. Then:

1. There exists a unique \(v^* \in \mathcal{U}\) such that \(T v^* = v^*\).

2. For arbitrary \(v^0 \in \mathcal{U}\), the sequence \(\{v^n\}\) defined by \(v^{n+1} = Tv^{n} = T^{n+1}v^{0}\) converges to \(v^*\).
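Both statements can be checked numerically on a toy problem. The sketch below assumes a hypothetical 2-state Markov reward process (the transition matrix `P`, rewards `R`, and discount `GAMMA` are made-up illustrative values) and uses the policy-evaluation backup as the operator \(T\). It is a sanity check under those assumptions, not a proof.

```python
# Hypothetical 2-state Markov reward process; every number here is illustrative.
GAMMA = 0.9                      # discount factor, strictly below 1
P = [[0.8, 0.2], [0.3, 0.7]]     # row-stochastic transition matrix
R = [1.0, 0.0]                   # expected one-step reward per state


def bellman_backup(v):
    """Apply (T v)(s) = R[s] + GAMMA * sum_t P[s][t] * v[t]."""
    n = len(v)
    return [R[s] + GAMMA * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]


def sup_norm_distance(u, v):
    """Distance between two value vectors in the sup norm."""
    return max(abs(a - b) for a, b in zip(u, v))


def iterate(v0, n_steps):
    """Return v^n = T^n v^0 by applying the backup n_steps times."""
    v = list(v0)
    for _ in range(n_steps):
        v = bellman_backup(v)
    return v


# Two arbitrary starting points end up at (numerically) the same vector ...
v_a = iterate([0.0, 0.0], 500)
v_b = iterate([100.0, -50.0], 500)
gap_between_limits = sup_norm_distance(v_a, v_b)

# ... and that vector is left (numerically) unchanged by one more backup.
fixed_point_residual = sup_norm_distance(bellman_backup(v_a), v_a)
```

Both `gap_between_limits` and `fixed_point_residual` should be at the level of floating-point rounding. The 500 steps are deliberately generous: the distance to the fixed point shrinks by at least a factor of `GAMMA` per backup.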

Proof:

The proof is split into several parts for ease of development and understanding. We begin by proving part of statement 2.

Proof of statement 2

We first show that every sequence generated from an arbitrary starting point by repeatedly applying a contraction operator on the Banach space is Cauchy, and therefore convergent.
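A sketch of the standard chain of inequalities behind this claim, assuming \(T\) has contraction modulus \(\lambda \in [0, 1)\), that is, \(\|Tu - Tv\| \le \lambda \|u - v\|\) for all \(u, v \in \mathcal{U}\) (the symbol \(\lambda\) is introduced here for illustration). Applying the contraction property \(k\) times gives \(\|v^{k+1} - v^{k}\| \le \lambda^{k} \|v^{1} - v^{0}\|\), and then, for \(m > n\), the triangle inequality and the geometric series give:

\[
\begin{aligned}
\| v^{m} - v^{n} \|
  &\le \sum_{k=n}^{m-1} \| v^{k+1} - v^{k} \| \\
  &\le \sum_{k=n}^{m-1} \lambda^{k} \, \| v^{1} - v^{0} \| \\
  &= \lambda^{n} \, \frac{1 - \lambda^{m-n}}{1 - \lambda} \, \| v^{1} - v^{0} \| \\
  &\le \frac{\lambda^{n}}{1 - \lambda} \, \| v^{1} - v^{0} \| .
\end{aligned}
\]

The right-hand side does not depend on \(m\) and tends to 0 as \(n \to \infty\), which is what the Cauchy property requires.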

As \(n\) and \(m\) become larger, the upper bound in (4) on the distance between \(v^n\) and \(v^m\) becomes smaller and smaller, so every sequence \(\{v^n\}\) generated in this way is Cauchy and, since a Banach space is complete, convergent.