A roadway to understand the convergence of value functions
Why do we need the Banach fixed point theorem(BFPT) to prove convergence of value function. The corollary of BFPT ensures that the bellman backup equation for value functions has a unique solution and an arbitrary initialization and repeatitive application of the operator in the considered vector space will lead to a solution in the vector space.
Note that the bellman backup is primarily conditioned by the discounting factor, reward and transition dynamics to make it a contraction.
Let \(\mathcal{U}\) be a banach space and \(T : \mathcal{U}\rightarrow\mathcal{U}\) be a contraction mapping, then:
The proof will be split into several parts for the sake of development and understanding. We firstly begin with proving a part of statement 2.
Now first we try to develop that every sequence defined by an arbitrary start and a contraction operator on the banach space is convergent and cauchy.
\[\begin{equation}\lVert v^{n+m} - v^{m} \rVert \leq \lVert v^{n+m} - v^{n+m/2} \rVert + \lVert v^{n+m/2} - v^{m} \rVert\end{equation}\]Similarly, further breaking the distance between the points \(v^{n+m}\) and \(v^{m}\) into \(m\) smaller parts we have:
\[\begin{equation}\lVert v^{n+m} - v^{m} \rVert \leq \sum_{k=0}^{m-1} \lVert v^{n+k+1} - v^{n+k} \rVert\end{equation}\]but by definition of the sequence \(\{v^n\}\) that is derived from a contraction \(T\), we have:
\[\begin{equation} \begin{split} v^{n+k+1} &= T^{n+k}{v^{1}} \\ v^{n+k} &= T^{n+k}{v^{0}} \end{split} \end{equation}\]substituting \(3\) in \(2\)
\[\begin{equation} \begin{split} \lVert v^{n+m} - v^{m} \rVert &= \sum_{k=0}^{m-1} \lVert T^{n+k}{v^{1}} - T^{n+k}{v^{0}} \rVert \\ &\leq \sum_{k=0}^{m-1} \lambda^{n+k} \lVert v^{1} - v^{0} \rVert \\ &= \lVert v^{1} - v^{0} \rVert \sum_{k=0}^{m-1} \lambda^{n+k} \\ &= \frac{\lambda^{n}(1-\lambda^m)}{(1-\lambda)} \lVert v^{1} - v^{0} \rVert \end{split} \end{equation}\]as \(n\) and \(m\) become larger, the above upper bound of the contraction in \(4\) becomes smaller and smaller which makes every sequence \(\{v^n\}\) drawn from the banach space convergent (and cauchy).