Connection between KL Divergence and Fisher Information Matrix
This section is intended as a guide to reading this more effectively
The connection between the KL Divergence / Relative Entropy of a pair of PDFs and the Fisher Information Matrix of a PDF becomes clear when we focus on the convergence of the two PDFs
Let's take a pair of PDFs $p(\cdot), q(\cdot)$ and assume they belong to the same family $f(\cdot, \theta)$, so they differ only in their parameterizations $\theta_{0}, \theta_{1}$
Let's consider the case where the two PDFs are very similar, which we can express formally as $\theta_{0} - \theta_{1} \rightarrow 0$
In this case, let's take $\theta_{0}$ as the reference, so $\theta_{1} \rightarrow \theta_{0}$, and let's change the notation slightly by writing the divergence as a function of the second argument alone: $D_{KL, \theta_{0}}(\theta)$
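For concreteness, one natural reading of this notation (assuming $\theta_{0}$ indexes the fixed reference distribution, i.e. the first argument of the divergence) is:

$$
D_{KL, \theta_{0}}(\theta) = D_{KL}\big(f(\cdot, \theta_{0}) \,\|\, f(\cdot, \theta)\big) = \int f(x, \theta_{0}) \log \frac{f(x, \theta_{0})}{f(x, \theta)} \, dx
$$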
NOTE: we can't simply express this as a function of $\Delta \theta = |\theta_{0} - \theta_{1}|$ because the KL Divergence is not symmetric
So $D_{KL, \theta_{0}}(\theta)$ is in general a nonlinear function of $\theta$, but since we are interested in the limit $\theta \rightarrow \theta_{0}$, we can approximate it with a Taylor series expansion around $\theta_{0}$
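Written out to second order, with $\Delta\theta = \theta - \theta_{0}$ and $H$ denoting the Hessian of $D_{KL, \theta_{0}}$ at $\theta_{0}$, the expansion reads:

$$
D_{KL, \theta_{0}}(\theta) \approx D_{KL, \theta_{0}}(\theta_{0}) + \nabla_{\theta} D_{KL, \theta_{0}}(\theta_{0})^{T} \Delta\theta + \frac{1}{2} \Delta\theta^{T} H \, \Delta\theta
$$

The zeroth-order term vanishes because the divergence of a distribution from itself is zero, and the gradient term vanishes because $\theta_{0}$ is a minimum of the divergence, so only the quadratic term survives.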
NOTE: In this limit the KL Divergence is expected to become more and more symmetric, which makes the arbitrary choice of the reference point less and less relevant
It can be shown that the second-order term of the KL Divergence expansion in this limit is governed by the Fisher Information Matrix of the family $f(\cdot, \theta)$
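Explicitly, with $F(\theta_{0})$ denoting the Fisher Information Matrix of the family evaluated at $\theta_{0}$:

$$
F(\theta_{0}) = \mathbb{E}_{x \sim f(\cdot, \theta_{0})}\!\left[ \nabla_{\theta} \log f(x, \theta) \, \nabla_{\theta} \log f(x, \theta)^{T} \right]\Big|_{\theta = \theta_{0}},
\qquad
D_{KL, \theta_{0}}(\theta) \approx \frac{1}{2} (\theta - \theta_{0})^{T} F(\theta_{0}) (\theta - \theta_{0})
$$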
As a result, we can interpret the Fisher Information Matrix as the Hessian, or curvature, of the KL Divergence / Relative Entropy of the two PDFs in the limit where they are very close to each other
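As a quick sanity check of this result, here is a minimal numerical sketch (not part of the original note) using a 1D Gaussian family parameterized by $\theta = (\mu, \sigma)$, whose exact KL Divergence and Fisher Information Matrix are both known in closed form; for a small parameter perturbation the exact divergence and the quadratic Fisher approximation should nearly coincide.

```python
import numpy as np

def kl_gaussian(mu0, sigma0, mu1, sigma1):
    """Exact KL( N(mu0, sigma0^2) || N(mu1, sigma1^2) ) in nats."""
    return (np.log(sigma1 / sigma0)
            + (sigma0**2 + (mu0 - mu1)**2) / (2 * sigma1**2)
            - 0.5)

def fisher_gaussian(sigma):
    """Fisher Information Matrix of N(mu, sigma^2) w.r.t. theta = (mu, sigma)."""
    return np.array([[1.0 / sigma**2, 0.0],
                     [0.0, 2.0 / sigma**2]])

theta0 = np.array([0.0, 1.0])    # reference parameters (mu, sigma)
theta1 = np.array([0.01, 1.02])  # nearby parameters
dtheta = theta1 - theta0

exact = kl_gaussian(theta0[0], theta0[1], theta1[0], theta1[1])
quad = 0.5 * dtheta @ fisher_gaussian(theta0[1]) @ dtheta

print(f"exact KL            : {exact:.6e}")  # ~4.3e-04
print(f"1/2 d^T F d approx. : {quad:.6e}")   # ~4.5e-04
```

The two values agree to leading order, and the agreement improves as the perturbation shrinks, which is exactly the statement above.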