This package shows how to multiply the inverse of the Hessian of a deep network
with a vector. Pearlmutter showed a clever way to compute the Hessian-vector
product for a deep net. By contrast, the paper and code in this repo show how
to compute the Hessian-inverse product: the product of the inverse of the
Hessian of a deep net with a vector.
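
As a rough illustration of what a matrix-free Hessian-vector product looks like, the sketch below works through a toy smooth function rather than a deep net. It is not the code from this repo and not Pearlmutter's general procedure: the directional derivative of the gradient is derived by hand for this one function, and the names `A`, `lam`, `gradient`, `hessian_vector_product`, and `dense_hessian` are hypothetical. The dense Hessian is built only to check the result.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 5))  # toy data matrix (hypothetical)
lam = 0.1                    # small ridge term (hypothetical)
w = rng.normal(size=5)       # point at which the Hessian is evaluated
v = rng.normal(size=5)       # vector to multiply by the Hessian


def gradient(w):
    # Gradient of f(w) = sum(log(cosh(A @ w))) + 0.5 * lam * ||w||^2.
    return A.T @ np.tanh(A @ w) + lam * w


def hessian_vector_product(w, v):
    # Directional derivative of gradient(w) along v, derived by hand.
    # The derivative of tanh(z) is 1 - tanh(z)^2. No Hessian is formed.
    z = A @ w
    return A.T @ ((1.0 - np.tanh(z) ** 2) * (A @ v)) + lam * v


def dense_hessian(w):
    # Explicit n-by-n Hessian, built here only as a reference.
    z = A @ w
    return A.T @ np.diag(1.0 - np.tanh(z) ** 2) @ A + lam * np.eye(len(w))


hv = hessian_vector_product(w, v)
hvp_error = np.max(np.abs(hv - dense_hessian(w) @ v))

# Independent check: central finite difference of the gradient along v.
eps = 1e-5
hv_fd = (gradient(w + eps * v) - gradient(w - eps * v)) / (2 * eps)
fd_error = np.max(np.abs(hv - hv_fd))
```

For this toy function the matrix-free product needs only a few matrix-vector products with `A`, while the dense check stores every entry of the Hessian.
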
Computing the Hessian-inverse product amounts to solving a linear system whose
matrix is the Hessian. Solving this system naively requires a number of
operations that scales cubically with the number of parameters in the deep net,
which is impractical for most modern networks. The trick is to augment the
system of equations.
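
To make the naive baseline concrete, here is a minimal NumPy sketch that forms a dense matrix and solves the system directly. The matrix `H` is a random symmetric positive definite stand-in, not the Hessian of a real network, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_params = 200  # toy parameter count (hypothetical)

# Stand-in for a Hessian: symmetric positive definite by construction.
B = rng.normal(size=(n_params, n_params))
H = B @ B.T + n_params * np.eye(n_params)
v = rng.normal(size=n_params)

# Dense solve of H x = v. The factorization inside takes on the order of
# n_params**3 operations and needs all n_params**2 entries of H in memory.
x = np.linalg.solve(H, v)

residual = np.max(np.abs(H @ x - v))
```

At this size the solve is instant, but the cost grows quickly: for a network with a million parameters, the dense Hessian alone would hold 10^12 entries, about 8 TB in double precision.
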
The full idea is described in this paper. For a demo, see demo_hessian.ipynb. For a look at how the algorithm is implemented, see the hessian_inverse_product function.
The algorithm relies heavily on operations on hierarchically nested, structured block matrices. For example, it makes use of partitioned matrices whose blocks are block-diagonal, and tri-diagonal matrices whose blocks are partitioned matrices. The code includes a library to manipulate such matrices in block_partitioned_matrices.py. See its tutorial.
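
To give a feel for why block structure helps, the sketch below solves a block tri-diagonal system one block at a time (block forward elimination followed by back substitution). It is a generic illustration and does not use or reproduce the API of block_partitioned_matrices.py: the function names are hypothetical, and the blocks here are plain dense arrays rather than nested structured matrices.

```python
import numpy as np


def solve_block_tridiagonal(lower, diag, upper, rhs):
    # Block row i reads: lower[i-1] @ x[i-1] + diag[i] @ x[i] + upper[i] @ x[i+1] = rhs[i].
    n = len(diag)
    diag = [d.copy() for d in diag]
    rhs = [b.copy() for b in rhs]
    # Forward elimination: remove the sub-diagonal blocks one row at a time.
    for i in range(1, n):
        # w = lower[i-1] @ inv(diag[i-1]), computed without an explicit inverse.
        w = np.linalg.solve(diag[i - 1].T, lower[i - 1].T).T
        diag[i] = diag[i] - w @ upper[i - 1]
        rhs[i] = rhs[i] - w @ rhs[i - 1]
    # Back substitution, from the last block row up to the first.
    x = [None] * n
    x[-1] = np.linalg.solve(diag[-1], rhs[-1])
    for i in range(n - 2, -1, -1):
        x[i] = np.linalg.solve(diag[i], rhs[i] - upper[i] @ x[i + 1])
    return x


def assemble_dense(lower, diag, upper):
    # Build the full matrix, used here only to check the block solver.
    n, k = len(diag), diag[0].shape[0]
    dense = np.zeros((n * k, n * k))
    for i in range(n):
        dense[i * k:(i + 1) * k, i * k:(i + 1) * k] = diag[i]
    for i in range(n - 1):
        dense[(i + 1) * k:(i + 2) * k, i * k:(i + 1) * k] = lower[i]
        dense[i * k:(i + 1) * k, (i + 1) * k:(i + 2) * k] = upper[i]
    return dense


rng = np.random.default_rng(0)
n_blocks, k = 5, 3
# Strong diagonal blocks keep elimination without pivoting well behaved.
diag = [rng.normal(size=(k, k)) + 10 * np.eye(k) for _ in range(n_blocks)]
lower = [0.5 * rng.normal(size=(k, k)) for _ in range(n_blocks - 1)]
upper = [0.5 * rng.normal(size=(k, k)) for _ in range(n_blocks - 1)]
rhs = [rng.normal(size=k) for _ in range(n_blocks)]

x_blocks = solve_block_tridiagonal(lower, diag, upper, rhs)
x_dense = np.linalg.solve(assemble_dense(lower, diag, upper), np.concatenate(rhs))
max_error = np.max(np.abs(np.concatenate(x_blocks) - x_dense))
```

The block solver does a fixed amount of work per block row, so its cost grows linearly with the number of blocks, whereas the dense solve grows cubically with the total matrix size.
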