Reflecting to optimise

Original link: https://magnusross.github.io/posts/reflecting-to-optimise/

This is nothing to be proud of, but I have never really studied optimisation in depth. Oh sure, I know my Adam from my AdaGrad and I even used L-BFGS one time, but when people start talking about dual spaces and convergence for $L^\infty$

Well, today on the blog, I’d like to talk a bit about my optimisation blind spot in the context of an interesting problem I have been working on related to protein binder design. The first way you think to do something is rarely the best; here we’re going to discuss a concrete example of that. If you, like me, are not an optimisation expert, then I hope you’ll learn something useful from this post. If you are, then feel free to have a good old laugh at my ignorance!

Setup

Ok so what’s the setup? Let’s say we have a categorical probability distribution with $k$

the probabilities must be normalised: $\sum^{k}_{i=1} x_i = 1$
and the probabilities must be non-negative: $x_i \geq 0 \ \ \forall i \in \{1, \ldots, k\}$
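
To make the two constraints concrete, here is a minimal sketch of a check that a vector is a valid categorical distribution. The helper name `is_on_simplex` and the tolerance `tol` are my own illustrative choices, not something from a library.

```python
def is_on_simplex(x, tol=1e-9):
    """Return True if x is a valid categorical distribution (hypothetical helper)."""
    # Constraint 1: the entries sum to 1, up to floating point tolerance.
    normalised = abs(sum(x) - 1.0) <= tol
    # Constraint 2: every entry is non-negative.
    non_negative = all(xi >= 0.0 for xi in x)
    return normalised and non_negative


print(is_on_simplex([0.2, 0.3, 0.5]))   # satisfies both constraints
print(is_on_simplex([0.6, 0.6]))        # non-negative, but sums to 1.2
print(is_on_simplex([1.5, -0.5]))       # sums to 1, but has a negative entry
```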

We have a non-convex function $f$

Protein related aside

This is a simplified version of the problem of hallucination for de-novo binder design where $\mathbf{x}$

A first attempt

When seeing a problem like this, where we need to optimise something with constraints, my first instinct is to try and re-parameterise the problem so that I don’t have to worry about them, and we can just use all the “normal” methods. We basically want to rewrite $\mathbf{x}$

$$x_i = \text{softmax}(\boldsymbol{\ell})_i = \frac{e^{\ell_i}}{\sum_{j=1}^{k} e^{\ell_j}}$$
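
As a quick sanity check on the softmax reparameterisation above, here is a minimal pure-Python sketch. Whatever real numbers we pick for the logits, the output sums to one and has no negative entries. The max-subtraction is a standard numerical stability trick that I have added, and the example logits are arbitrary.

```python
import math


def softmax(logits):
    """Map unconstrained real logits to a normalised, non-negative vector."""
    # Subtracting the max does not change the result mathematically,
    # but it stops exp() from overflowing on large logits.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, -1.0, 0.5]  # arbitrary unconstrained values
x = softmax(logits)

print(x)
print(sum(x))                       # 1 up to floating point error
print(all(xi >= 0.0 for xi in x))   # no negative entries
```

The point of the design is that an unconstrained optimiser can now update the logits freely, while the softmax output always satisfies both constraints.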