Reflecting to optimise

Original link: https://magnusross.github.io/posts/reflecting-to-optimise/


This is nothing to be proud of, but I have never really studied optimisation in depth. Oh sure, I know my Adam from my AdaGrad and I even used L-BFGS one time, but when people start talking about dual spaces and convergence for $L^\infty$

Well, today on the blog, I’d like to talk a bit about my optimisation blind spot in the context of an interesting problem I have been working on related to protein binder design. The first way you think to do something is rarely the best; here we’re going to discuss a concrete example of that. If you, like me, are not an optimisation expert, then I hope you’ll learn something useful from this post. If you are, then feel free to have a good old laugh at my ignorance!

Setup

Ok so what’s the setup? Let’s say we have a categorical probability distribution with $k$ categories where the probability of each category is given by the vector $\mathbf{x}\in\mathbb{R}^k$, subject to two constraints:

  1. the probabilities must be normalised: $\sum^{k}_{i=1} x_i = 1$
  2. and, the probabilities must be non-negative: $x_i \geq 0 \ \ \forall i \in \{1, \cdots, k\}$

We have a non-convex function $f$ which takes in our probability vector, and spits out a real number. We want to find the $\mathbf{x}$ that minimises this function, $\mathbf{x}^{*} = \text{argmin}_{\mathbf{x}\in\Delta}f(\mathbf{x})$, where $\Delta$ is the set of vectors satisfying the two constraints above (the probability simplex).
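To make the two constraints concrete, here is a minimal sketch in plain Python of a check that a vector is a valid probability vector, i.e. that it lies on the probability simplex. The helper name `on_simplex` and the tolerance `tol` are my own choices for illustration, not part of the problem statement.

```python
import math

def on_simplex(x, tol=1e-9):
    """Hypothetical helper: True if x is a valid probability vector."""
    # Constraint 2: every entry must be non-negative.
    non_negative = all(xi >= 0.0 for xi in x)
    # Constraint 1: entries must sum to one (up to floating-point error).
    normalised = math.isclose(sum(x), 1.0, abs_tol=tol)
    return non_negative and normalised
```

Note that a vector like `[1.5, -0.5]` sums to one but still fails the check, which is why both constraints are needed.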

Protein related aside

This is a simplified version of the problem of hallucination for de-novo binder design where $\mathbf{x}$ represents a distribution over $k=20$ categories

A first attempt

When seeing a problem like this, where we need to optimise something with constraints, my first instinct is to try and re-parameterise the problem so that I don’t have to worry about them, and we can just use all the “normal” methods. We basically want to rewrite $\mathbf{x}$ as a function of some other parameters with no constraints, then we can just optimise those. In this case, we can write

$$x_i = \text{softmax}(\boldsymbol{\ell})_i = \frac{e^{\ell_i}}{\sum^{k}_{j=1} e^{\ell_j}}$$
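As a rough sketch of what this re-parameterisation looks like in practice, the plain-Python snippet below runs ordinary gradient descent on the unconstrained logits and maps them through the softmax at the end. Everything specific here is an assumption for illustration: the objective `f` is a toy squared distance to a made-up `TARGET` distribution (not the real, non-convex binder-design loss), and the step size and iteration count are arbitrary.

```python
import math

def softmax(logits):
    """Map unconstrained logits to a point on the probability simplex."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy stand-in objective (an assumption, not the real loss):
# squared distance to a fixed, made-up target distribution.
TARGET = [0.7, 0.2, 0.1]

def f(x):
    return sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))

def grad_f(x):
    """Gradient of the toy objective with respect to x."""
    return [2.0 * (xi - ti) for xi, ti in zip(x, TARGET)]

def grad_wrt_logits(logits):
    """Chain rule through softmax: df/dl_j = x_j * (g_j - sum_i x_i * g_i)."""
    x = softmax(logits)
    g = grad_f(x)
    inner = sum(xi * gi for xi, gi in zip(x, g))
    return [xj * (gj - inner) for xj, gj in zip(x, g)]

def minimise(logits, lr=1.0, steps=2000):
    """Plain gradient descent on the logits; no constraint handling needed."""
    for _ in range(steps):
        grad = grad_wrt_logits(logits)
        logits = [l - lr * d for l, d in zip(logits, grad)]
    return softmax(logits)

x_opt = minimise([0.0, 0.0, 0.0])
```

Because the softmax output is always strictly positive and sums to one, every iterate satisfies both constraints automatically, so the optimiser never has to know they exist.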