I know they mainly present results on deep learning/neural network training and optimization, but I wonder how easy it would be to use the same optimization framework for other classes of hard or large optimization problems. I was also curious about this when I first saw posts about Extropic (https://www.extropic.ai/).

I tried looking for public info on their website about APIs or the software stack, to see what's possible beyond NN stuff for modeling other optimization problems; it looks like that's not shared publicly yet. There are certainly many NP-hard and large combinatorial or analytical optimization problems out there that would be worth tackling with new technology. Personally, I care about problems in EDA and semiconductor design. Adiabatic quantum computing was one technology that promised to solve optimization problems (and quantum computing is still playing out, with only small-scale solutions at the moment). I'm hoping these new "thermodynamic computing" startups will also provide some cool technology to explore these problems with.
Leveraging thermodynamics to compute second-order updates more efficiently is certainly cool and worth exploring, but specifically in the context of deep learning I remain skeptical of its usefulness.

We already have very efficient second-order methods that run on classical hardware [1], but they are basically not used at all in practice, since they are outperformed by Adam and other first-order methods. This is because optimizing highly nonlinear loss functions, such as the ones in deep learning models, only really works with very low learning rates, regardless of whether a first- or second-order method is used. So, comparatively speaking, a second-order method might give you a slightly better parameter update per step, but at a more-than-slightly-higher cost, so most of the time it's simply not worth doing.

[1] https://andrew.gibiansky.com/blog/machine-learning/hessian-f...
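To make the cost argument concrete, here's a rough toy sketch (mine, not code from the linked post or the paper) of the kind of Hessian-free update [1] describes, next to a plain gradient step. The tiny model, data, damping constant, and CG iteration count are all made-up illustrative choices; the point is just that each second-order step spends several extra Hessian-vector products (each roughly a forward+backward pass) on top of the one gradient evaluation you'd need anyway:

```python
# Toy sketch only (not the paper's algorithm): where the extra cost of a
# Hessian-free second-order step goes, compared with a first-order step.
# Model, data, damping, and iteration counts are illustrative assumptions.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (64, 8))
y = jnp.sin(x[:, :1])
theta = 0.1 * jax.random.normal(key, (8 * 16 + 16,))  # flat parameter vector

def loss(t):
    w1 = t[:8 * 16].reshape(8, 16)
    w2 = t[8 * 16:].reshape(16, 1)
    return jnp.mean((jnp.tanh(x @ w1) @ w2 - y) ** 2)

grad = jax.grad(loss)

def hvp(t, v):
    # Hessian-vector product via forward-over-reverse autodiff:
    # roughly one extra forward+backward pass per product, no explicit Hessian.
    return jax.jvp(grad, (t,), (v,))[1]

def cg(matvec, b, iters=10):
    # Plain conjugate gradient; every iteration costs one Hessian-vector product.
    sol, r = jnp.zeros_like(b), b
    p, rs = r, r @ r
    for _ in range(iters):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        sol, r = sol + alpha * p, r - alpha * Ap
        rs_new = r @ r
        p, rs = r + (rs_new / rs) * p, rs_new
    return sol

for step in range(100):
    g = grad(theta)                     # a first-order step would stop here
    # Second-order step: same gradient, plus ~10 Hessian-vector products to
    # approximately solve (H + damping * I) d = g.
    d = cg(lambda v: hvp(theta, v) + 1e-2 * v, g)
    theta = theta - 0.1 * d

print("final loss:", float(loss(theta)))
```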
Sounds great until

> requires an analog thermodynamic computer

Wait. What? Perhaps a trained physicist can comment on that. Thanks.
The whole point is to leverage the laws of nature to train AI models, overcoming the limitations and scaling challenges of digital hardware and existing training methods.
I didn't realize they included details about the hardware. Like you said, these just look like analog computers (compute-in-memory, analog arrays), which have also made a resurgence with deep learning.
I believe one example would be quantum annealers, where "programming" involves setting the right initial conditions and letting thermodynamics bring you to an optimum via relaxation.
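As a purely classical analogue of that "set it up and let it relax" idea, here's a tiny simulated-annealing run on a random Ising-style problem. The couplings, sizes, and temperature schedule are made up for illustration, and a real annealer would encode the problem in hardware couplings rather than in software:

```python
# Classical analogue only: simulated annealing on a random Ising energy,
# illustrating "set initial conditions, then relax to a low-energy state".
# Problem size, couplings, and the temperature schedule are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
n = 30
J = rng.normal(size=(n, n))
J = (J + J.T) / 2                 # symmetric couplings define the energy landscape
np.fill_diagonal(J, 0.0)

def energy(s):
    return -0.5 * s @ J @ s       # Ising energy E(s) = -1/2 * sum_ij J_ij s_i s_j

s = rng.choice([-1, 1], size=n)   # the "initial conditions"
for T in np.geomspace(2.0, 0.01, 5000):   # slowly lower the temperature
    i = rng.integers(n)
    dE = 2 * s[i] * (J[i] @ s)    # energy change from flipping spin i
    if dE < 0 or rng.random() < np.exp(-dE / T):
        s[i] = -s[i]              # accept downhill moves, occasionally uphill ones

print("final energy:", energy(s))
```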
Analog computers have a lot of history. You can Google "analog" together with "neural network" or "differential equations" to get many results. They are fast with low power, but can have precision issues and require custom chip design.

https://en.m.wikipedia.org/wiki/Analog_computer

Mixed-signal ASICs often use a mix of digital and analog blocks to get the benefits of analog. It's especially helpful for anything that eats a lot of power, or where you need to keep power down (e.g. mobile).
∇̃L(θ) = F⁻¹∇L(θ)
which requires solving a linear system. For this, you can use the methods from the author's previous paper [Thermodynamic Linear Algebra](https://arxiv.org/abs/2308.05660).
Since it's hard to implement a full neural network on a thermodynamic computer, the paper suggests running one alongside a normal GPU. The GPU computes F and ∇L(θ), but offloads the linear system to the thermodynamic computer, which runs in parallel with the digital hardware (Figure 1).
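To illustrate that split, here's a rough sketch of the hybrid loop as I read Figure 1 (my own toy code, not the paper's TNGD implementation): the digital side produces the Fisher matrix and the gradient, and the solve F d = ∇L(θ) is handed to a stand-in solver where the thermodynamic hardware would go. The tiny linear model, the empirical-Fisher approximation, the damping, and the learning rate are all assumptions for illustration:

```python
# Hybrid natural-gradient sketch (toy stand-in, not the paper's TNGD code).
# "Digital" work: gradient and Fisher matrix. "Offloaded" work: the linear
# solve F d = grad, done here with a dense solver in place of the analog device.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (256, 8))
y = jnp.sin(x[:, :1])
theta = 0.1 * jax.random.normal(key, (8,))   # tiny linear "network", for illustration

def loss(t):
    return jnp.mean((x @ t[:, None] - y) ** 2)

def empirical_fisher(t):
    # Empirical Fisher: average of per-example gradient outer products.
    per_ex_loss = lambda t, i: jnp.mean((x[i] @ t - y[i]) ** 2)
    g = jax.vmap(jax.grad(per_ex_loss), in_axes=(None, 0))(t, jnp.arange(x.shape[0]))
    return g.T @ g / x.shape[0]

for step in range(200):
    grad = jax.grad(loss)(theta)             # digital: gradient
    F = empirical_fisher(theta)              # digital: Fisher information matrix
    # Offloaded step: solve F d = grad. On the proposed hardware this would be
    # the physical relaxation; here a damped dense solve is the placeholder.
    d = jnp.linalg.solve(F + 1e-3 * jnp.eye(F.shape[0]), grad)
    theta = theta - 0.1 * d                  # natural-gradient update

print("final loss:", float(loss(theta)))
```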
It is important to note that the "Runtime vs Accuracy" plot in Figure 3 uses a "timing model" for the TNGD algorithm, since the computer necessary to run the algorithm doesn't exist yet.