通过有限优化将 Pong 与音乐同步

通过有限优化将 Pong 与音乐同步
Synchronizing Pong to music with constrained optimization

原始链接: https://victortao.substack.com/p/song-pong

在本文中，作者讨论了基于经典游戏 Pong 创建音乐可视化工具。他们没有使用常规的乒乓球物理原理，而是修改了游戏动态，以允许与歌曲的节拍同步弹跳。目标是让“桨”随着节奏跳舞。作者建议操纵游戏的物理原理，使球以一致的速度移动，从而使球拍能够在屏幕的一半上自由移动。他们保留了某些传统的乒乓球规则，包括由球在球拍上的接触点决定的反射角度，以及球拍没有速度限制。此外，根据正常的乒乓球规则，球会从屏幕的上部和下部反弹。挑战在于确定在音乐的每个节拍期间球拍击球的最佳位置，以最佳地填充屏幕。一种方法是将两个球拍保持在屏幕中心附近，提供有限的水平空间，但由于球能够从顶部和底部边缘反弹，因此提供无限的垂直空间。调整球的水平速度可以控制每次击球的长度。尽管有一个可行的解决方案，但作者认为最终的显示可能会被认为是乏味的。他们认为，良好的可视化包括有效利用可用屏幕空间和动态运动。然而，编写算法来保证屏幕空间的正确使用并满足音乐时序是具有挑战性的，因为需要考虑对即将到来的动作的潜在影响。为了解决这个问题，作者建议将任务视为约束优化问题，利用现有方法寻找最优解决方案，而不是开发自定义算法。关键要素包括设定明确的目标（最大化屏幕使用）以及限制，例如遵守音乐节奏和游戏物理原理。使用这种方法，他们希望在创作中在兴奋感和功能性之间取得平衡。此外，作者承认实施二维约束可能很困难，建议仅关注水平分量作为合理的简化。由于球的垂直位置可以根据其水平运动来计算，因此解决问题的重点是在给定屏幕尺寸和球速度等特定约束的情况下为每个桨球交互找到理想的水平位置和速度。作者的结论是

您提供了有关当前项目的描述，其中涉及与音乐同步的几何图案。这是一个简短的解释：您的项目使用 React Native SVG 创建几何形状，例如玫瑰、螺旋、吸引子、曲线和和声图。您可以手动将这些几何图形与使用 Suno AI 创建的音乐曲目同步。虽然仍处于实验阶段，您的目标是在现场表演期间创建实时可视化。您计划利用 Spleeter 和 Librosa 等 Python 库来提高自动化程度。但是，代码和生成的应用程序当前不可用。此外，您提到使用 TypeScript 开发 Maurer Rose 函数，该函数根据三角方程和比例生成路径。这种方法可能不适合长期项目，因为它依赖于 setInterval 进行更新。最终，您的目标是发布一个独立的应用程序，用户可以在其中交互地修改各种几何图案的参数。另一方面，您发现《Suno》的作者陶涛也在同一家公司工作。

原文

Lately I’ve been experimenting with music visualizers. One of my favorites is inspired by the classic arcade game pong. In classic pong a ball bounces off of paddles in a steady rhythm. What if we synchronize the bounces to the beat of a song, making the paddles dance?

To make this possible we alter the physics of the game so that the ball moves at a constant speed, and paddles can move anywhere on their respective halves of the screen.

We also retain these rules from the classic game:

The contact point of the ball on the paddle determines its angle of reflection
Paddles have no speed limit
The ball bounces off the top and bottom of the screen

These physics give us the necessary degrees of freedom to move the paddles to hit the ball at any desired times.

One simple strategy to satisfy any timing requirements is to keep both paddles close to the center. This gives us little horizontal space, but vertical space is effectively infinite since the ball can bounce off the top and bottom edges of the screen. For any desired shot duration, we can slow the horizontal velocity to match it by hitting the ball more vertically. While this argument shows a solution exists for any input, it certainly isn't exciting to watch.

What makes a visualization appealing, though? One desirable property is the utilization of screen space; a game stuck in a small region feels cramped and underwhelming. Closely related is dynamism of movement; spectators enjoy watching players make heroic efforts to hit a ball just within reach.

Unfortunately it's difficult to write an algorithm to maximize screen usage while guaranteeing a valid game. If a paddle hits the ball far from the screen's center, the ball may not have time to cross the screen for the next beat. Any movement of the paddles must account for the potential impact on future moves. This is the crux of the problem: while respecting the beat of the song and the physics of the game, where should the paddles hit the ball on each beat to maximize screen utilization?

Fortunately there is a mature field dedicated to optimizing an objective (screen utilization) with respect to variables (the locations of bounces) in the presence of constraints on those variables (physics and the beats of the song). If we write our requirements as a constrained optimization problem, we can use an off-the-shelf solver to compute optimal paddle positions instead of designing an algorithm ourselves.

Formulating the task as a constrained optimization problem has several advantages. If we change the physics we simply need to update our constraints, while an algorithm may need to be rewritten. In a similar vein, it lets us easily experiment with objectives. Developing appealing animations is an iterative process, which the declarative style of constrained optimization facilitates.

That sounds great, but modeling 2D constraints is hard; is our game truly two-dimensional? It turns out modeling only the horizontal component is sufficient! In our ruleset, the ball moves at a constant speed, so its vertical velocity is exactly determined by its horizontal velocity. By simulating the game we can compute the ball's vertical position at any time.

Knowing the ball's vertical position also gives us the vertical position of the paddles since it must match the ball's position to hit it (plus a small delta to achieve the desired angle). Between hits, we simply linearly interpolate positions to achieve smooth movement. Thus we only need our solver to compute the horizontal positions and velocities at the times when the paddles hit the ball; the rest follows from the physics of the game.

Let's start by defining the hardcoded parameters of the game:

\(\begin{align} &\text{Let } W \in \mathbb{R}^+ \text{ be the width of the screen} \\ &\text{Let } S \in \mathbb{R}^+ \text{ be the speed of the ball} \end{align}\)

Next we have the beat times we’d like to synchronize. Each t represents the time when the ball should hit a paddle. We obtain these times from MIDI files, though in the future I’d like to explore more automated ways of extracting them from audio.

\(\text{Let } T = {t_0, t_1, \ldots, t_{n}} \text{ be the time of each beat}\)

We take the differences between adjacent times to get the duration the ball should travel each shot. Each d represents the time interval between two consecutive hits of the ball.

\(\begin{align*} \text{Let } D &= {d_0, d_1, \ldots, d_{n-1}} \text{ be the duration of each shot}\\ d_i &= t_{i+1} - t_{i} \quad \text{for } i = 0,1, 2, \ldots, n-1 \\ \end{align*}\)

These are all the fixed inputs we provide to the solver. Now we’ll define the variables we’d like to optimize and their relationships to other quantities.

Remember we only need to optimize horizontal positions and velocities.

\(\text{Let } P = {p_0, p_1, \ldots, p_{n-1}} \text{ be the horizontal distance of a paddle from the center}\)

Each p_i represents the horizontal distance of the paddle from the center when it hits the ball for the i-th time. Even indices (p_0, p_2, …) represent the left paddle, while odd indices (p_1, p_3, …) represent the right paddle.

\(\text{Let } V = {v_0, v_1, \ldots, v_{n-1}} \text{ be the horizontal velocity of the ball}\)

The set of ball horizontal velocities. Each v_i represents the horizontal velocity of the ball after the i-th hit. To make it easier to construct constraints, we define these to always be positive no matter if the ball is moving left or right.

First we constrain paddles to stay on their respective halves of the screen:

\(0 \leq p_i \leq \frac{W}{2}\)

We defined p as the distance from the center so it’s nonnegative no matter which side of the screen it’s on.

Next we constrain the magnitude of the ball's horizontal velocity to be positive and not exceeding its total speed S.

These two constraints fully define the physics of our game. We add one final constraint to enforce beat synchronization:

\(p_{i-1} + p_i = d_i v_i\)

This ensures that the ball reaches the next paddle at the correct time. The left hand side represents the total horizontal distance traveled (sum of distances from center for two consecutive paddle hits), which must equal the product of the duration of the shot and the ball's horizontal velocity.

We know there is a degenerate solution in which the paddles are stuck in the center. A natural thought is to encourage the opposite; maximize the distance from the center.

Our objective, then, is to maximize the sum of the paddle’s distances from the center.

\(\text{Maximize} \sum_{i=1}^n p_i\)

With our mathematical formulation in hand, all that’s left is to solve it. All of our constraints are linear, so any linear programming (LP) solver will suffice. I chose CVXPY for its simplicity and generality. CVXPY solves convex optimization problems, of which LP problems are a subset. While we don’t require its full feature set for this task, the flexibility to support more complex objectives and constraints facilitates creative exploration.

With CVXPY, we can express the constraints enumerated above as follows:

Sure enough, the code closely matches the formulas. I actually find the code easier to follow than the formulas due to the descriptive variable names.

The solver returns the horizontal positions where the paddles should hit the ball as well as the horizontal velocities of the ball, from which we can easily compute the angles of reflection. This is enough to compute the vertical positions via simulation.

To animate the final result, we use the positions of the ball and paddles at the time of each hit as keyframes. Between hits we interpolate the positions to create a smooth animation.

Visualization is an extremely open-ended problem, and ultimately the only judges are our own eyes. More often than not the results depend on which algorithms we know. I suspect there are many more visualizers powered by constrained optimization waiting to be discovered; I’d love to hear about them!

The code is open source: Github Repo

Thanks to Pat O’Neill for reading drafts of this post.

通过有限优化将 Pong 与音乐同步 Synchronizing Pong to music with constrained optimization

通过有限优化将 Pong 与音乐同步
Synchronizing Pong to music with constrained optimization