Show HN: Localscope–Limit scope of Python functions for reproducible execution

Original link: https://localscope.readthedocs.io/en/latest/

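Jupyter notebooks are well suited to interactive data analysis and model training, but their interactivity can also introduce subtle bugs. A common one is accidentally using a global variable inside a function without declaring it as a parameter. This "information leakage" makes results unreproducible and turns debugging into a nightmare, especially after a kernel restart. Suppose `sigma` is defined globally and then used inside an `evaluate_mse` function without being passed as an argument. The function works at first, but changing the global `sigma` silently changes the function's output. Localscope helps prevent these problems by restricting a function's scope. With the `@localscope` decorator, a function that accesses an undeclared global variable raises an exception as soon as it is defined. This forces developers to declare every required parameter explicitly, which keeps the function's behaviour predictable and reproducible.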

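Tillahoffmann shared Localscope, a Python package that detects unintended global variable access in functions. This is a common problem in Jupyter notebooks, and catching it matters for ensuring pure functions (such as those JAX requires). Localscope disassembles functions to identify scope violations.

Nine_k highlighted a potential application of the approach: implementing coeffects, as in Hack, by restricting a function's access to objects that produce side effects. This could guarantee specific function behaviour, for example preventing file writes by blocking access to `open()` or limiting it to read-only access.

Dleeftink commented on the need for such a tool, noting how easily out-of-scope variables creep into notebook code and cause reproducibility problems. They asked when reproducible notebook environments will gain built-in variable guards to address this.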
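The disassembly idea mentioned in the discussion can be illustrated with the standard-library `dis` module. The following is only a minimal sketch of the general technique, not Localscope's actual implementation; the helper name `find_global_reads` is hypothetical, and the sketch assumes that treating every builtin as permitted is acceptable.

```python
import builtins
import dis
import types


def find_global_reads(func):
    """Return the non-builtin names that `func` loads from the global scope.

    Hypothetical helper for illustration only; it walks the bytecode of the
    function and of any nested code objects (e.g. generator expressions).
    """
    names = set()
    pending = [func.__code__]
    while pending:
        code = pending.pop()
        for instruction in dis.get_instructions(code):
            # LOAD_GLOBAL is emitted for names that are neither local
            # variables nor captured from an enclosing function.
            if instruction.opname == "LOAD_GLOBAL":
                names.add(instruction.argval)
        # Nested code objects are stored among the constants.
        pending.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    # Assumption: builtins such as `sum` and `zip` are always allowed.
    return {name for name in names if not hasattr(builtins, name)}


def evaluate_mse(xs, ys):  # missing argument sigma
    return sum(((x - y) / sigma) ** 2 for x, y in zip(xs, ys))


print(find_global_reads(evaluate_mse))  # {'sigma'}
```

The function is never called here, so the check works purely on the compiled bytecode and does not need `sigma` to exist.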

Original text

Have you ever hunted bugs caused by accidentally using a global variable in a function in a Jupyter notebook? Have you ever scratched your head because your code broke after restarting the Python kernel? localscope can help by restricting the variables a function can access.

Interactive Python sessions are outstanding tools for analysing data, generating visualisations, and training machine learning models. However, their interactive nature allows global variables to leak into the scope of functions accidentally, leading to unexpected behaviour. For example, suppose you are evaluating the mean squared error between two lists of numbers, scaled by a factor sigma.

>>> sigma = 7
>>> # [other notebook cells and bits of code]
>>> xs = [1, 2, 3]
>>> ys = [4, 5, 6]
>>> mse = sum(((x - y) / sigma) ** 2 for x, y in zip(xs, ys))
>>> mse
0.55102...

Everything works nicely, and you package the code in a function for later use but forget about the scale factor introduced earlier in the notebook.

>>> def evaluate_mse(xs, ys):  # missing argument sigma
...     return sum(((x - y) / sigma) ** 2 for x, y in zip(xs, ys))
>>>
>>> mse = evaluate_mse(xs, ys)
>>> mse
0.55102...

The variable sigma is obtained from the global scope, and the code executes without any issue. But the output is affected by changing the value of sigma.

>>> sigma = 13
>>> evaluate_mse(xs, ys)
0.15976...

This example may seem contrived. But unintended information leakage from the global scope to the local function scope often leads to unreproducible results, hours spent debugging, and many kernel restarts to identify the source of the problem. Localscope fixes this problem by restricting the allowed scope.

>>> from localscope import localscope
>>> @localscope
... def evaluate_mse(xs, ys):  # missing argument sigma
...     return sum(((x - y) / sigma) ** 2 for x, y in zip(xs, ys))
Traceback (most recent call last):
  ...
localscope.LocalscopeException: `sigma` is not a permitted global (file "...",
   line 3, in <genexpr>)
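The natural fix is to make `sigma` an explicit parameter. The sketch below shows the corrected function in plain Python, without the decorator, so that it runs without Localscope installed. Since the function then touches only its arguments and builtins, it should presumably also pass the `@localscope` check, though that is not shown in the text above.

```python
def evaluate_mse(xs, ys, sigma):
    """Mean squared error between xs and ys, scaled by an explicit sigma."""
    return sum(((x - y) / sigma) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3]
ys = [4, 5, 6]

# The scale factor is now visible at every call site, so changing a
# global named `sigma` elsewhere in the notebook cannot alter the result.
mse_7 = evaluate_mse(xs, ys, sigma=7)
mse_13 = evaluate_mse(xs, ys, sigma=13)
print(round(mse_7, 5), round(mse_13, 5))  # 0.55102 0.15976
```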