Safe Shell String Interpolation

Original link: https://samwing.dev/posts/zJonlrXabVoc

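This post discusses shell injection vulnerabilities that arise when building shell commands from user input, using a rainbow greeting banner as the running example. It starts with a naive approach based on string interpolation, which exposes the program to shell injection, and then explores safer alternatives in JavaScript, Python, and Swift. The solutions include using `execFile` instead of `exec` to pass arguments exactly, and using environment variables to hand input to the shell command safely. JavaScript's tagged templates and Swift's `ExpressibleByStringInterpolation` provide native mechanisms for safe interpolation. Python demonstrates a basic environment-variable approach, and the post mentions PEP 750, a proposal for template-style string interpolation. It closes with a highly unsafe Python implementation that manipulates source code to emulate safe interpolation, underscoring the dangers and complexity involved.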

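In a Hacker News thread about safe shell string interpolation, one commenter suggested using `shlex.quote` (or an equivalent) in Python to safely quote arguments when interpolating variables into shell scripts, and advocated environment variables or `printf` as alternatives. They also mentioned Python's proposed template strings (t-strings) as a possible future solution. Another commenter argued against using shells, which were designed for interactive command-line use, for scripting at all, stressing the inherent risk that comes from their assumption of trusted input. They called for stricter languages for scripting and warned that the danger is growing as users generate shell code with LLMs without fully understanding its subtleties. The discussion centers on the vulnerabilities and best practices of safe string handling when interacting with shell commands programmatically.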
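As a rough illustration of the `shlex.quote` suggestion from that thread, here is a minimal sketch. The helper name `quoted_command` and the hostile input are made up for this example, and `shlex.quote` only targets POSIX-style shells.

```python
import shlex


def quoted_command(username: str) -> str:
    # Hypothetical helper: quote the whole argument before pasting it
    # into the command text, so the shell sees it as a single word.
    greeting = shlex.quote(f"Welcome, {username}")
    return f"figlet {greeting} | lolcat -f"


hostile = '"; ps; echo "'
command = quoted_command(hostile)
print(command)
```

The hostile text ends up inside single quotes, so a POSIX shell would pass it to figlet as one literal argument instead of parsing the `;` as a command separator.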

Original article

Let's take the toy problem of getting the user's name and printing out a rainbow greeting banner. An inexperienced developer might solve the problem like this:

import { exec } from "node:child_process"
import { promisify } from "node:util"
const execPromise = promisify(exec)

const username = prompt("Hello, what's your name?")
const banner = (await execPromise(`figlet "Welcome, ${username}" | lolcat -f`)).stdout // security vulnerability!
console.log(banner)

This works as you might expect:

However, you might notice the bug when you run this with untrusted user input!

The constructed command was:

figlet "Welcome, "; ps; echo "" | lolcat -f

This kind of bug makes its way into production surprisingly often, although usually with trusted input. Even with trusted input it can be a problem, because you get strange bugs when your arguments contain spaces or other characters the shell treats specially.
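Both failure modes can be reproduced without running any command. The sketch below is in Python rather than JavaScript. It assumes the hostile input was `"; ps; echo "` (inferred from the constructed command shown above, not stated outright), and it uses `shlex.split` only as an approximation of how a POSIX shell splits words.

```python
import shlex


def naive_command(username: str) -> str:
    # Same idea as the vulnerable example: the input is pasted
    # straight into the command text.
    return f'figlet "Welcome, {username}" | lolcat -f'


# Assumed hostile input, inferred from the constructed command above.
hostile = '"; ps; echo "'
print(naive_command(hostile))

# Trusted input can still misbehave: an unquoted value containing a
# space is split into two separate arguments.
filename = "my notes.txt"
print(shlex.split(f"cat {filename}"))
```

The first print shows the user's quotes closing the string early, and the second shows `cat` receiving `my` and `notes.txt` as two arguments.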

The obvious solution is to use execFile rather than exec and pass the arguments directly to the command, so that no shell ever parses the user input. It looks like this:

import { execFile } from "node:child_process"
import { promisify } from "node:util"
const execFilePromise = promisify(execFile)

const username = prompt("Hello, what's your name?")
// now we have to spawn the two processes and pipe the output of figlet into lolcat manually:
const lolcat = execFilePromise("lolcat", ["-f"])
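const figlet = execFile("figlet", [`Welcome, ${username}`])
// pipe() returns the destination stream, so keep the child process and the pipe call separate
figlet.stdout.pipe(lolcat.child.stdin)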
console.log((await lolcat).stdout)

I think this is a rather ugly solution. An alternative is to continue using shells, but pass the inputs in as environment variables:

import { exec } from "node:child_process"
import { promisify } from "node:util"
const execPromise = promisify(exec)

const username = prompt("Hello, what's your name?")
const banner = (await execPromise('figlet "Welcome, $username" | lolcat -f', { env: { ...process.env, username } })).stdout
console.log(banner)

We get the same, correct, result.

In JavaScript, we can take this one step further! Tagged templates allow you to write functions that receive the arguments to a template literal and return whatever you want. Here's an example implementation of safe interpolation:

import { exec } from "node:child_process"
import { promisify } from "node:util"
const execPromise = promisify(exec)

async function shell(fragments: TemplateStringsArray, ...values: unknown[]) {
  const env = { ...process.env }
  const command = fragments.reduce((constructedCommand, fragment, i) => {
    if (i < values.length) {
      const varName = `_val${i + 1}`
      env[varName] = `${values[i]}`
      return constructedCommand + fragment + `$\{${varName}}`
    }
    return constructedCommand + fragment
  }, "")

  return await execPromise(command, { env })
}

const username = prompt("Hello, what's your name?")
const banner = (await shell`figlet "Welcome, ${username}" | lolcat -f`).stdout
console.log(banner)

This allows you to safely do string interpolation with untrusted user input on shell commands! There are still plenty of footguns depending on the command you're using, like how the env command will start parsing out args if the first arg starts with -S. In general, be careful that the command you're passing untrusted user input to doesn't do unexpected things when given maliciously-crafted input.
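A more common variant of this footgun is a value that starts with `-` and gets read as an option by the receiving command. The sketch below is in Python and uses `grep` purely as an illustration (it is not from the article). It assumes a POSIX `sh`, and that the utility follows the usual convention that `--` ends option parsing. `run_with_env` is a hypothetical helper name.

```python
import os
import subprocess


def run_with_env(command: str, **values: str) -> subprocess.CompletedProcess:
    # Hypothetical helper: values reach the shell only as environment variables.
    return subprocess.run(
        command,
        shell=True,
        text=True,
        capture_output=True,
        env=os.environ | values,
    )


pattern = "-v"  # looks like an option, but is meant to be a search pattern

# The shell never re-parses the value, but grep still sees "-v" as an option.
unsafe = run_with_env(r'printf "a\n-v\n" | grep "$pattern"', pattern=pattern)

# "--" tells grep that everything after it is an operand, not an option.
safe = run_with_env(r'printf "a\n-v\n" | grep -- "$pattern"', pattern=pattern)

print(repr(unsafe.stdout), repr(safe.stdout))
```

Environment variables protect against the shell misreading the input, but not against the command itself misreading it, so the `--` has to be added per command where it is supported.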

Now how about in other languages? In most cases, the best you can do is the environment variable method. Here's an example in Python:

import subprocess
import os


def shell(command, **values):
    return subprocess.run(
        command,
        shell=True,
        check=True,
        text=True,
        capture_output=True,
        env=os.environ | values,
    )


username = input("Hello, what's your name? ")
banner = shell('figlet "Welcome, $username" | lolcat -f', username=username)
print(banner.stdout)
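To check that hostile input stays inert with this approach, here is a small sketch that reuses the same helper. It swaps figlet and lolcat for `printf` so it runs without extra tools, and it assumes a POSIX `sh` and Python 3.9 or later (for `os.environ | values`). The hostile input is an assumption for this example.

```python
import os
import subprocess


def shell(command, **values):
    # Same helper as above: values are passed as environment variables.
    return subprocess.run(
        command,
        shell=True,
        check=True,
        text=True,
        capture_output=True,
        env=os.environ | values,
    )


hostile = '"; ps; echo "'
# printf stands in for figlet here; the quoting of $username is what matters.
result = shell('printf "%s" "Welcome, $username"', username=hostile)
print(result.stdout)
```

The shell expands `$username` after it has already parsed the command, so the quotes and semicolons in the value are printed literally rather than executed.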

There is a proposal, PEP 750, that would enable something like:

shell(t'figlet "Welcome, {username}" | lolcat -f')

It is expected to land in Python 3.14.

Swift, however, is the only language other than JavaScript I have identified that has the equivalent feature today. It allows you to write structs conforming to ExpressibleByStringInterpolation in a similar fashion to JavaScript's tagged templates.

I had not written a line of Swift since I was eleven years old, and I didn't want to go learn Swift for a quick proof-of-concept, so this code is largely LLM-assisted.

import Foundation

struct Shell: ExpressibleByStringInterpolation {
    private var command: String
    private var env: [String: String]
    var result: String

    struct Interpolation: StringInterpolationProtocol {
        var command = ""
        var env: [String: String] = [:]
        private var argCount = 0

        init(literalCapacity: Int, interpolationCount: Int) {}

        mutating func appendLiteral(_ literal: String) {
            command += literal
        }

        mutating func appendInterpolation(_ value: Any) {
            argCount += 1
            let varName = "_val\(argCount)"
            env[varName] = "\(value)"
            command += "${\(varName)}"
        }
    }

    init(stringLiteral value: String) {
        command = value
        env = [:]
        // We can't use the run() function until all of our properties are initialized
        // so we need to set result to "" first.
        result = ""
        // I'm not totally sure how this try? syntax works, the LLM generated it
        // Presumably, it calls run() and yields nil if it throws, then ?? falls back to ""
        result = (try? run()) ?? ""
    }

    init(stringInterpolation: Interpolation) {
        command = stringInterpolation.command
        env = stringInterpolation.env
        result = ""
        result = (try? run()) ?? ""
    }

    private func run() throws -> String {
        let process = Process()
        process.executableURL = URL(fileURLWithPath: "/bin/sh")
        process.arguments = ["-c", command]

        var environment = ProcessInfo.processInfo.environment
        for (key, value) in env {
            environment[key] = value
        }
        process.environment = environment

        // Combining stdout and stderr is different
        // than what I did in the Python and JS implementations.
        // This function is purely LLM-generated (excluding this comment).
        let pipe = Pipe()
        process.standardOutput = pipe
        process.standardError = pipe

        try process.run()
        process.waitUntilExit()

        let data = pipe.fileHandleForReading.readDataToEndOfFile()
        return String(data: data, encoding: .utf8) ?? ""
    }
}

print("What is your name?", terminator: " ")
let username = readLine()
// Swift doesn't have single-quoted strings, so we need to escape the double quotes
// in the figlet command.
let banner: Shell = "figlet \"Welcome, \(username!)\" | lolcat -f"
print(banner.result)

I wanted to see if it was possible to make shell(f'figlet "Welcome, {username}" | lolcat -f') not vulnerable to shell injection. Nobody should ever use this, but I managed to bludgeon it into doing what I wanted. It uses a decorator function that grabs the function's source code, manipulates it with regexes (I suppose you could use the AST for this, but editing code with regex makes the fun even more fun – nobody is putting this in production hopefully), and executes the newly updated code.

import inspect
import os
import re
import subprocess
import uuid


def with_shell(f):
    def wrapper(*args, **kwargs):
        # Horrifying hack within a horrifying hack:
        # Delete the first line of the function source code,
        # which is the decorator itself.
        # This is necessary to avoid infinite recursion.
        # It will break if there's a decorator above this one.
        function_source = "\n".join(inspect.getsource(f).splitlines()[1::])
        pattern = r'shell\(f(["\']{1,3})(.*?)\1\)'
        # We could replace all of the interpolated values with "?" or something,
        # but what if the command itself contains a "?"?
        # Instead, we generate an unpredictable placeholder.
        placeholder = str(uuid.uuid4())

        def replace_shell_invocation(match):
            cmd = match.group(2)
            placeholders = []

            def replace_braces(match):
                placeholders.append(match.group(1))
                return placeholder

            cmd_no_braces = re.sub(r"\{(.+?)\}", replace_braces, cmd)
            placeholders_str = ", ".join(placeholders)
            if placeholders_str:
                return f'shell.internal({repr(cmd_no_braces)}, "{placeholder}", {placeholders_str})'
            return f'shell.internal({repr(cmd_no_braces)}, "{placeholder}")'

        new_code = re.sub(
            pattern, replace_shell_invocation, function_source, flags=re.DOTALL
        )
        exec(compile(new_code, f.__code__.co_filename, "exec"), f.__globals__)
        # We can't use f(*args, **kwargs) because it would call the original function,
        # so we need to eval its name to get a reference to the newly-generated function
        # from the exec.
        return eval(f.__name__)(*args, **kwargs)

    return wrapper


def shell(command: str) -> str:
    """
    Run a shell command and return the output. Prevents shell injection.

    Args:
        command: The shell command to run

    Returns:
        The output of the shell command
    """
    raise NotImplementedError("Make sure you're using the with_shell decorator")


def shell_internal(command, placeholder, *args):
    i = 0
    env = {}
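    while placeholder in command:
        var_name = f"_val{i}"
        # Environment values must be strings.
        env[var_name] = str(args[i])
        # Replace only the first remaining placeholder,
        # so each interpolated value gets its own variable.
        command = command.replace(placeholder, f"${{{var_name}}}", 1)
        i += 1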
    return subprocess.run(
        command,
        shell=True,
        check=True,
        text=True,
        capture_output=True,
        env=os.environ | env,
    )


shell.internal = shell_internal


@with_shell
def main():
    username = input("Hello, what's your name? ")
    banner = shell(f'figlet "Welcome, {username}" | lolcat -f')
    print(banner.stdout)


main()
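To make the rewriting step easier to follow, here is an isolated sketch of just the regex transformation. It uses the same two patterns as the decorator, but a fixed, readable placeholder instead of a UUID. The `rewrite` function and the `PLACEHOLDER` value are names made up for this example.

```python
import re

# Same pattern as in with_shell: matches shell(f'...') or shell(f"...").
PATTERN = r'shell\(f(["\']{1,3})(.*?)\1\)'


def rewrite(source: str, placeholder: str) -> str:
    def replace_call(match):
        names = []

        def replace_braces(brace_match):
            # Remember the interpolated expression and leave a marker behind.
            names.append(brace_match.group(1))
            return placeholder

        cmd = re.sub(r"\{(.+?)\}", replace_braces, match.group(2))
        args = "".join(f", {name}" for name in names)
        return f"shell.internal({cmd!r}, {placeholder!r}{args})"

    return re.sub(PATTERN, replace_call, source, flags=re.DOTALL)


line = """banner = shell(f'figlet "Welcome, {username}" | lolcat -f')"""
print(rewrite(line, "PLACEHOLDER"))
```

The f-string never gets evaluated: the call is rewritten so that `username` is passed as a separate argument, which `shell_internal` then moves into an environment variable.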
