curl-impersonate: A special build of curl that can impersonate Chrome, Firefox and the other major browsers

Original link: https://github.com/lwthiker/curl-impersonate

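`curl-impersonate` is a special build of `curl` that performs the TLS and HTTP handshakes of the major browsers (Chrome, Edge, Safari, Firefox), which lets it get past fingerprinting. It does this by building `curl` against browser-specific TLS libraries (NSS for Firefox, BoringSSL for Chrome), configuring TLS extensions, and adjusting HTTP/2 settings. It can be used as a command line tool or as a library, which provides the `curl_easy_impersonate` function for setting browser-like options. Each browser has a wrapper script with preconfigured headers and flags, all of which can be customized. Pre-compiled binaries and Docker images are available, but they require dependencies such as `nss` and `ca-certificates`. The tool impersonates specific browser versions, whose configurations are listed in `browsers.json`. Its main use case is programmatic access to websites that actively block non-browser clients by detecting differences in the TLS and HTTP/2 handshakes.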

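This Hacker News thread discusses a special build of curl that tries to imitate the Chrome and Firefox browsers. One commenter points out that Chrome uses APIs to validate requests, which makes it hard for a curl build to truly imitate a real Chrome browser, and suggests that a headless Chrome driven by curl-style arguments would be the ideal solution. Another commenter hopes the Ladybird browser gains more traction: it currently uses standard cURL for networking, which could eventually eliminate browser fingerprinting because legitimate traffic would share cURL's fingerprint. A follow-up comment suggests that Ladybird's use of cURL could improve cURL itself by highlighting features missing from real-world browser workflows, such as WebSockets over h2. The discussion also points to an earlier Show HN submission from 2022, where a similar impersonation tool focused only on Firefox.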

Original text

A special build of curl that can impersonate the four major browsers: Chrome, Edge, Safari & Firefox. curl-impersonate is able to perform TLS and HTTP handshakes that are identical to those of a real browser.

curl-impersonate can be used either as a command line tool, similar to the regular curl, or as a library that can be integrated instead of the regular libcurl. See Usage below.

When you use an HTTP client with a TLS website, it first performs a TLS handshake. The first message of that handshake is called Client Hello. The Client Hello message that most HTTP clients and libraries produce differs drastically from that of a real browser.

If the server uses HTTP/2, then in addition to the TLS handshake there is also an HTTP/2 handshake where various settings are exchanged. The settings that most HTTP clients and libraries use also differ from those of any real browser.

For these reasons, some web services use the TLS and HTTP handshakes to fingerprint which client is accessing them, and then present different content for different clients. These methods are known as TLS fingerprinting and HTTP/2 fingerprinting respectively. Their widespread use has led to the web becoming less open, less private and much more restrictive towards specific web clients.

With the modified curl in this repository, the TLS and HTTP handshakes look exactly like those of a real browser.

To make this work, curl was patched significantly to resemble a browser. Specifically, the following modifications were needed:

  • Compiling curl with nss, the TLS library that Firefox uses, instead of OpenSSL. For the Chrome version, compiling with BoringSSL, Google's TLS library.
  • Modifying the way curl configures various TLS extensions and SSL options.
  • Adding support for new TLS extensions.
  • Changing the settings that curl uses for its HTTP/2 connections.
  • Running curl with some non-default flags, for example --ciphers, --curves and some -H headers.

The resulting curl looks, from a network perspective, identical to a real browser.

Read the full technical description in the blog posts: part a, part b.

The following browsers can be impersonated.

This list is also available in the browsers.json file.

For each supported browser there is a wrapper script that launches curl-impersonate with all the needed headers and flags. For example:

curl_chrome116 https://www.wikipedia.org

You can add command line flags and they will be passed on to curl. However, some flags change curl's TLS signature which may cause it to be detected.

Please note that the wrapper scripts use a default set of HTTP headers. If you want to change these headers, you may want to modify the wrapper scripts to fit your own purpose.
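
A custom wrapper along those lines might look like the following sketch. It is not one of the project's own wrapper scripts: the binary name `curl-impersonate-chrome`, the function name `my_browser_curl`, and the cipher, curve and header values are placeholders chosen for illustration, and a real wrapper carries a much longer, browser-specific list. Only standard curl flags are used.

```shell
#!/bin/sh
# Sketch of a custom wrapper. All values below are illustrative placeholders,
# not the settings of any real browser.

# Assumed binary name; override with CURL_BIN if yours differs.
CURL_BIN="${CURL_BIN:-curl-impersonate-chrome}"

my_browser_curl() {
    "$CURL_BIN" \
        --ciphers 'TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384' \
        --curves 'X25519:P-256:P-384' \
        -H 'Accept: text/html,application/xhtml+xml' \
        -H 'Accept-Language: en-US,en;q=0.9' \
        --http2 --compressed \
        "$@"   # extra flags and the URL are passed through unchanged
}

# Usage (needs the binary installed and network access):
#   my_browser_curl https://www.wikipedia.org
```

Because the caller's arguments come last, they can add to the defaults, but any flag that touches TLS settings may also change the signature the wrapper was meant to produce.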

See Advanced usage for more options, including using libcurl-impersonate as a library.

More documentation is available in the docs/ directory.

There are two versions of curl-impersonate for technical reasons. The chrome version is used to impersonate Chrome, Edge and Safari. The firefox version is used to impersonate Firefox.

Pre-compiled binaries for Linux and macOS (Intel) are available at the GitHub releases page. Before you use them you need to install nss (Firefox's TLS library) and CA certificates:

  • Ubuntu - sudo apt install libnss3 nss-plugin-pem ca-certificates
  • Red Hat/Fedora/CentOS - yum install nss nss-pem ca-certificates
  • Archlinux - pacman -S nss ca-certificates
  • macOS - brew install nss ca-certificates
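
The list above can be turned into a small lookup, sketched here. The helper names `nss_install_cmd` and `detect_pkg_manager` are hypothetical and not part of the project. The commands are copied from the list, and the script only prints them rather than installing anything.

```shell
#!/bin/sh
# Print the dependency install command from the list above for a given
# package manager. Both function names are hypothetical helpers.
nss_install_cmd() {
    case "$1" in
        apt)    echo 'sudo apt install libnss3 nss-plugin-pem ca-certificates' ;;
        yum)    echo 'yum install nss nss-pem ca-certificates' ;;
        pacman) echo 'pacman -S nss ca-certificates' ;;
        brew)   echo 'brew install nss ca-certificates' ;;
        *)      echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}

# Print the first of the four package managers found on PATH.
# The search order is arbitrary.
detect_pkg_manager() {
    for pm in apt yum pacman brew; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Usage: show (but do not run) the command for this system.
#   nss_install_cmd "$(detect_pkg_manager)"
```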

Also ensure you have zlib installed on your system. zlib is almost always present, but on some minimal systems it might be missing.

The pre-compiled binaries contain libcurl-impersonate and a statically compiled curl-impersonate for ease of use.

The pre-compiled Linux binaries are built for Ubuntu systems. On other distributions if you have errors with certificate verification you may have to tell curl where to find the CA certificates. For example:

curl_chrome116 https://www.wikipedia.org --cacert /etc/ssl/certs/ca-bundle.crt
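
If you do not know where the CA bundle lives on a given distribution, a lookup such as the following sketch may help. The helper names are hypothetical, and the candidate paths are an assumed, non-exhaustive list of common locations rather than something the project documents.

```shell
#!/bin/sh
# Print the first readable regular file among the given candidate paths.
first_readable() {
    for path in "$@"; do
        if [ -f "$path" ] && [ -r "$path" ]; then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1
}

# Common CA bundle locations (an assumed, non-exhaustive list).
find_ca_bundle() {
    first_readable \
        /etc/ssl/certs/ca-certificates.crt \
        /etc/pki/tls/certs/ca-bundle.crt \
        /etc/ssl/certs/ca-bundle.crt \
        /etc/ssl/cert.pem
}

# Usage (needs curl-impersonate installed and network access):
#   curl_chrome116 https://www.wikipedia.org --cacert "$(find_ca_bundle)"
```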

Also make sure to read Notes on Dependencies.

See INSTALL.md.

Docker images based on Alpine Linux and Debian with curl-impersonate compiled and ready to use are available on Docker Hub. The images contain the binary and all the wrapper scripts. Use like the following:

# Firefox version, Alpine Linux
docker pull lwthiker/curl-impersonate:0.6-ff
docker run --rm lwthiker/curl-impersonate:0.6-ff curl_ff109 https://www.wikipedia.org

# Chrome version, Alpine Linux
docker pull lwthiker/curl-impersonate:0.6-chrome
docker run --rm lwthiker/curl-impersonate:0.6-chrome curl_chrome110 https://www.wikipedia.org
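
To avoid retyping the image name, the `docker run` call can be wrapped in a shell function, as in this sketch. The function name `curl_in_docker` and the `IMAGE` and `DRY_RUN` variables are hypothetical conveniences, not something the project ships. The image tag is the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: run a wrapper script inside the project's Docker image.
# Set DRY_RUN=1 to print the docker command instead of executing it.
IMAGE="${IMAGE:-lwthiker/curl-impersonate:0.6-chrome}"

curl_in_docker() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo docker run --rm "$IMAGE" "$@"
    else
        docker run --rm "$IMAGE" "$@"
    fi
}

# Usage (needs Docker and network access):
#   curl_in_docker curl_chrome110 https://www.wikipedia.org
```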

AUR packages are available to Archlinux users:

Unofficial Homebrew formulae for Mac (Chrome only) are available here:

brew tap shakacode/brew
brew install curl-impersonate

libcurl-impersonate.so is libcurl compiled with the same changes as the command line curl-impersonate. It has an additional API function:

CURLcode curl_easy_impersonate(struct Curl_easy *data, const char *target,
                               int default_headers);

You can call it with the target names, e.g. chrome116, and it will internally set all the options and headers that are otherwise set by the wrapper scripts. If default_headers is set to 0, the built-in list of HTTP headers will not be set, and the user is expected to provide them instead using the regular CURLOPT_HTTPHEADER libcurl option.

Calling the above function sets the following libcurl options:

  • CURLOPT_HTTP_VERSION
  • CURLOPT_SSLVERSION, CURLOPT_SSL_CIPHER_LIST, CURLOPT_SSL_EC_CURVES, CURLOPT_SSL_ENABLE_NPN, CURLOPT_SSL_ENABLE_ALPN
  • CURLOPT_HTTPBASEHEADER, if default_headers is non-zero (this is a non-standard HTTP option created for this project).
  • CURLOPT_HTTP2_PSEUDO_HEADERS_ORDER, CURLOPT_HTTP2_NO_SERVER_PUSH (non-standard HTTP/2 options created for this project).
  • CURLOPT_SSL_ENABLE_ALPS, CURLOPT_SSL_SIG_HASH_ALGS, CURLOPT_SSL_CERT_COMPRESSION, CURLOPT_SSL_ENABLE_TICKET (non-standard TLS options created for this project).
  • CURLOPT_SSL_PERMUTE_EXTENSIONS (non-standard TLS option created for this project).

Note that if you call curl_easy_setopt() later with one of the above, it will override the options set by curl_easy_impersonate().

Using CURL_IMPERSONATE env var

If your application uses libcurl already, you can replace the existing library at runtime with LD_PRELOAD (Linux only). You can then set the CURL_IMPERSONATE env var. For example:

LD_PRELOAD=/path/to/libcurl-impersonate.so CURL_IMPERSONATE=chrome116 my_app

The CURL_IMPERSONATE env var has two effects:

  • curl_easy_impersonate() is called automatically for any new curl handle created by curl_easy_init().
  • curl_easy_impersonate() is called automatically after any curl_easy_reset() call.

This means that all the options needed for impersonation will be automatically set for any curl handle.

If you need precise control over the HTTP headers, set CURL_IMPERSONATE_HEADERS=no to disable the built-in list of HTTP headers, then set them yourself with curl_easy_setopt(). For example:

LD_PRELOAD=/path/to/libcurl-impersonate.so CURL_IMPERSONATE=chrome116 CURL_IMPERSONATE_HEADERS=no my_app

Note that the LD_PRELOAD method will NOT WORK for curl itself because the curl tool overrides the TLS settings. Use the wrapper scripts instead.
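
The LD_PRELOAD invocation can be wrapped so that a missing library is reported before the application starts. The following is a minimal sketch: the function name `run_impersonated` and its argument order are hypothetical, and the library path is whatever your installation uses.

```shell
#!/bin/sh
# Hypothetical helper: run a libcurl-based program with libcurl-impersonate
# preloaded (Linux only).
# Arguments: library path, target name (e.g. chrome116), then the command.
run_impersonated() {
    if [ "$#" -lt 3 ]; then
        echo "usage: run_impersonated LIB TARGET COMMAND [ARGS...]" >&2
        return 2
    fi
    lib=$1
    target=$2
    shift 2
    if [ ! -f "$lib" ]; then
        echo "library not found: $lib" >&2
        return 1
    fi
    # To supply your own headers, export CURL_IMPERSONATE_HEADERS=no
    # beforehand; the command below inherits it.
    env LD_PRELOAD="$lib" CURL_IMPERSONATE="$target" "$@"
}

# Usage:
#   run_impersonated /path/to/libcurl-impersonate.so chrome116 my_app
```

Going through `env` keeps the two variables scoped to the launched program instead of leaking them into the calling shell.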

If you intend to copy the self-compiled artifacts to another system, or use the Pre-compiled binaries provided by the project, make sure that all the additional dependencies are met on the target system as well. In particular, see the note about the Firefox version.

This repository contains two main folders:

  • chrome - Scripts and patches for building the Chrome version of curl-impersonate.
  • firefox - Scripts and patches for building the Firefox version of curl-impersonate.

The layout is similar for both. For example, the Firefox directory contains:

  • Dockerfile - Used to build curl-impersonate with all dependencies.
  • curl_ff91esr, curl_ff95, curl_ff98 - Wrapper scripts that launch curl-impersonate with the correct flags.
  • curl-impersonate.patch - The main patch that makes curl use the same TLS extensions as Firefox. Also makes curl compile statically with libnghttp2 and libnss.

Other files of interest:

  • tests/signatures - YAML database of known browser signatures that can be impersonated.

If you'd like to help, please check out the open issues. You can open a pull request with your changes.

This repository contains the build process for curl-impersonate. The actual patches to curl are maintained in a separate repository forked from the upstream curl. The changes are maintained in the impersonate-firefox and impersonate-chrome branches.

Sponsors help keep this project open and maintained. If you wish to become a sponsor, please contact me directly at: lwt at lwthiker dot com.
