Utilyze measures how efficiently your GPU is doing useful work.


Original link: https://github.com/systalyze/utilyze

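## Utilyze: Accurate GPU utilization monitoring

Utilyze is a tool developed by Systalyze that measures GPU efficiency more accurately than standard tools such as `nvidia-smi` or `nvtop`. Unlike those tools, which only indicate whether the GPU is *busy*, Utilyze reads performance counters directly to show how much of the GPU's *capacity* a workload actually uses. It estimates the attainable utilization limit for your hardware, model, and workload, which helps identify potential performance bottlenecks. Currently, Utilyze supports vLLM and a subset of models on NVIDIA Ampere or newer GPUs (A100, H100, etc.), and requires CUDA Toolkit 11.0+.

Utilyze is available for Linux, macOS, and Windows (the latter two act as clients that connect to a Linux server). Installation is a single script, but full profiling may require root/administrator privileges. It anonymously collects GPU configuration data to improve its metrics (this can be disabled). Learn more: [https://systalyze.com/utilyze/](https://systalyze.com/utilyze/)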


Original text

Utilyze measures how efficiently your GPU is doing useful work, not just whether it's busy. It runs live against your workload with negligible overhead.

Standard tools like nvidia-smi and nvtop only check whether a kernel is running on the GPU. They can show 100% while your workload is using a tiny fraction of the hardware's real capacity.

Utilyze reads GPU performance counters directly to show what's actually being used, and provides an estimate of how far you can push utilization given a workload, model, and hardware. To learn more, read our blog post.
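As a rough illustration of the gap between "busy" and "efficient", the sketch below expresses achieved throughput as a percentage of a hardware peak. It is not Utilyze's method, which reads hardware performance counters. The helper name `pct_of_peak` and the 31.2 TFLOP/s workload figure are made up for the example; 312 TFLOP/s is the A100's published dense BF16 tensor peak.

```shell
# Illustrative only: NOT how Utilyze computes its metrics.
# pct_of_peak is a hypothetical helper.
# Usage: pct_of_peak <achieved_tflops> <peak_tflops>
pct_of_peak() {
    awk -v achieved="$1" -v peak="$2" 'BEGIN {
        if (peak <= 0) exit 1          # refuse to divide by zero
        printf "%.1f\n", 100 * achieved / peak
    }'
}

# Made-up workload: a kernel that is resident the whole time (so a
# "GPU busy" metric reads 100%) but sustains only 31.2 TFLOP/s on an A100.
pct_of_peak 31.2 312
```

Under these assumed numbers the call prints `10.0`: the GPU would be reported as fully busy while delivering about a tenth of its peak compute throughput.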

Utilyze is created by Systalyze.

Read this in other languages: 中文

Requirements:

Linux amd64 (arm64 support coming soon)
NVIDIA Ampere or newer GPU (A100, H100, H200, B200, RTX 3000+)
CUDA Toolkit 11.0+
sudo or CAP_SYS_ADMIN (see below), or privileged container

# macOS/Linux
curl -sSfL https://systalyze.com/utilyze/install.sh | sh

# Windows
iex (curl.exe -L https://systalyze.com/utilyze/install.ps1 | Out-String)

The macOS and Windows builds of Utilyze act as clients for a Utilyze process running on a remote Linux machine with profiling capabilities. They require neither root nor any native libraries. On Windows, you may need to add a Windows Defender exclusion for the install directory and then reinstall Utilyze:

Add-MpPreference -ExclusionPath <INSTALL_DIR>
iex (curl.exe -L https://systalyze.com/utilyze/install.ps1 | Out-String)

Depending on your host configuration (see below), Utilyze will likely need root for profiling. The installer prompts for your password so that it can install Utilyze system-wide.

If CUPTI 12+ is not found, utlz will prompt you to install the latest release from PyPI on first run.

On a Linux machine with profiling capabilities, you can:

# monitor all GPUs for SOL metrics
sudo utlz

# monitor specific GPUs
sudo utlz --devices 0,2

# show discovered inference server endpoints per GPU
sudo utlz --endpoints

This starts a WebSocket server that listens for connections from other Utilyze processes on port 8079 by default. Further instances will automatically connect to the same server.

On a macOS/Windows machine, you can connect to a running server with:

utlz --connect <SERVER_URL>

Note that a single device ID can only be monitored by a single instance of utlz. This is due to the way NVIDIA's Perf SDK API handles device access.

Utilyze discovers running inference servers to detect which model is loaded on each GPU. From that, it computes an attainable compute SOL (speed-of-light) ceiling: your realistic peak given that model and hardware.

Currently Utilyze only supports vLLM as a backend, with more (e.g. SGLang) coming soon. We are expanding model and hardware coverage over time; at present we support a subset of models on H100-80G and A100-80G GPUs within a node (up to 8 GPUs).

To compute this ceiling, Utilyze anonymously sends GPU configuration data to Systalyze's servers. Set UTLZ_DISABLE_METRICS=1 to disable this.

By default, NVIDIA restricts GPU profiling counters to admin users. To allow non-root access, disable the restriction on the host and reboot:

echo 'options nvidia NVreg_RestrictProfilingToAdminUsers=0' | sudo tee /etc/modprobe.d/nvidia-profiling.conf
sudo reboot

After this, utlz can run without sudo. If utlz still warns about missing capabilities, you can disable the warning by setting UTLZ_DISABLE_PROFILING_WARNING=1 (see the options below).
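To confirm after the reboot that the restriction is really off, you can inspect the driver's parameter listing. The sketch below assumes the driver reports `RmProfilingAdminOnly` in `/proc/driver/nvidia/params`, as NVIDIA's guidance for the ERR_NVGPUCTRPERM permission error describes. `profiling_restricted` is a hypothetical helper, not part of Utilyze.

```shell
# Hypothetical helper (not part of Utilyze): reports whether GPU profiling
# counters are restricted to admin users.
# Assumption: the NVIDIA driver lists "RmProfilingAdminOnly: <0|1>" in
# /proc/driver/nvidia/params. Another file can be passed as $1 for testing.
profiling_restricted() {
    params_file="${1:-/proc/driver/nvidia/params}"
    if [ ! -r "$params_file" ]; then
        echo "unknown"                 # driver not loaded or file unreadable
        return 2
    fi
    # Take the value after the colon and strip surrounding whitespace.
    value=$(awk -F':' '$1 == "RmProfilingAdminOnly" {
        gsub(/[ \t]/, "", $2); print $2
    }' "$params_file")
    case "$value" in
        0) echo "unrestricted"; return 0 ;;
        1) echo "restricted";   return 1 ;;
        *) echo "unknown";      return 2 ;;
    esac
}
```

Calling `profiling_restricted` with no argument reads the live driver file. If it prints `unrestricted`, running utlz without sudo should be possible on that host.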

Flags (most have environment variable equivalents):

--endpoints: show discovered inference server endpoints per GPU
--devices / UTLZ_DEVICES: monitor specific GPUs (comma-separated list of device IDs)
--log / UTLZ_LOG: a file to write logs to (default: no logging)
--log-level / UTLZ_LOG_LEVEL: set the log level (default: INFO, other options: DEBUG, WARN, ERROR)
--version: show the version

Environment variables only:

UTLZ_DISABLE_PROFILING_WARNING: disable the warning about GPU profiling capabilities on startup
UTLZ_BACKEND_URL: set the backend URL for Systalyze's roofline SOL metrics API (default: https://api.systalyze.com/v1/utilyze)
UTLZ_DISABLE_METRICS: disable workload detection and Systalyze roofline SOL metrics API
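When utlz is started through sudo, variables exported in your shell may not reach it, because a default sudoers policy resets the environment. The sketch below is one way to forward the variables listed above explicitly through `sudo env`. The function name `run_utlz_as_root` and the `DRY_RUN` switch are hypothetical and are not part of Utilyze. Whether your host's sudoers policy resets the environment is an assumption you should check.

```shell
# Hypothetical wrapper (not part of Utilyze): forwards the documented UTLZ_*
# variables through sudo, which by default does not pass them along.
# Set DRY_RUN=1 to print the command instead of running it.
run_utlz_as_root() {
    set -- sudo env
    for name in UTLZ_DEVICES UTLZ_LOG UTLZ_LOG_LEVEL \
                UTLZ_DISABLE_PROFILING_WARNING UTLZ_BACKEND_URL \
                UTLZ_DISABLE_METRICS; do
        eval "value=\${$name-}"        # read the variable named by $name
        if [ -n "$value" ]; then
            set -- "$@" "$name=$value" # forward only variables that are set
        fi
    done
    set -- "$@" utlz
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

For example, with `UTLZ_DEVICES=0,2` and `UTLZ_DISABLE_METRICS=1` set, `DRY_RUN=1 run_utlz_as_root` prints `sudo env UTLZ_DEVICES=0,2 UTLZ_DISABLE_METRICS=1 utlz`. The wrapper takes no flags of its own; if you prefer flags, `sudo utlz --devices 0,2` sidesteps the environment question entirely.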

To build from source you'll need:

Go 1.25+ for the CLI
Docker for building the native library with wide compatibility
CUDA Toolkit (the build links against 13.1 by default; set CUDA_VERSION to use another version)

# build the native library and the CLI
make all

# build and package the native library via Docker
make dist-tarball-docker

# build the CLI only
make utlz

There is experimental support for ARM64 builds using the sbsa-linux CUDA target.
