Show HN: Goxe 19k Logs/S on an I5

Original link: https://github.com/DumbNoxx/goxe

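## Goxe: High-Performance Log Compression

Goxe is a Go-based tool designed to reduce log volume by aggregating repeated messages into concise, readable summaries. It receives logs via syslog/UDP and normalizes them by stripping timestamps, converting to lowercase, removing extra whitespace, and filtering out unwanted terms. Identical messages are then grouped and reported with occurrence counts, which reduces noise and storage costs without losing key information.

Goxe is built to run continuously in the background, with a configurable pipeline that offers filtering, automated reporting, and remote log shipping (JSON over TCP/UDP). It supports hot reloading of its configuration file and includes features such as event burst detection and graceful shutdown.

Goxe listens on UDP port 1729 by default and creates a default configuration file on first run. It currently lacks official Docker support. Performance work is ongoing, with continued effort to reduce memory allocations and GC pressure. Licensed under Apache 2.0.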

Hacker News comment by nxus_dev (submission at 9 points, 1 comment):

Goxe reaches 19k logs per second on commodity hardware, demonstrating that handling heavy traffic does not require large servers. Specs:

CPU: i5-8250U @ 3.40 GHz
RAM: 16GB (low usage)
Kernel: Linux 6.18-zen (Arch)

Result: efficient log reduction, edge-ready, and backed by near-zero allocation overhead.

Original text


Reduce large volumes of repetitive logs into compact, readable aggregates.

Goxe is a high-performance log reduction tool written in Go. It ingests logs (currently via syslog/UDP), normalizes and filters them, and aggregates repeated messages into a single-line format with occurrence counts. The result is less noise, lower bandwidth usage, and cheaper storage without losing visibility into recurring issues.

Goxe is designed to run continuously in the background as part of a logging pipeline or sidecar.

  • Go 1.25.5 or higher (to build from source)

Goxe performs several transformations before aggregation:

  • strips timestamps and date prefixes
  • converts text to lowercase
  • removes extra whitespace
  • filters out configurable excluded words
  • applies basic ascii beautification

After normalization, identical messages are grouped together and reported with repetition counts.
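
The normalization steps listed above can be sketched roughly as follows. This is a minimal illustration, not Goxe's actual code: the `normalize` function, the `timestampPrefix` pattern, and the `excluded` set are hypothetical names chosen for this example, and the timestamp pattern only covers the date format used in the sample input.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// timestampPrefix matches a leading date such as "dec 24, 2025 16:30:17".
// The pattern is an assumption for this sketch; Goxe's real parser may differ.
var timestampPrefix = regexp.MustCompile(`(?i)^[a-z]{3} \d{1,2}, \d{4} \d{2}:\d{2}:\d{2}\s*`)

// normalize strips the timestamp prefix, lowercases the text, drops excluded
// words, and collapses runs of whitespace into single spaces.
func normalize(line string, excluded map[string]bool) string {
	line = timestampPrefix.ReplaceAllString(line, "")
	line = strings.ToLower(line)
	kept := make([]string, 0, 8)
	// strings.Fields splits on any whitespace run, which also removes extra spaces.
	for _, word := range strings.Fields(line) {
		if !excluded[word] {
			kept = append(kept, word)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	excluded := map[string]bool{"debug": true} // hypothetical excluded word
	fmt.Println(normalize("dec 24, 2025 16:30:17 ERROR:   connection   failed debug", excluded))
}
```

The regular expression is compiled once at package level, so normalizing a message never triggers a recompile.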

Example input:

dec 24, 2025 16:30:17 ERROR: connection failed 001 128.54.69.12
dec 24, 2025 16:30:18 ERROR: connection failed 002 128.34.70.12
dec 24, 2025 16:30:19 ERROR: connection failed 003 128.54.69.12

Aggregated output:

        partial report
----------------------------------
origin: [::1]
- [3] ERROR: connection failed *  -- (first seen 16:30:17 - last seen 16:30:19)
----------------------------------
  • worker pool for parallel processing
  • thread-safe state management
  • automated partial reporting
  • log normalization and filtering
  • ascii beautification
  • timestamp and date parsing
  • graceful shutdown and signal handling
  • similarity clustering (group near-identical messages)
  • syslog/udp network ingestion
  • configuration file support
  • output log file
  • firstseen field to track initial occurrence
  • event burst detection
  • notification dispatch pipeline
  • remote syslog/network shipping support
  • default behavior:

    • goxe listens on udp port 1729 by default (configurable).
    • on first run the tool creates a default config.json in the user's config directory:
      • linux: $XDG_CONFIG_HOME/goxe/config.json, or $HOME/.config/goxe/config.json when $XDG_CONFIG_HOME is unset
      • macos: ~/Library/Application Support/goxe/config.json
      • windows: %APPDATA%\goxe\config.json
    • the app reads options.Config from that file; the defaults are:
      • port: 1729 — udp port to listen on
      • idLog: hostname — identifier added/removed from logs
      • pattenersWords: [] — list of ignored words
      • generateLogsOptions.generateLogsFile: false — write periodic file report
      • generateLogsOptions.hour: "00:00:00" — scheduled hour for file generation
      • webhookUrls: [] — webhooks to call when alerts fire
      • bursDetectionOptions.limitBreak: 10 — burst detection threshold (seconds × count)
      • shipper.address: "" — remote address to ship processed logs (e.g., "127.0.0.1:5000")
      • shipper.flushInterval: 30 — interval in seconds between network transmissions
      • shipper.protocol: "tcp" — transmission protocol (tcp, udp, etc. via net.Dial)
      • ReportInterval: interval in minutes between summaries of processed logs
      • BufferUdpSize: size of the UDP receive buffer
    • hot reloading: goxe monitors the config.json file in real-time. Any changes saved to the file are automatically applied without requiring a restart.
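
To make the defaults above concrete, here is one way the documented keys could map onto a Go struct, with a user file overlaid on top of the defaults. This is a sketch under assumptions: the `config` type, `defaultConfig`, and `parseConfig` are hypothetical names, the JSON keys are copied from the list above, and `ReportInterval` and `BufferUdpSize` are omitted because the source gives neither their JSON keys nor their defaults.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// config mirrors the documented keys. The Go type and field names are
// illustrative; only the JSON keys come from the README.
type config struct {
	Port                int      `json:"port"`
	IDLog               string   `json:"idLog"`
	PattenersWords      []string `json:"pattenersWords"`
	GenerateLogsOptions struct {
		GenerateLogsFile bool   `json:"generateLogsFile"`
		Hour             string `json:"hour"`
	} `json:"generateLogsOptions"`
	WebhookUrls          []string `json:"webhookUrls"`
	BursDetectionOptions struct {
		LimitBreak int `json:"limitBreak"`
	} `json:"bursDetectionOptions"`
	Shipper struct {
		Address       string `json:"address"`
		FlushInterval int    `json:"flushInterval"`
		Protocol      string `json:"protocol"`
	} `json:"shipper"`
}

// defaultConfig returns the defaults listed in the README.
func defaultConfig() config {
	var c config
	c.Port = 1729
	c.IDLog, _ = os.Hostname() // falls back to "" if the hostname is unavailable
	c.PattenersWords = []string{}
	c.GenerateLogsOptions.Hour = "00:00:00"
	c.WebhookUrls = []string{}
	c.BursDetectionOptions.LimitBreak = 10
	c.Shipper.FlushInterval = 30
	c.Shipper.Protocol = "tcp"
	return c
}

// parseConfig overlays the JSON document on top of the defaults, so keys
// missing from the file keep their default values.
func parseConfig(data []byte) (config, error) {
	c := defaultConfig()
	if err := json.Unmarshal(data, &c); err != nil {
		return config{}, err
	}
	return c, nil
}

func main() {
	c, err := parseConfig([]byte(`{"port": 5514, "shipper": {"address": "127.0.0.1:5000"}}`))
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid config:", err)
		return
	}
	out, _ := json.MarshalIndent(c, "", "  ")
	fmt.Println(string(out))
}
```
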
  • routing and shipping:

    • ingestion: configure your system logger (rsyslog, syslog-ng, etc.) or any application to forward logs to udp://<host>:1729.
    • remote shipping: enable shipper.address to forward processed log aggregates to an external service. Goxe will batch and send statistics in JSON format:
      {
        "origin": "web-server-01",
        "data": [
          {
            "count": 42,
            "firstSeen": "2024-03-20T10:00:00Z",
            "lastSeen": "2024-03-20T10:05:00Z",
            "message": "Invalid password attempt for user admin"
          }
        ]
      }
  • app integration:

    • system-wide: see your OS documentation for forwarding syslog to a remote UDP port (Linux, macOS, Windows).
    • custom apps: any app capable of sending UDP/Syslog packets can use Goxe as a target:
      • node: use a syslog/bunyan/winston transport to forward logs.
      • go: use the std net package to dial UDP.
    • note: docker support is not available yet; running goxe in a container is not officially supported in this release.
  • benchmark runs (example) can be added as images to show before/after results. placeholder below:

benchmark results placeholder

  • note on allocs: current benchmarking shows ~2 allocs/op in the udp ingestion + processing path. this is expected with the current api because:
    • one allocation is typically the creation of the normalized key (the sanitized string used as the map key),
    • the other allocation can come from creating a new logstats entry for a brand-new message key.
  • how to reduce further:
    • change the pipeline to process bytes instead of strings (breaking change) or use a hash/interning strategy for keys, which avoids per-message string allocations for repeated messages.
    • optimize sanitizer to do a single-pass transformation into a pooled builder to avoid intermediate temporaries.
  • the above optimizations are planned; this release focuses on fixing per-message regex recompiles, adding shared pools and safe zero-copy buffer ownership to reduce gc pressure.
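
As an illustration of the planned single-pass, pooled-builder idea, the sketch below lowercases ASCII letters and collapses whitespace in one pass over the raw bytes, writing into a buffer borrowed from a `sync.Pool`. It is not taken from Goxe; `sanitize` and `bufPool` are hypothetical names, and the only per-call allocation left is the final string used as the map key.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers so each message does not allocate its
// own scratch space.
var bufPool = sync.Pool{New: func() any { return new(bytes.Buffer) }}

// sanitize lowercases ASCII letters and collapses whitespace runs into single
// spaces in one pass, trimming leading and trailing whitespace.
func sanitize(raw []byte) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	pendingSpace := false
	for _, c := range raw {
		switch c {
		case ' ', '\t', '\n', '\r':
			// Remember the gap, but only if something was already written.
			pendingSpace = buf.Len() > 0
		default:
			if pendingSpace {
				buf.WriteByte(' ')
				pendingSpace = false
			}
			if c >= 'A' && c <= 'Z' {
				c += 'a' - 'A'
			}
			buf.WriteByte(c)
		}
	}
	// String copies the bytes, so the buffer can be returned to the pool safely.
	return buf.String()
}

func main() {
	fmt.Println(sanitize([]byte("  ERROR:   Connection\tFailed  ")))
}
```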

Licensed under the Apache License, Version 2.0. See the LICENSE file for details.
