Kubernetes Egress Control with Squid proxy ¶
2025-12-28
This Way to the Egress!
— Sign at P.T. Barnum’s American Museum
Kubernetes ingress gets a lot of attention – Gateway API, Ingress controllers, service meshes – while egress is mostly ignored, until someone asks “what exactly is our cluster talking to?” or, even in simple deployments, “can we see what we are talking to?”. This is a (very) simple approach to answering that, using the venerable Squid proxy and a NetworkPolicy, without reaching for heavier machinery (but beginning to understand why we would).
This is the overview of the thing I’m about to describe:
Squid as egress proxy in k3s
Why do I care ¶
Most Kubernetes tutorials focus on getting traffic into your cluster, which is fair since that’s where it usually starts... but traffic flows both ways, and once your workloads start making outbound calls to APIs, databases, and services beyond your cluster boundary, there’s a discussion on visibility and security to be had.
I ran into this while working with OpenShift’s egress policies years ago, in so-called “regulated industries”: while not the most flexible option at the time, they were the most straightforward answer to security requirements mandating that outbound traffic go through a proxy.
I’m using Kubernetes through k3s (mostly) and kind (often, for development) for my own personal stuff (see Projects), so I went back to basics on this: what if we just used Squid – a proxy that’s been solving this problem since 1996! – enforced its usage with a NetworkPolicy, and saw where that got us? Nothing fancy, nothing “next-gen cloud-native”, just a proxy with logs.
Squid and k3s: the solution ¶
The architecture is deliberately simple:
┌────────────────────────────────────────────────────────────┐
│                           Cluster                          │
│                                                            │
│   ┌─────────────────────┐      ┌────────────────────────┐  │
│   │ workload namespace  │      │ egress-proxy namespace │  │
│   │                     │      │                        │  │
│   │  ┌─────┐            │ :3128│  ┌───────┐             │  │
│   │  │ pod │ HTTP_PROXY ├──────┼─▶│ squid │─────────────┼──┼──▶ internet
│   │  └─────┘            │      │  └───────┘             │  │
│   │                     │      │                        │  │
│   │    x blocked        │      └────────────────────────┘  │
│   │   (direct egress)   │                                  │
│   └─────────────────────┘                                  │
└────────────────────────────────────────────────────────────┘
Workloads configure HTTP_PROXY/HTTPS_PROXY environment
variables pointing to Squid, and a NetworkPolicy on the workload namespace
blocks direct egress, allowing traffic only to the proxy. Squid logs
everything that passes through. That’s it, and this gives us:
- Visibility: every outbound connection logged with timestamp, destination, bytes transferred
- Enforcement: NetworkPolicy makes the proxy mandatory, not optional
- Simplicity: no CNI plugins, no service mesh, no CRDs
The demo application ¶
To test this out, I’m using a small application I built: Horizons, a Common Lisp application using Datastar that displays the solar system and fetches data from NASA’s JPL Horizons API when you click on a planet. It’s a good test case because it makes real HTTPS calls to an external API – exactly the kind of traffic we want to observe. It’s a scaled-down version of DataSPICE, an app I made to test my Common Lisp SDK for Datastar and which uses NASA SPICE data for a 2D simulation of the Cassini-Huygens probe.
It uses a multi-stage build that produces a reasonably small binary, horizons-server, at 16MB – not bad for an image-based language like Common Lisp (it can go down to ~13MB with some more compression optimisations) – packaged inside a trixie-slim Debian image for a total of ~100MB (this can also be optimised, aggressively so).
Setting up Squid ¶
All files are in the Horizons repository, under k8s/ specifically.
I’ll go through the main aspects here.
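If you just want to try the whole thing end to end, applying the directory is usually enough – a sketch, assuming you’ve cloned the repo and can pull or build the horizons image (you may need to apply the namespace manifests first if ordering complains):
$ kubectl apply -f k8s/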
First, the egress-proxy namespace and Squid configuration:
apiVersion: v1
kind: Namespace
metadata:
name: egress-proxy
labels:
purpose: egress-control
---
apiVersion: v1
kind: ConfigMap
metadata:
name: squid-config
namespace: egress-proxy
data:
squid.conf: |
http_port 3128
# File-based logging for persistence and analysis
access_log /var/log/squid/access.log combined
cache_log /var/log/squid/cache.log
# No caching, that's not the focus (now, at least)
cache deny all
# Allow requests from private IP ranges (pod CIDRs)
acl localnet src 10.0.0.0/8
acl localnet src 172.16.0.0/12
acl localnet src 192.168.0.0/16
acl SSL_ports port 443
acl Safe_ports port 80
acl Safe_ports port 443
acl CONNECT method CONNECT
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localnet
http_access deny all
# Optional: restrict to specific domains (uncomment to enforce)
# acl allowed_domains .ssd.jpl.nasa.gov
# http_access deny !allowed_domains
The localnet ACLs cover all RFC 1918 private IP space – your
k3s pod CIDR will fall within one of these ranges; adjust for other setups of course. In production, you might tighten this
to your specific pod CIDR for defence in depth.
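To check which pod CIDR your cluster actually hands out (k3s defaults to 10.42.0.0/16, which is where the 10.42.x.x addresses in the logs later come from), the node spec has it:
# Print each node's name and assigned pod CIDR
$ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'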
The Squid deployment ¶
The deployment needs a few things: an init container to fix permissions on the log directory (hostPath volumes are
created as root, but Squid runs as the proxy user), and a sidecar to stream logs to stdout for kubectl
logs:
apiVersion: apps/v1
kind: Deployment
metadata:
name: squid
namespace: egress-proxy
spec:
replicas: 1
selector:
matchLabels:
app: squid
template:
metadata:
labels:
app: squid
spec:
initContainers:
- name: fix-permissions
image: busybox:latest
command: ["sh", "-c", "chown -R 13:13 /var/log/squid"]
volumeMounts:
- name: logs
mountPath: /var/log/squid
containers:
- name: squid
image: ubuntu/squid:latest
ports:
- containerPort: 3128
volumeMounts:
- name: config
mountPath: /etc/squid/squid.conf
subPath: squid.conf
- name: logs
mountPath: /var/log/squid
- name: log-streamer
image: busybox:latest
command: ["sh", "-c", "touch /var/log/squid/access.log && tail -F /var/log/squid/access.log"]
volumeMounts:
- name: logs
mountPath: /var/log/squid
volumes:
- name: config
configMap:
name: squid-config
- name: logs
hostPath:
path: /var/log/squid-egress
type: DirectoryOrCreate
The hostPath volume means logs persist on the node at /var/log/squid-egress/, which I use for offline
analysis or for feeding into external log aggregation (GoAccess, in my case). This could be done inside the cluster as well, but
for simple deployments I often do it at the host level (for both k8s and non-k8s workloads).
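Before moving on to enforcement, one piece not reproduced here is the Service that gives Squid the stable in-cluster name the workloads will point at (squid.egress-proxy.svc.cluster.local:3128). A plain ClusterIP Service named squid on port 3128 is all that’s needed; as a sketch, kubectl expose generates the equivalent:
# Sketch: expose the squid Deployment as a ClusterIP Service on port 3128,
# reachable as squid.egress-proxy.svc.cluster.local from other namespaces.
$ kubectl expose deployment squid -n egress-proxy --name squid --port 3128 --target-port 3128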
Enforcing proxy usage with NetworkPolicy ¶
This is where it stops being optional: the NetworkPolicy blocks direct egress from the workload namespace, allowing only DNS resolution and traffic to the proxy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: enforce-egress-proxy
namespace: horizons
spec:
podSelector: {}
policyTypes:
- Egress
egress:
# DNS resolution
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
ports:
- protocol: UDP
port: 53
- protocol: TCP
port: 53
# Squid proxy only
- to:
- namespaceSelector:
matchLabels:
purpose: egress-control
ports:
- protocol: TCP
port: 3128
The purpose: egress-control label on the egress-proxy namespace is what the selector matches – cleaner than
hardcoding namespace names. With this in place, a workload that doesn’t have the proxy environment variables configured
simply cannot reach the outside world.
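A quick way to convince yourself the policy does what it says: run a throwaway pod in the workload namespace and compare a direct call with one that goes through the proxy. The pod name and curl image here are just for illustration; DNS still resolves (the policy allows it), but the direct connection should time out while the proxied one succeeds.
# Hypothetical check: direct egress should be blocked, egress via Squid should work
$ kubectl run egress-test -n horizons --rm -it --restart=Never \
    --image=curlimages/curl --command -- sh -c \
    'curl -sI -m 5 https://ssd.jpl.nasa.gov || echo "direct egress blocked";
     curl -sI -m 5 -x http://squid.egress-proxy.svc.cluster.local:3128 https://ssd.jpl.nasa.gov | head -n 1'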
Configuring the workload ¶
How workloads use the proxy is application-dependent: in this example, I’m using the Dexador Common Lisp HTTP client and setting the proxy from the environment variables (which are set in the deployment manifest):
spec:
containers:
- name: horizons
image: localhost:5000/horizons:latest
env:
- name: HTTP_PROXY
value: "http://squid.egress-proxy.svc.cluster.local:3128"
- name: HTTPS_PROXY
value: "http://squid.egress-proxy.svc.cluster.local:3128"
- name: NO_PROXY
value: "localhost,127.0.0.1,.svc,.svc.cluster.local,10.0.0.0/8"
The NO_PROXY setting is important, since without it, internal service-to-service calls would try to route through
Squid and fail.
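To double-check that the variables actually made it into the running container, something like this works (assuming the Deployment is named horizons; env runs inside the container, the grep filters locally):
$ kubectl exec -n horizons deploy/horizons -- env | grep -i proxy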
One caveat: not all HTTP clients respect these environment variables automatically. Most do (curl, wget, Python requests, Go’s net/http), but some require explicit configuration. This is the case with Drakma, and even with Dexador (which does use them) I had to set the default proxy explicitly:
(let* ((planet-plist (find planet-name *planets*
                           :key (lambda (p) (getf p :name))
                           :test #'string-equal))
       (horizons-id (when planet-plist (getf planet-plist :horizons-id)))
       (dex:*default-proxy* (uiop:getenv "HTTPS_PROXY")))
The reason here is that I’m using SBCL’s save-lisp-and-die approach to build a binary, and this captures the env
variables at compile time: I need to refresh them at runtime.
Seeing it work ¶
With everything deployed, watching the logs while clicking around the Horizons UI:
$ kubectl logs -f deploy/squid -n egress-proxy -c log-streamer
10.42.0.238 - - [27/Dec/2025:21:43:34 +0000] "CONNECT ssd.jpl.nasa.gov:443 HTTP/1.1" 200 5537 "-" "-" TCP_TUNNEL:HIER_DIRECT
10.42.0.238 - - [27/Dec/2025:21:43:45 +0000] "CONNECT ssd.jpl.nasa.gov:443 HTTP/1.1" 200 5537 "-" "-" TCP_TUNNEL:HIER_DIRECT
k9s showing the log-streamer container
Every call to the JPL Horizons API is logged. The TCP_TUNNEL:HIER_DIRECT indicates Squid is tunnelling the HTTPS
connection directly: no SSL interception, just a pass-through that logs the destination.
What you see (and don’t see) ¶
For HTTPS traffic, Squid logs the CONNECT tunnel: the destination host and port, timestamp, and bytes
transferred. You don’t see the full URL path, since that would require SSL interception (ssl-bump), which breaks
end-to-end encryption and requires deploying CA certificates to all clients. That’s a different architecture with
different trade-offs.
What you do get is still valuable: “pod X talked to api.example.com:443 at 14:32, transferred 5KB.” For compliance, debugging, and security auditing, that’s often enough, and it certainly is enough for my own purposes. It also brings you some lock-down capability “for free”. And for plain HTTP traffic, you do get the full URL.
Also worth noting: connection pooling affects log frequency. If your HTTP client keeps connections alive, you’ll see one
CONNECT entry covering multiple requests. I added :keep-alive nil to Dexador’s dex:get calls to get a log entry
every time I clicked a planet, but this will depend on your application code.
There’s also some latency introduced by the extra hop. It shouldn’t be noticeable or an issue, but that will depend on the specifics of your application.
Adding GoAccess for real-time visualisation ¶
Tailing logs is fine for debugging, but for ongoing visibility I use GoAccess, which provides a real-time web dashboard. Adding it as another sidecar:
- name: goaccess
image: allinurl/goaccess:latest
command:
- sh
- -c
- |
while [ ! -f /var/log/squid/access.log ]; do sleep 1; done
goaccess /var/log/squid/access.log \
--log-format=SQUID \
--real-time-html \
--output=/var/www/goaccess/index.html \
--port=7890
ports:
- containerPort: 7890
volumeMounts:
- name: logs
mountPath: /var/log/squid
Expose it with a NodePort service or equivalent and you have a live dashboard showing which external hosts your cluster is talking to, request rates, and traffic patterns.
I mostly use GoAccess in TUI mode, which is easy since the Squid logs are stored on the host.
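Since the access log also lives on the node at /var/log/squid-egress/, the TUI is just a matter of pointing GoAccess at it from the host, reusing the same log-format flag as the sidecar above:
# Run on the k3s node itself, against the hostPath log
$ goaccess /var/log/squid-egress/access.log --log-format=SQUID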
Limitations and where this leads ¶
This approach is intentionally minimal, and it has real limitations:
- Application changes required: workloads must set proxy environment variables.
- HTTP/HTTPS only: raw TCP, gRPC-over-HTTP/2, and other protocols need different handling.
- Single point of configuration: one squid.conf for all namespaces.
- No per-namespace self-service: teams can’t manage their own egress rules.
Linked to several of the above (mainly the centralised configuration): ACL rules that limit traffic to specific external domains are cumulative, so every namespace can reach every allowlisted domain, even if it only needs a few of them.
These limitations point toward why more sophisticated solutions exist, after all; a follow-up article will explore using
Squid’s include directive to enable per-namespace configuration, and in doing so, show why you’d eventually want
a controller or operator to manage the complexity.
There are more problems that can be solved, though: transparent interception for applications that can’t be configured, sidecar proxies for per-workload control, and eventually the full service mesh model... each step solves a real problem the previous approach couldn’t handle, and the allure of stacking “simple” things on top of each other starts to fade...
...but for many cases, especially when you just need to answer “what is my cluster talking to?” or enforce a fixed list of egress destinations, a proxy and a NetworkPolicy is enough.
It is for me, at least.