Kubernetes Egress Control with Squid proxy ¶
2025-12-28
This Way to the Egress!
— Sign at P.T. Barnum’s American Museum
Kubernetes ingress gets a lot of attention – Gateway API, Ingress controllers, service meshes – while egress is mostly ignored, until someone asks “what exactly is our cluster talking to?” or, even in simple deployments, “can we see what we are talking to?”. This is a (very) simple approach to answering that, using the venerable Squid proxy and a NetworkPolicy, without reaching for heavier machinery (but beginning to understand why we would).
This is the overview of the thing I’m about to describe:
Squid as egress proxy in k3s
Why do I care ¶
Most Kubernetes tutorials focus on getting traffic into your cluster, which is fair since that’s where it usually starts... but traffic flows both ways, and once your workloads start making outbound calls to APIs, databases, and services beyond your cluster boundary, there’s a discussion on visibility and security to be had.
I ran into this while working with OpenShift’s egress policies years ago, in so-called “regulated industries”: while not the most flexible option at the time, they were the most straightforward answer to security requirements mandating that outbound traffic go through a proxy.
I’m using Kubernetes through k3s (mostly) and kind (often, for development) for my own personal stuff (see Projects), so I went back to basics on this: what if we just used Squid – a proxy that’s been solving this problem since 1996! – enforced its usage with a NetworkPolicy, and saw where that got us? Nothing fancy, nothing “next-gen cloud-native”, just a proxy with logs.
Squid and k3s: the solution ¶
The architecture is deliberately simple:
┌────────────────────────────────────────────────────────────┐
│                           Cluster                          │
│                                                            │
│   ┌─────────────────────┐      ┌────────────────────────┐  │
│   │ workload namespace  │      │ egress-proxy namespace │  │
│   │                     │      │                        │  │
│   │  ┌─────┐            │ :3128│  ┌───────┐             │  │
│   │  │ pod │ HTTP_PROXY ├──────┼─▶│ squid │─────────────┼──┼──▶ internet
│   │  └─────┘            │      │  └───────┘             │  │
│   │                     │      │                        │  │
│   │    x blocked        │      └────────────────────────┘  │
│   │   (direct egress)   │                                  │
│   └─────────────────────┘                                  │
└────────────────────────────────────────────────────────────┘
Workloads configure HTTP_PROXY/HTTPS_PROXY environment
variables pointing to Squid, and a NetworkPolicy on the workload namespace
blocks direct egress, allowing traffic only to the proxy. Squid logs
everything that passes through. That’s it, and this gives us:
- Visibility: every outbound connection logged with timestamp, destination, bytes transferred
- Enforcement: NetworkPolicy makes the proxy mandatory, not optional
- Simplicity: no CNI plugins, no service mesh, no CRDs
The demo application ¶
To test this out, I’m using a small application I built: Horizons, a Common Lisp application using Datastar that displays the solar system and fetches data from NASA’s JPL Horizons API when you click on a planet. It’s a good test case because it makes real HTTPS calls to an external API – exactly the kind of traffic we want to observe. It’s a scaled-down version of DataSPICE, an app I made to test my Common Lisp SDK for Datastar and which uses NASA SPICE data for a 2D simulation of the Cassini-Huygens probe.
It uses a multi-stage build that produces a reasonably small binary, horizons-server, at 16MB – not bad for an image-based language like Common Lisp (it can go down to ~13MB with some more compression optimisations) – packaged inside a trixie-slim Debian image for a total of ~100MB (this can also be optimised, aggressively so).
Setting up Squid ¶
All files are in the Horizons repository, under k8s/ specifically.
I’ll go through the main aspects here.
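If you just want to try the whole thing end to end, applying the directory is usually enough – a sketch, assuming you’ve cloned the repo and can pull or build the horizons image (you may need to apply the namespace manifests first if ordering complains):
$ kubectl apply -f k8s/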
First, the egress-proxy namespace and Squid configuration:
apiVersion: v1
kind: Namespace
metadata:
name: egress-proxy
labels:
purpose: egress-control
---
apiVersion: v1
kind: ConfigMap
metadata:
name: squid-config
namespace: egress-proxy
data:
squid.conf: |
http_port 3128
# File-based logging for persistence and analysis
access_log /var/log/squid/access.log combined
cache_log /var/log/squid/cache.log
# No caching, that's not the focus (now, at least)
cache deny all
# Allow requests from private IP ranges (pod CIDRs)
acl localnet src 10.0.0.0/8
acl localnet src 172.16.0.0/12
acl localnet src 192.168.0.0/16
acl SSL_ports port 443
acl Safe_ports port 80
acl Safe_ports port 443
acl CONNECT method CONNECT
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localnet
http_access deny all
# Optional: restrict to specific domains (uncomment to enforce)
# acl allowed_domains .ssd.jpl.nasa.gov
# http_access deny !allowed_domains
The localnet ACLs cover all RFC 1918 private IP space – your
k3s pod CIDR will fall within one of these ranges; adjust for other setups of course. In production, you might tighten this
to your specific pod CIDR for defence in depth.
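To check which pod CIDR your cluster actually hands out (k3s defaults to 10.42.0.0/16, which is where the 10.42.x.x addresses in the logs later come from), the node spec has it:
# Print each node's name and assigned pod CIDR
$ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'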
The Squid deployment ¶
The deployment needs a few things: an init container to fix permissions on the log directory (hostPath volumes are
created as root, but Squid runs as the proxy user), and a sidecar to stream logs to stdout for kubectl
logs:
apiVersion: apps/v1
kind: Deployment
metadata:
name: squid
namespace: egress-proxy
spec:
replicas: 1
selector:
matchLabels:
app: squid
template:
metadata:
labels:
app: squid
spec:
initContainers:
- name: fix-permissions
image: busybox:latest
command: ["sh", "-c", "chown -R 13:13 /var/log/squid"]
volumeMounts:
- name: logs
mountPath: /var/log/squid
containers:
- name: squid
image: ubuntu/squid:latest
ports:
- containerPort: 3128
volumeMounts:
- name: config
mountPath: /etc/squid/squid.conf
subPath: squid.conf
- name: logs
mountPath: /var/log/squid
- name: log-streamer
image: busybox:latest
command: ["sh", "-c", "touch /var/log/squid/access.log && tail -F /var/log/squid/access.log"]
volumeMounts:
- name: logs
mountPath: /var/log/squid
volumes:
- name: config
configMap:
name: squid-config
- name: logs
hostPath:
path: /var/log/squid-egress
type: DirectoryOrCreate
The hostPath volume means logs persist on the node at /var/log/squid-egress/, which I use for offline
analysis or for feeding into external log aggregation (GoAccess, in my case). This could be done inside the cluster as well, but
for simple deployments I often do it at the host level (for both k8s and non-k8s workloads).
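Before moving on to enforcement, one piece not reproduced here is the Service that gives Squid the stable in-cluster name the workloads will point at (squid.egress-proxy.svc.cluster.local:3128). A plain ClusterIP Service named squid on port 3128 is all that’s needed; as a sketch, kubectl expose generates the equivalent:
# Sketch: expose the squid Deployment as a ClusterIP Service on port 3128,
# reachable as squid.egress-proxy.svc.cluster.local from other namespaces.
$ kubectl expose deployment squid -n egress-proxy --name squid --port 3128 --target-port 3128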
Enforcing proxy usage with NetworkPolicy ¶
This is where it stops being optional: the NetworkPolicy blocks direct egress from the workload namespace, allowing only DNS resolution and traffic to the proxy:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: enforce-egress-proxy
namespace: horizons
spec:
podSelector: {}
policyTypes:
- Egress
egress:
# DNS resolution
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
ports:
- protocol: UDP
port: 53
- protocol: TCP
port: 53
# Squid proxy only
- to:
- namespaceSelector:
matchLabels:
purpose: egress-control
ports:
- protocol: TCP
port: 3128
The purpose: egress-control label on the egress-proxy namespace is what the selector matches – cleaner than
hardcoding namespace names. With this in place, a workload that doesn’t have the proxy environment variables configured
simply cannot reach the outside world.
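A quick way to convince yourself the policy does what it says: run a throwaway pod in the workload namespace and compare a direct call with one that goes through the proxy. The pod name and curl image here are just for illustration; DNS still resolves (the policy allows it), but the direct connection should time out while the proxied one succeeds.
# Hypothetical check: direct egress should be blocked, egress via Squid should work
$ kubectl run egress-test -n horizons --rm -it --restart=Never \
    --image=curlimages/curl --command -- sh -c \
    'curl -sI -m 5 https://ssd.jpl.nasa.gov || echo "direct egress blocked";
     curl -sI -m 5 -x http://squid.egress-proxy.svc.cluster.local:3128 https://ssd.jpl.nasa.gov | head -n 1'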
Configuring the workload ¶
How workloads use the proxy is application-dependent: in this example, I’m using the Dexador Common Lisp HTTP client and setting the proxy from the environment variables (which are set in the deployment manifest):
spec:
containers:
- name: horizons
image: localhost:5000/horizons:latest
env:
- name: HTTP_PROXY
value: "http://squid.egress-proxy.svc.cluster.local:3128"
- name: HTTPS_PROXY
value: "http://squid.egress-proxy.svc.cluster.local:3128"
- name: NO_PROXY
value: "localhost,127.0.0.1,.svc,.svc.cluster.local,10.0.0.0/8"
The NO_PROXY setting is important, since without it, internal service-to-service calls would try to route through
Squid and fail.
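To double-check that the variables actually made it into the running container, something like this works (assuming the Deployment is named horizons; env runs inside the container, the grep filters locally):
$ kubectl exec -n horizons deploy/horizons -- env | grep -i proxy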
One caveat: not all HTTP clients respect these environment variables automatically. Most do (curl, wget, Python requests, Go’s net/http), but some require explicit configuration. This is the case with Drakma, and even with Dexador (which does use them) I had to set the default proxy explicitly:
(let* ((planet-plist (find planet-name *planets*
                           :key (lambda (p) (getf p :name))
                           :test #'string-equal))
       (horizons-id (when planet-plist (getf planet-plist :horizons-id)))
       (dex:*default-proxy* (uiop:getenv "HTTPS_PROXY")))
The reason here is that I’m using SBCL’s save-lisp-and-die approach to build a binary, and this captures the env
variables at compile time: I need to refresh them at runtime.
Seeing it work ¶
With everything deployed, watching the logs while clicking around the Horizons UI:
$ kubectl logs -f deploy/squid -n egress-proxy -c log-streamer
10.42.0.238 - - [27/Dec/2025:21:43:34 +0000] "CONNECT ssd.jpl.nasa.gov:443 HTTP/1.1" 200 5537 "-" "-" TCP_TUNNEL:HIER_DIRECT
10.42.0.238 - - [27/Dec/2025:21:43:45 +0000] "CONNECT ssd.jpl.nasa.gov:443 HTTP/1.1" 200 5537 "-" "-" TCP_TUNNEL:HIER_DIRECT
k9s showing the log-streamer container
Every call to the JPL Horizons API is logged. The TCP_TUNNEL:HIER_DIRECT indicates Squid is tunnelling the HTTPS
connection directly: no SSL interception, just a pass-through that logs the destination.
What you see (and don’t see) ¶
For HTTPS traffic, Squid logs the CONNECT tunnel: the destination host and port, timestamp, and bytes
transferred. You don’t see the full URL path, since that would require SSL interception (ssl-bump), which breaks
end-to-end encryption and requires deploying CA certificates to all clients. That’s a different architecture with
different trade-offs.
What you do get is still valuable: “pod X talked to api.example.com:443 at 14:32, transferred 5KB.” For compliance, debugging, and security auditing, that’s often enough, and it certainly is enough for my own purposes. It also brings you some lock-down capability “for free”. And for plain HTTP traffic, you do get the full URL.
Also worth noting: connection pooling affects log frequency. If your HTTP client keeps connections alive, you’ll see one
CONNECT entry covering multiple requests. I added :keep-alive nil to Dexador’s dex:get calls to get a log entry
every time I clicked a planet, but this will depend on your application code.
There’s also some latency introduced by the extra hop. It shouldn’t be noticeable or an issue, but that will depend on the specifics of your application.
Adding GoAccess for real-time visualisation ¶
Tailing logs is fine for debugging, but for ongoing visibility I use GoAccess, which provides a real-time web dashboard. Adding it as another sidecar:
- name: goaccess
image: allinurl/goaccess:latest
command:
- sh
- -c
- |
while [ ! -f /var/log/squid/access.log ]; do sleep 1; done
goaccess /var/log/squid/access.log \
--log-format=SQUID \
--real-time-html \
--output=/var/www/goaccess/index.html \
--port=7890
ports:
- containerPort: 7890
volumeMounts:
- name: logs
mountPath: /var/log/squid
Expose it with a NodePort service or equivalent and you have a live dashboard showing which external hosts your cluster is talking to, request rates, and traffic patterns.
I mostly use GoAccess in TUI mode, which is easy since the Squid logs are stored on the host.
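Since the access log also lives on the node at /var/log/squid-egress/, the TUI is just a matter of pointing GoAccess at it from the host, reusing the same log-format flag as the sidecar above:
# Run on the k3s node itself, against the hostPath log
$ goaccess /var/log/squid-egress/access.log --log-format=SQUID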
Limitations and where this leads ¶
This approach is intentionally minimal, and it has real limitations:
- Application changes required: workloads must set proxy environment variables.
- HTTP/HTTPS only: raw TCP, gRPC-over-HTTP/2, and other protocols need different handling.
- Single point of configuration: one squid.conf for all namespaces.
- No per-namespace self-service: teams can’t manage their own egress rules.
Linked to several of the above (mainly the centralised configuration): ACL rules that limit traffic to specific external domains are cumulative, so every namespace can reach every allowlisted domain, even if it only needs a few of them.
These limitations point toward why more sophisticated solutions exist, after all; a follow-up article will explore using
Squid’s include directive to enable per-namespace configuration, and in doing so, show why you’d eventually want
a controller or operator to manage the complexity.
There are more problems that can be solved, though: transparent interception for applications that can’t be configured, sidecar proxies for per-workload control, and eventually the full service mesh model... each step solves a real problem the previous approach couldn’t handle, and the allure of stacking “simple” things on top of each other starts to fade...
...but for many cases, especially when you just need to answer “what is my cluster talking to?” or enforce a fixed list of egress destinations, a proxy and a NetworkPolicy is enough.
It is for me, at least.