Following up with the concrete operational cost data you suggested would be important. I ran both implementations through the same workload: ingesting 1M certificates, then performing monitor-style read operations:
Write Costs (1M certificates):
- CompactLog: 12,847 storage PUTs
- Sunlight: 287,364 storage PUTs
- Result: Sunlight's writes are 22.4x more expensive

Read Costs (full tree sync, 1000 iterations):
- CompactLog: 82,025 GETs total (mostly cache hits after the first sync)
- Sunlight: 41,030,000 GETs (41,030 per sync × 1000)
- Result: Sunlight's reads are 500x more expensive
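For transparency about how these numbers were gathered: every storage operation was tallied at the HTTP layer. The sketch below shows the general shape of that counting, assuming a wrapper around the S3 client's transport; it's illustrative rather than the exact harness (the full scripts are available, as noted at the end).

```go
// Illustrative sketch of how storage operations were tallied: wrap the
// S3 client's HTTP transport and count requests by method. Names here
// are hypothetical, not the actual benchmark harness.
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// countingTransport counts PUTs and GETs before delegating to the
// underlying RoundTripper.
type countingTransport struct {
	next       http.RoundTripper
	puts, gets atomic.Int64
}

func (t *countingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	switch req.Method {
	case http.MethodPut:
		t.puts.Add(1)
	case http.MethodGet:
		t.gets.Add(1)
	}
	return t.next.RoundTrip(req)
}

func main() {
	ct := &countingTransport{next: http.DefaultTransport}
	client := &http.Client{Transport: ct}
	_ = client // hand this client to the S3/MinIO SDK via its transport option

	// ... run the ingest or sync workload here ...

	fmt.Printf("PUTs: %d, GETs: %d\n", ct.puts.Load(), ct.gets.Load())
}
```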
This exposes fundamental architectural issues with "independent read/write paths." The system has no application-level caching, so every monitor request hits storage directly. That makes it vulnerable to denial-of-funds attacks, where an attacker drives up the operator's S3 bill simply by issuing reads. It also forces an expensive CDN into the deployment, which ironically couples the very paths that are claimed to be independent. Finally, this architecture cannot achieve 0 MMD (Maximum Merge Delay), because independent paths inherently require synchronization delay between them.
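To make "application-level caching" concrete, here's a minimal read-through cache sketch. It is illustrative, not CompactLog's actual code, and fetch is a hypothetical stand-in for an S3 GET. The key property it exploits is that CT artifacts such as tiles and entry bundles are immutable, so cached objects never need invalidation.

```go
// Minimal read-through cache sketch: immutable CT objects are ideal
// cache candidates, so repeated monitor reads need not touch object
// storage at all. Illustrative only.
package cache

import "sync"

type ReadThroughCache struct {
	mu    sync.RWMutex
	data  map[string][]byte
	fetch func(key string) ([]byte, error) // hypothetical S3 GET
}

func New(fetch func(string) ([]byte, error)) *ReadThroughCache {
	return &ReadThroughCache{data: make(map[string][]byte), fetch: fetch}
}

func (c *ReadThroughCache) Get(key string) ([]byte, error) {
	c.mu.RLock()
	v, ok := c.data[key]
	c.mu.RUnlock()
	if ok {
		return v, nil // cache hit: no storage GET, no S3 charge
	}
	v, err := c.fetch(key) // cache miss: exactly one storage GET
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.data[key] = v // immutable objects never need invalidation
	c.mu.Unlock()
	return v, nil
}
```

A production version would bound memory with LRU eviction, but even this naive sketch turns the second and every subsequent monitor sync into memory hits, which is exactly why CompactLog's read counts above collapse after the first sync.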
Most critically, CompactLog's 0 MMD strengthens CT's security model. When SCTs are issued, certificates are immediately visible to monitors - no window for undetected misissuance. The "independent paths" architecture makes this impossible by design.
It's worth noting that the operators most vocally advocating for static CT appear to have infrastructure sponsorship arrangements that shield them from these costs. When storage and bandwidth are free, any difference in operational costs becomes irrelevant. But this creates a distorted view of architectural viability - what works with sponsored infrastructure doesn't translate to sustainable operations for the broader ecosystem.
More concerning is the narrative that "scale requires direct object storage serving" - a claim that these benchmarks definitively disprove. When we accept that direct S3 serving is "the only way to scale," we're essentially mandating architectural decisions that maximize cloud provider revenue.
This raises a fundamental question: why are we advocating for this model? The static CT API is objectively more complex than RFC 6962, its "pure" deployment model is economically unviable without sponsorship, and it weakens security guarantees (MMD > 0). Yet there's a push to deprecate RFC 6962 - a working, proven standard - in favor of an architecture that's worse on every measurable dimension except ideological purity.
What's particularly troubling is that the "direct storage serving" approach is essentially brute-force engineering - throwing unlimited infrastructure at a problem instead of solving it properly. Caching, request coalescing, and memory management aren't complex optimizations; they're basic engineering practices. When we champion architectures that prohibit these fundamental techniques, we're not promoting simplicity - we're mandating inefficiency.
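Request coalescing, for instance, is nearly a one-import exercise in Go. Below is a sketch using golang.org/x/sync/singleflight; loadTile is a hypothetical storage read, not an API from either implementation:

```go
// Request coalescing sketch: concurrent requests for the same object
// share a single storage fetch instead of each hitting S3.
package serve

import "golang.org/x/sync/singleflight"

var group singleflight.Group

// GetTile collapses N concurrent requests for the same tile into one
// storage GET; the other N-1 callers wait and share the result.
func GetTile(key string, loadTile func(string) ([]byte, error)) ([]byte, error) {
	v, err, _ := group.Do(key, func() (interface{}, error) {
		return loadTile(key)
	})
	if err != nil {
		return nil, err
	}
	return v.([]byte), nil
}
```

Under a thundering herd of monitors requesting the same newly published tile, this collapses N concurrent storage GETs into one - hardly the kind of complexity that justifies ruling the technique out by design.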
The insidious part is that this design has tremendous surface appeal. "CT logs served directly from S3" sounds innovative and elegant. Object storage is familiar, reliable, and scalable - who wouldn't support that? Most people hear the pitch and think "brilliant!" without digging deeper into the implications.
It's only when you run the numbers or try to operate it without sponsorship that the reality hits: it's a design that sounds great in conference talks but fails basic operational requirements.
What's exhausting is watching the goalposts constantly move. When I show performance benchmarks, suddenly performance doesn't matter. When I demonstrate cost efficiency, the topic shifts to "separation of concerns." When I point out the need for caching, we're told CDNs solve everything. When CDNs are shown to be expensive Band-Aids, the argument becomes about implementation simplicity. I've even been told that fewer lines of code is a key metric - as if code golf determines operational viability. This isn't technical discourse; it's ideological defense through ever-shifting arguments.
Speaking of simplicity - setting up Sunlight for these benchmarks was remarkably complex. Manual seed generation, key management, undocumented YAML configurations, multiple executables with unclear relationships, and manual SQLite database initialization.
Yes, SQLite - the "static" CT implementation that supposedly doesn't need databases requires manually initializing one.
The irony of requiring database setup for an architecture championed for eliminating databases wasn't lost on me. How do we even backup this database? Can I safely delete it? Is it critical for operation? Apparently yes - the server breaks without it: "checkpoint missing from database but present in object storage". So much for database-free architecture. Where's the operating manual explaining any of this?
I had to manually parse the Go entry point code just to understand the correct startup sequence. When your "simple" system requires reading source code to figure out basic operations, you've failed at simplicity.
Even after getting it running, I didn't feel confident about the soundness of what I'd deployed. Was my seed secure enough? What entropy was expected? When I tried an empty file, it at least failed - progress! But it happily accepted 32 spaces as a seed. Yes, I generated cryptographic keys using echo " " > seed, where the quoted string is 32 spaces. No warning about low entropy, no validation - just silent acceptance of a catastrophically insecure configuration. I didn't want this to work - I wanted it to fail informatively.
Actually, Sunlight requires you to provide a seed file of at least 32 bytes - but apparently any 32 bytes will do, including repeated spaces. It really does use whatever garbage you provide as the seed for key generation. A CT log with predictable keys undermines the entire Certificate Transparency ecosystem. This isn't just bad - it's "shut down everything and rotate all keys immediately" bad.
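Rejecting degenerate seeds is not hard. Here is a sketch of the minimal sanity check I would have expected; it's illustrative, not Sunlight's code, and the printable-text heuristic is deliberately crude rather than a real entropy estimator:

```go
// Sketch of minimal seed validation: enforce length and reject
// obviously degenerate inputs. Illustrative only - crude checks,
// not a real entropy estimator.
package seedcheck

import (
	"errors"
	"unicode"
)

func ValidateSeed(seed []byte) error {
	if len(seed) < 32 {
		return errors.New("seed too short: need at least 32 bytes")
	}
	// Reject a seed made of one repeated byte (e.g. 32 spaces).
	allSame := true
	for _, b := range seed[1:] {
		if b != seed[0] {
			allSame = false
			break
		}
	}
	if allSame {
		return errors.New("seed is a single repeated byte: refusing to generate keys")
	}
	// A 32-byte seed that is entirely printable ASCII was almost
	// certainly typed or echoed by a human, not drawn from a CSPRNG.
	printable := true
	for _, b := range seed {
		if b > unicode.MaxASCII || !unicode.IsPrint(rune(b)) {
			printable = false
			break
		}
	}
	if printable {
		return errors.New("seed looks like typed text, not random bytes")
	}
	return nil
}
```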
But hey, at least there's fsync, right?
The irony is breathtaking - being lectured about "robustness" and the critical importance of fsync while Sunlight silently accepts spaces as a cryptographic seed. Apparently, durably persisting compromised keys to disk is more important than ensuring those keys aren't trivially predictable. This perfectly encapsulates the misplaced priorities: obsessing over filesystem semantics while ignoring fundamental cryptographic security.
And why does it even require operators to provide a seed? Why not just generate a secure key pair automatically like every other cryptographic system built in the last decade? You know, for simplicity? Instead, we get the worst of both worlds - manual seed management with no validation.
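For comparison, secure-by-default key material is a few lines of standard library code. The sketch below uses Ed25519 purely for illustration; Sunlight's actual key type and file layout may well differ:

```go
// Sketch of secure-by-default key material: if no seed exists,
// generate one from the OS CSPRNG instead of trusting operator input.
// Ed25519 is used for illustration only.
package keys

import (
	"crypto/ed25519"
	"crypto/rand"
	"errors"
	"os"
)

// LoadOrCreateSeed returns the seed at path, creating a fresh random
// one (mode 0600) if the file does not exist.
func LoadOrCreateSeed(path string) ([]byte, error) {
	if seed, err := os.ReadFile(path); err == nil {
		if len(seed) < ed25519.SeedSize {
			return nil, errors.New("existing seed file is too short")
		}
		return seed[:ed25519.SeedSize], nil
	} else if !errors.Is(err, os.ErrNotExist) {
		return nil, err
	}
	seed := make([]byte, ed25519.SeedSize)
	if _, err := rand.Read(seed); err != nil {
		return nil, err
	}
	if err := os.WriteFile(path, seed, 0o600); err != nil {
		return nil, err
	}
	return seed, nil
}

// KeyFromSeed derives the signing key deterministically from the seed.
func KeyFromSeed(seed []byte) ed25519.PrivateKey {
	return ed25519.NewKeyFromSeed(seed)
}
```

Generating the seed from the OS CSPRNG when none exists would remove the entire class of operator error described above.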
When I attempted to run Sunlight with 50ms batching to match CompactLog's configuration, it essentially became stuck in a continuous checkpoint write loop, constantly PUT'ing new checkpoint objects to storage - at 50ms intervals, that's up to 20 checkpoint PUTs per second, or 72,000 per hour. With Sunlight's README recommending that object versioning be enabled, these constant PUTs would accumulate thousands of object versions per hour, each incurring storage costs.
More concerning, Sunlight's performance degraded severely after ingesting just a few hundred thousand certificates: batch processing times climbed to 600ms (against a local MinIO instance backed by an NVMe array), making 50ms batching physically impossible. This means Sunlight cannot safely operate at lower latencies even if operators wanted to provide better service to CAs. The 1-second batching isn't a conservative choice - it's an architectural limitation.
When I needed to run these benchmarks again on fresh infrastructure, my first thought was genuine dread: "Oh no, I didn't keep the old VM." That's not the reaction you want operators to have about your "simple" system. When redeployment feels like punishment, something has gone fundamentally wrong with your definition of simplicity.
If simplicity for operators was truly the goal, documentation and ease of deployment would be core to the project, not an afterthought. Instead, we see the opposite - a system that requires deep expertise just to start. This isn't simplicity; it's complexity with better marketing. With virtually no documentation, getting it running felt more like reverse engineering than deployment.
I suppose there's opportunity here - with enough expertise in these "simple" systems, one could build quite a consulting practice helping organizations navigate the complexity. The gap between "served from S3" marketing and operational reality certainly creates demand for specialists. But I'd rather build systems that operators can actually understand and run themselves.
I'm increasingly concerned that the CT ecosystem is being shaped by operators with nearly unlimited infrastructure budgets (whether through direct ownership or sponsorship), while the voices of smaller operators and monitors - who actually detect misissuance - are barely heard. When architectural decisions make logs unaffordable to operate independently, we're not improving transparency; we're consolidating control. This is becoming less about certificate transparency and more about cloud providers monetizing a mandatory security protocol.
The few independent operators still running logs deserve recognition for swimming against this tide. But we shouldn't design protocols that require either corporate sponsorship or six-figure monthly cloud bills to participate meaningfully in web security.
Happy to share the full benchmark methodology and scripts for reproduction.