HackEurope is over. In many ways, it was a complete shitshow. But now that the caffeine overdose and sleep deprivation are over, I can say that there were actually some important lessons.
TL;DR:
- Front-end is almost everything. There is 0 burden of proof that your project is actually functional or that it has any practical application. As long as it looks cool, investors and non-technical people will eat that up.
- Choose your track wisely. Make sure that the track sponsor IS ACTUALLY AT YOUR FUCKING LOCATION. Most people were under the impression that tracks were per-country when in fact there was a single €1000 prize shared across the 3 countries and the sponsor wasn't actually operating in some.
- Choose a problem that is easy to explain. There were 2 minutes to explain. It is a losing game whether or not you explain context. Non-technical people will tune out confused regardless. We were extremely lucky with 2/3 of the evaluators actually knowing about open-source supply chain attacks and being excited about our solution.
- Follow the trends. All winners had "AI" as a significant part of their solution.
That being said, I personally wouldn't follow my own advice. I went in with the goal of building something that I would want to maintain long term. Not just AI slop (I fucking hate Lovable).
So what did we actually build?
Context
Over the past year, we've had all sorts of supply chain attacks, from the Shai-Hulud worm to Notepad++ being hacked. Developers are the most vulnerable. Most people install packages with no verification whatsoever. Meanwhile, $BIGCORPs hire expensive security teams with manual reviews that take forever. Lots of time gets wasted and the same work gets duplicated between companies.
(Common misunderstandings: No, we're not looking at CVEs or vulnerabilities. Lots of companies like Snyk or Wiz already do that. There are valid times to use insecure but non-malicious software, such as for internal tooling.)
The MVP
Our MVP is basically a secure package registry that you point npm at with npm config set and use in place of NPM. We take packages from NPM and generate a series of tests that would usually trigger malicious behaviors (if any). While those tests run, we collect a bunch of behavioral data using eBPF (file accesses, DNS, network connections, executed commands, etc.). This is a lot of data, so we increase the signal-to-noise ratio by deduplicating against a known set of safe behavior collected from another real package. From what's left, we can use either "AI" (of course we had to plug that in somewhere. It was the theme of the hackathon lol) or historical data to determine whether that behavior is malicious or at least anomalous. If everything is clear, the package gets uploaded to our "secure" registry.
There is still a lot to work on. Some of the features we're working on:
- Reproducible builds
- Derivative of behavioral changes across time to determine the "normal" amount of deviance
- Supporting PyPI, Maven, Cargo, and other ecosystems
- Automatic tracing of behavior to source (line of code, commit introduced, etc. Reverse engineering if necessary).
- Matching registry releases to exact source code commits
- Use eCapture to decrypt HTTPS
- Have honeypot data to catch exfiltration attempts
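For the "derivative of behavioral changes" item, one possible reading is sketched below in Python. None of this exists yet: the function names, the 3-sigma threshold, and the floor of 1.0 on the spread are all assumptions picked for the example. It counts how many new behaviors each release adds over the previous one, then flags a release whose jump is far outside what the package's history looks like.

```python
from statistics import mean, pstdev


def release_deltas(releases):
    """Count behaviors in each release that were absent from the previous release."""
    return [len(curr - prev) for prev, curr in zip(releases, releases[1:])]


def is_anomalous(history, latest_delta, k=3.0):
    """Flag latest_delta if it is more than k spreads above the historical mean.

    The spread is the population standard deviation, floored at 1.0 (an
    arbitrary choice) so a perfectly stable history does not flag tiny changes.
    """
    if not history:
        return False
    spread = max(pstdev(history), 1.0)
    return latest_delta > mean(history) + k * spread


# Each set holds the behaviors seen for one release, oldest first (made-up data).
releases = [
    {"a", "b"},
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b", "c", "d"},
    {"a", "b", "c", "d", "e", "f", "g", "h", "i", "j"},
]

deltas = release_deltas(releases)
history, latest = deltas[:-1], deltas[-1]
flagged = is_anomalous(history, latest)
print(deltas, flagged)
```

A real version would probably weight behaviors by type (a new DNS lookup is not the same as a new read of SSH keys), but the shape of the check would be similar.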
I've been working on this since last September at a slow burn (no code reused for HackEurope though) and the goal is to have a running startup by May.
If you have any comments, or are just interested in general, pop me an email.
AI encourages conformity and kills creativity
A solid 90% of the projects there were just vibe coded slop. Even the ideas were AI. You can tell when multiple people implemented the exact same idea with the exact same title, description, and implementation.
While people call me a luddite, I do not particularly hate AI as a tool. My problem is that it has significantly lowered the bar for certain project types and therefore incentivizes people who would have otherwise built something cool to instead fit into a mold constrained by the capabilities of AI.
A lot of cool ideas are out of distribution from the training data, and those rarely show up at hackathons anymore. The AI says they're "too hard" and people simply avoid these.
There's a lot more I want to write here but I'm getting on a flight soon. Will get back when I have the time.
I'm sure that this post will cost me some future jobs or whatever but I don't really give a fuck. Those places probably ain't worth it.