Tell me if you’ve heard this one before.
You’re working on an application. Let’s call it “FooApp”. FooApp has a dependency on an open source library, let’s call it “LibBar”. You find a bug in LibBar that affects FooApp.
To envisage the best possible version of this scenario, let’s say you actively like LibBar, both technically and socially. You’ve contributed to it in the past. But this bug is causing production issues in FooApp today, and LibBar’s release schedule is quarterly. FooApp is your job; LibBar is (at best) your hobby. Blocking on the full upstream contribution cycle and waiting for a release is an absolute non-starter.
What do you do?
There are a few common reactions to this type of scenario, all of which are bad options.
I will enumerate them specifically here, because I suspect that some of them may resonate with many readers:
1. Find an alternative to LibBar, and switch to it.
   This is a bad idea because replacing a core infrastructure component could be extremely expensive.
2. Vendor LibBar into your codebase and fix your vendored version.
   This is a bad idea because carrying this one fix now requires you to maintain all the tooling associated with a monorepo: you have to be able to start pulling in new versions from LibBar regularly, reconcile your changes even though you now have a separate version history on your imported version, and so on.
3. Monkey-patch LibBar to include your fix.
   This is a bad idea because you are now extremely tightly coupled to a specific version of LibBar. By modifying LibBar internally like this, you’re inherently violating its compatibility contract, in a way which is going to be extremely difficult to test. You can test this change, of course, but as LibBar changes, you will need to replicate any relevant portions of its test suite (which may be its entire test suite) in FooApp. Lots of potential duplication of effort there.
4. Implement a workaround in your own code, rather than fixing it.
   This is a bad idea because you are distorting the responsibility for correct behavior. LibBar is supposed to do LibBar’s job, and unless you have a full wrapper for it in your own codebase, other engineers (including “yourself, personally”) might later forget to go through the alternate, workaround codepath, and invoke the buggy LibBar behavior again in some new place.
5. Implement the fix upstream in LibBar anyway, because that’s the Right Thing To Do, and burn credibility with management while you anxiously wait for a release with the bug in production.
   This is a bad idea because you are betraying your users, by allowing the buggy behavior to persist, for the workflow convenience of your dependency providers. Your users are probably giving you money, and trusting you with their data. This means you have both ethical and economic obligations to consider their interests.
   As much as it’s nice to participate in the open source community and take on an appropriate level of burden to maintain the commons, this cannot sustainably be at the explicit expense of the population you serve directly.
   Even if we only care about the open source maintainers here, there’s still a problem: as you are likely to come under immediate pressure to ship your changes, you will inevitably relay at least a bit of that stress to the maintainers. Even if you try to be exceedingly polite, the maintainers will know that you are coming under fire for not having shipped the fix yet, and are likely to feel an even greater burden of obligation to ship your code fast.
   Much as it’s good to contribute the fix, it’s not great to put this on the maintainers.
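To make the coupling in the monkey-patching option concrete, here is a minimal sketch of what a careful monkey-patch tends to look like. Everything in it is a hypothetical stand-in: a `types.SimpleNamespace` plays the role of LibBar so the example runs on its own, and the `parse` function, its bug, and the version number are invented. The only point is that the patch has to refuse to apply to any version it was not written against.

```python
import types

# Hypothetical stand-in for the real LibBar module, so this sketch runs alone.
libbar = types.SimpleNamespace(
    __version__="1.4.2",
    parse=lambda text: text.split(","),  # invented bug: whitespace is not stripped
)

# The only LibBar version this patch was written and tested against.
PATCHED_VERSION = "1.4.2"


def apply_patch(module):
    """Replace module.parse with a fixed version, but only on the known version."""
    if module.__version__ != PATCHED_VERSION:
        raise RuntimeError(
            f"monkey-patch targets LibBar {PATCHED_VERSION}, "
            f"found {module.__version__}; re-verify the patch"
        )
    original = module.parse

    def fixed_parse(text):
        # Delegate to the original, then correct the buggy result.
        return [item.strip() for item in original(text)]

    module.parse = fixed_parse


apply_patch(libbar)
print(libbar.parse("a, b ,c"))
```

Every LibBar upgrade now trips the guard and forces someone to re-verify the patch by hand, which is exactly the tight coupling described in that option.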
The respective incentive structures of software development — specifically, of corporate application development and open source infrastructure development — make options 1-4 very common.
On the corporate / application side, these issues are:
- It’s difficult for corporate developers to get clearance to spend even small amounts of their work hours on upstream open source projects, but clearance to spend time on the project they actually work on is implicit. If it takes 3 hours of wrangling with Legal and 3 hours of implementation work to fix the issue in LibBar, but 0 hours of wrangling with Legal and 40 hours of implementation work in FooApp, a FooApp developer will often perceive it as “easier” to fix the issue downstream.
- It’s difficult for corporate developers to get clearance from management to spend even small amounts of money sponsoring upstream reviewers, so even if they can find the time to contribute the fix, chances are high that it will remain stuck in review unless they are personally well-integrated members of the LibBar development team already.
- Even assuming there’s zero pressure whatsoever to avoid open sourcing the upstream changes, there’s still the fact inherent to any development team that FooApp’s developers will be more familiar with FooApp’s codebase and development processes than they are with LibBar’s. It’s just easier to work there, even if all other things are equal.
- Systems for tracking risk from open source dependencies often lack visibility into vendoring, particularly if you’re doing a hybrid approach and only vendoring a few things to address work in progress, rather than a comprehensive and disciplined approach to a monorepo. If you fully absorb a vendored dependency and then modify it, Dependabot isn’t going to tell you that a new version is available any more, because it won’t be present in your dependency list. Organizationally, this is bad, of course, but from the perspective of an individual developer it manifests mostly as fewer annoying emails.
But there are problems on the open source side as well. Those problems are all derived from one big issue: because we’re often working with relatively small sums of money, it’s hard for upstream open source developers to consume either money or patches from application developers. It’s nice to say that you should contribute money to your dependencies, and you absolutely should, but the cost-benefit function is discontinuous. Before a project reaches the fiscal threshold where it can be at least one person’s full-time job to worry about this stuff, there’s often no one responsible in the first place. Developers will therefore gravitate to the issues that are either fun, or relevant to their own job.
These mutually-reinforcing incentive structures are a big reason that users of open source infrastructure, even teams who work at corporate users with zillions of dollars, don’t reliably contribute back.
The Answer We Want
All those options are bad. If we had a good option, what would it look like?
It is both practically necessary and morally required for you to have a way to temporarily rely on a modified version of an open source dependency, without permanently diverging.
Below, I will describe a desirable abstract workflow for achieving this goal.
Step 0: Report the Problem
Before you get started with any of these other steps, write up a clear description of the problem and report it to the project as an issue, rather than opening a pull request straight away. Describe the problem before submitting a solution.
You may not be able to wait for a volunteer-run open source project to respond to your request, but you should at least tell the project what you’re planning on doing.
If you don’t hear back from them at all, you will have at least made sure to comprehensively describe your issue and strategy beforehand, which will provide some clarity and focus to your changes.
If you do hear back from them, in the worst case scenario, you may discover that a hard fork will be necessary because they don’t consider your issue valid, but even that information will save you time, if you know it before you get started. In the best case, you may get a reply from the project telling you that you’ve misunderstood its functionality and that there is already a configuration parameter or usage pattern that will resolve your problems with no new code. But in all cases, you will benefit from early coordination on what needs fixing before you get to how to fix it.
Step 1: Source Code and CI Setup
Fork the source code for your upstream dependency to a writable location where it can live at least for the duration of this one bug-fix, and possibly for the duration of your application’s use of the dependency. After all, you might want to fix more than one bug in LibBar.
You want to have a place where you can put your edits, that will be version controlled and code reviewed according to your normal development process. This probably means you’ll need to have your own main branch that diverges from your upstream’s main branch.
Remember: you’re going to need to deploy this to your production, so testing gates that your upstream only applies to final releases of LibBar will need to be applied to every commit here.
Depending on LibBar’s own development process, this may result in slightly unusual configurations where, for example, your fixes are written against the last LibBar release tag, rather than its current main; if the project has a branch-freshness requirement, you might need two branches, one for your upstream PR (based on main) and one for your own use (based on the release branch with your changes).
Ideally for projects with really good CI and a strong “keep main release-ready at all times” policy, you can deploy straight from a development branch, but it’s good to take a moment to consider this before you get started. It’s usually easier to rebase changes from an older HEAD onto a newer one than it is to go backwards.
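As a rough sketch of the two-branch layout described above, the following builds (but does not run) the git commands involved. The URLs, the tag `v1.4.2`, and the branch names `internal/main` and `upstream-pr` are all placeholders rather than anything LibBar or your organization prescribes, and it assumes upstream’s default branch is called `main`.

```python
import shlex


def fork_setup_commands(fork_url, upstream_url, release_tag,
                        checkout_dir="libbar"):
    """Return git invocations that set up an internal fork with two branches."""
    git = ["git", "-C", checkout_dir]
    return [
        # Clone your writable fork, then track the public repo as "upstream".
        ["git", "clone", fork_url, checkout_dir],
        git + ["remote", "add", "upstream", upstream_url],
        git + ["fetch", "--tags", "upstream"],
        # Branch you deploy from: based on the last release tag.
        git + ["switch", "-c", "internal/main", release_tag],
        # Branch for the upstream pull request: based on upstream's main.
        git + ["switch", "-c", "upstream-pr", "upstream/main"],
    ]


# Print the commands for review instead of executing them.
for cmd in fork_setup_commands(
    "https://git.example.com/foo/libbar.git",   # placeholder internal fork
    "https://github.com/example/libbar.git",    # placeholder public repo
    "v1.4.2",                                   # placeholder release tag
):
    print(shlex.join(cmd))
```

To actually execute them, each list can be passed to `subprocess.run(cmd, check=True)` in order.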
Speaking of CI, you will want to have your own CI system. The fact that GitHub Actions has become a de facto lingua franca of continuous integration means that this step may be quite simple, and your forked repo can just run its own instance.
Optional Bonus Step 1a: Artifact Management
If you have an in-house artifact repository, you should set that up for your dependency too, and upload your own build artifacts to it. You can often treat your modified dependency as an extension of your own source tree and install from a GitHub URL, but if you’ve already gone to the trouble of having an in-house package repository, you can pretend you’ve taken over maintenance of the upstream package temporarily (which you kind of have) and leverage those workflows for caching and build-time savings as you would with any other internal repo.
Step 2: Do The Fix
Now that you’ve got somewhere to edit LibBar’s code, you will want to actually fix the bug.
Step 2a: Local Filesystem Setup
Before you have a production version on your own deployed branch, you’ll want to test locally, which means having both repositories in a single integrated development environment.
At this point, you will want to have a local filesystem reference to your LibBar dependency, so that you can make real-time edits, without going through a slow cycle of pushing to a branch in your LibBar fork, pushing to a FooApp branch, and waiting for all of CI to run on both.
This is useful in both directions: as you prepare the FooApp branch that makes any necessary updates on that end, you’ll want to make sure that FooApp can exercise the LibBar fix in any integration tests. As you work on the LibBar fix itself, you’ll also want to be able to use FooApp to exercise the code and see if you’ve missed anything. You wouldn’t get this in CI, since LibBar can’t depend on FooApp itself.
In short, you want to be able to treat both projects as an integrated development environment, with support from your usual testing and debugging tools, just as much as you want your deployment output to be an integrated artifact.
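One easy mistake in this kind of setup is believing you are testing your local LibBar checkout while FooApp is quietly still importing the released copy. Here is a minimal sanity check, assuming a Python project (an assumption on my part; the workflow itself is language-agnostic). The helper name `imported_from` is made up for this sketch.

```python
import importlib.util
from pathlib import Path


def imported_from(module_name, checkout_dir):
    """Return True if module_name would be imported from inside checkout_dir."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.has_location:
        # Not installed, or a built-in/namespace module with no file on disk.
        return False
    origin = Path(spec.origin).resolve()
    # Path.is_relative_to requires Python 3.9 or newer.
    return origin.is_relative_to(Path(checkout_dir).resolve())
```

In a FooApp test session you might assert `imported_from("libbar", "../libbar")`, with both the module name and the path adjusted to wherever your checkout actually lives.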
Step 2b: Branch Setup for PR
However, for continuous integration to work, you will also need to have a remote resource reference of some kind from FooApp’s branch to LibBar. You will need 2 pull requests: the first to land your LibBar changes to your internal LibBar fork and make sure it’s passing its own tests, and then a second PR to switch your LibBar dependency from the public repository to your internal fork.
At this step it is very important to ensure that there is an issue filed on your own internal backlog to drop your LibBar fork. You do not want to lose track of this work; it is technical debt that must be addressed.
Until it’s addressed, automated tools like Dependabot will not be able to apply security updates to LibBar for you; you’re going to need to manually integrate every upstream change. This type of work is itself very easy to drop or lose track of, so you might just end up stuck on a vulnerable version.
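For a Python project, that second PR might amount to swapping a version specifier for a direct reference to your fork, pinned to an exact commit. The sketch below builds such a requirements line and refuses to do so without a tracking issue. The repository URL, the commit hash, and the `FOO-1234` ticket ID are invented placeholders, and the trailing-comment convention is just one possible way to keep the cleanup task visible next to the pin.

```python
import re

# Assumes SHA-1 object names, which are 40 hex characters long.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def fork_requirement(name, repo_url, commit, tracking_issue):
    """Build a requirements.txt line pinning `name` to a commit of a git fork."""
    if not FULL_SHA.fullmatch(commit):
        raise ValueError("pin to a full 40-character commit hash, not a branch")
    if not tracking_issue:
        raise ValueError("file a 'drop the fork' issue before switching")
    # PEP 508 direct reference, using pip's git+https VCS URL form.
    return (
        f"{name} @ git+{repo_url}@{commit}"
        f"  # temporary fork, see {tracking_issue}"
    )


print(
    fork_requirement(
        "libbar",
        "https://git.example.com/foo/libbar.git",     # placeholder fork URL
        "0123456789abcdef0123456789abcdef01234567",   # placeholder commit
        "FOO-1234",                                   # placeholder ticket
    )
)
```

Pinning to a full hash rather than a branch name means a deploy never picks up fork changes that have not been through your own review.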
Step 3: Deploy Internally
Now that you’re confident that the fix will work, and that your temporarily-internally-maintained version of LibBar isn’t going to break anything on your site, it’s time to deploy.
Some deployment heritage should help to provide some evidence that your fix is ready to land in LibBar, but at the next step, please remember that your production environment isn’t necessarily emblematic of that of all LibBar users.
Step 4: Propose Externally
You’ve got the fix, you’ve tested the fix, you’ve got the fix in your own production, you’ve told upstream you want to send them some changes. Now, it’s time to make the pull request.
You’re likely going to get some feedback on the PR, even if you think it’s already ready to go; as I said, despite having been proven in your production environment, you may get feedback about additional concerns from other users that you’ll need to address before LibBar’s maintainers can land it.
As you process the feedback, make sure that each new iteration of your branch gets re-deployed to your own production. It would be a huge bummer to go through all this trouble, and then end up unable to deploy the next publicly released version of LibBar within FooApp because you forgot to test that your responses to feedback still worked on your own environment.
Step 4a: Hurry Up And Wait
If you’re lucky, upstream will land your changes to LibBar. But, there’s still no release version available. Here, you’ll have to stay in a holding pattern until upstream can finalize the release on their end.
Depending on some particulars, it might make sense at this point to archive your internal LibBar repository and move your pinned release version to a git hash of the LibBar version where your fix landed, in their repository.
Before you do this, check in with the LibBar core team and make sure that they understand that’s what you’re doing and they don’t have any wacky workflows which may involve rebasing or eliding that commit as part of their release process.
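In addition to asking, you can check mechanically whether the commit you pinned is still part of upstream’s history once they tag something. `git merge-base --is-ancestor` exits with status 0 when the first commit is an ancestor of the second and 1 when it is not. The function names below are illustrative, and the sketch assumes a local clone with upstream’s tags already fetched.

```python
import subprocess


def is_ancestor_command(repo_dir, commit, release_ref):
    """Build the git invocation that tests whether commit is in release_ref."""
    return ["git", "-C", repo_dir, "merge-base", "--is-ancestor",
            commit, release_ref]


def interpret_exit_status(returncode, stderr=""):
    """Map git's exit status to a boolean; anything but 0 or 1 is an error."""
    if returncode == 0:
        return True
    if returncode == 1:
        return False
    raise RuntimeError(f"git failed ({returncode}): {stderr.strip()}")


def commit_in_release(repo_dir, commit, release_ref):
    """Return True if `commit` is reachable from `release_ref` in repo_dir."""
    result = subprocess.run(
        is_ancestor_command(repo_dir, commit, release_ref),
        capture_output=True,
        text=True,
    )
    return interpret_exit_status(result.returncode, result.stderr)
```

If this returns `False` for the eventual release tag, the commit was rebased or dropped on the way to the release, and the pin needs to move before you unwind your fork.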
Step 5: Unwind Everything
Finally, you eventually want to stop carrying any patches and move back to an official released version that integrates your fix.
You want to do this because this is what the upstream will expect when you are reporting bugs. Part of the benefit of using open source is benefiting from the collective work to do bug-fixes and such, so you don’t want to be stuck off on a pinned git hash that the developers do not support for anyone else.
As I said in step 2b, make sure to maintain a tracking task for doing this work, because leaving this sort of relatively easy-to-clean-up technical debt lying around is something that can potentially create a lot of aggravation for no particular benefit. Make sure to put your internal LibBar repository into an appropriate state at this point as well.
Up Next
This is part 1 of a 2-part series. In part 2, I will explore in depth how to execute this workflow specifically for Python packages, using some popular tools. I’ll discuss my own workflow, standards like PEP 517 and pyproject.toml, and of course, by the popular demand that I just know will come, uv.
Acknowledgments
Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!