![]() |
| A fork of a private repo is private. When you make the original repo public, the fork is still a private repo, but the commits can now be accessed by hash. |
![]() |
| Not through the GitHub interface, no. But you can copy all files in a repository and create a new repository. IIRC there's a way to retain the history via this process as well. |
![]() |
| Not defending GH here (their position is indefensible imo) but, as the article notes, they document these behaviors clearly and publicly:
https://docs.github.com/en/pull-requests/collaborating-with-...
I don't think they're being underhanded exactly... they're just making a terrible decision. Quoting from the article:
> The average user views the separation of private and public repositories as a security boundary, and understandably believes that any data located in a private repository cannot be accessed by public users. Unfortunately, as we documented above, that is not always true. Whatsmore, the act of deletion implies the destruction of data. As we saw above, deleting a repository or fork does not mean your commit data is actually deleted. |
![]() |
| It's a bug bounty, not an "only if we have time to fix it" bounty.
He found a security problem, they decided not to act on it, but it was still an acknowledged security problem. |
![]() |
| I think a lot of developers and companies interpret "that's the way the code or process works" as intentional behavior, which is not always the case. |
![]() |
| Some might very well do. E.g. a company with a service for training hackers and security researchers.
In this case the question is moot, as this doesn't involve remote code execution. |
![]() |
| Shouldn't that be on the config page for the repo below the "private" button with a note saying private is not actually private if it's a fork? And ditto for delete? |
![]() |
| Don't blame the end user for doing something you don't want them to do if it is more convenient to do and works without immediate consequences. Redesign it or rethink your assumptions. |
![]() |
| Any time I hear “shouldn’t be done” I translate that to “will happen regularly”.
I do see this regularly in my work. All but one dev team I’ve worked with over the last few years has done this. |
![]() |
| As far as I can tell, they never run the garbage collector. Code I pushed to a fork that was deleted several years ago can still be accessed through the original parent repo. |
![]() |
| The expectations for AWS and public repository hosting are not the same. If you leaked something to a public GitHub repo you should assume that it has been cloned the second you pushed it. |
![]() |
| > Security disclosures are like giving someone an unsolicited gift.
Exactly.
> The receiver is obligated to return the favor.
Not at all. This is a very toxic expectation. |
![]() |
| I reported a different security issue to github, and they responded the same (although they ultimately ended up fixing it when I told them I was going to blog about the "intended behavior"). |
![]() |
| No, it really isn’t. Anyone who uses that word that way is just factually incorrect, and probably pretty irresponsible depending on the context. Software should not tell lies. |
![]() |
| We are missing the point here. The GP was claiming that delete meant something other than adding a mark to an item that you want to eventually be removed from the system. It doesn’t. |
![]() |
| > GDPR and other legislature
Nope. GDPR allows deleted data to be retained in backups so long as there is an expiration process in place. Doesn’t matter how long it is. But certainly nobody has a right to force a company to pull all of their backups from cold storage and trawl through them all any time a deletion request takes place. That’d be the quickest path to Distributed Denial of Bank Account Funds imaginable. Even the GDPR isn’t that bone-headed.
But yes, it is part of the law that the provider should tell you that your data isn’t actually being erased and that it will instead be kept around until they get around to erasing everything as part of their standard timelines. But that knowledge doesn’t do anyone much good.
> CNIL confirmed that you’ll have one month to answer to a removal request, and that you don’t need to delete a backup set in order to remove an individual from it.
https://blog.quantum.com/2018/01/26/backup-administrators-th... |
![]() |
| But GitHub is keeping this stuff indefinitely. No long expiration, no probability of eventual disk overwriting, nothing. All they're doing is shutting the front door without shutting the side door. |
![]() |
| extremely annoying, but the only truly private option on somebody else's computer.
i read headlines like the above with the implied "not just to the employees there anymore" |
![]() |
| > I'll be calling "private" repos "unlisted"
That might be a bit too strict. I'd still expect my private repos (no forks involved) to be private, unless we discover another footnote in GH's docs in a few years ¯\_(ツ)_/¯ But I'll forget about using forks except for publicly contributing to public repos.
> Users should never be expected to know these gotchas for a feature called "private".
Yes, the principle of least astonishment[0] should apply to security as well.
[0] https://en.wikipedia.org/wiki/Principle_of_least_astonishmen... |
![]() |
| > It's disappointing to see GitHub calling it a feature instead of a bug
git is "distributed" version control software after all. It means a peer can't control everything. |
![]() |
| > it is very natural that GitHub would take the "once public always public" line
I don’t think that follows at all. Purging hashes without a link to a commit/repository would be pretty natural. |
![]() |
| Hubber here (same username on github.com). We in GitHub's OSPO have been working on an open source GitHub App to address the use case where organizations want to keep a private mirror of an upstream public fork so they can review code and remove IP/secrets/keys that get committed and squash history before any of those changes are made public. Getting a beta release this week, in fact - check it out, I'm curious what yall think about the approach
https://github.com/github-community-projects/private-mirrors |
![]() |
| I was wondering how they could otherwise comply with legislation. Makes sense that there is a way to do this, e.g. in case of valid GDPR, DMCA, etc. cases. |
![]() |
| > One can safely assume
With something as nuanced as this, I wouldn't safely assume that all processes account for it, especially ones from a compliance (non-technical) department. |
![]() |
| >Can this be used to host illegal content?
It already is. Even to github org's own repos. Any time you make a PR, the /tree/ link to it stays valid forever, even if the repo author removes it. |
![]() |
| Is that a best practice only in hindsight, or because some people already knew this issue existed? Or for what other reason do you consider it a best practice? Git history? |
![]() |
| I worked in Professional Services at AWS for a little over three years. There was a fairly easy approval process to put our work out on the public AWS Samples (https://github.com/aws-samples) repository once we removed the private confidential part of the implementation.
I always started a new repository without git history. I can’t imagine trying to audit every single commit. |
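For anyone wanting to do the same, here is a minimal sketch of the "start fresh without history" step, assuming the goal is just to copy the working tree while leaving the `.git` directory behind. The function name `copy_without_history` and the directory names are made up for illustration; this is not an AWS or GitHub tool.

```python
import shutil
import tempfile
from pathlib import Path


def copy_without_history(src: Path, dst: Path) -> None:
    """Copy a working tree to dst, skipping any directory named .git."""
    # Hypothetical helper: dst must not exist yet (shutil.copytree requirement).
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(".git"))


# Demo on a throwaway directory that stands in for a private repo.
with tempfile.TemporaryDirectory() as tmp:
    private = Path(tmp) / "private-repo"
    (private / ".git").mkdir(parents=True)
    (private / ".git" / "HEAD").write_text("ref: refs/heads/main\n")
    (private / "README.md").write_text("sample\n")

    public = Path(tmp) / "public-repo"
    copy_without_history(private, public)

    # Only the working-tree files survive; the history stays behind.
    copied = sorted(p.name for p in public.iterdir())
```

After a copy like this you would run `git init` in the new directory and make a single commit. Note that this only drops old commits; anything confidential still present in the current files has to be removed by hand.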
![]() |
| According to https://docs.github.com/en/site-policy/content-removal-polic..., even an upstream dmca doesn’t suspend downstream by default (unless the copyright owner claims they believe all forks violate copyright) — so I would be surprised if downstream dmca suspended upstream.
NB: according to https://www.gtlaw.com/-/media/files/webinars/ian-ballon-may-..., page 4-470, it’s possible that failing to process a DMCA notice may only lead to losing safe harbor for the material identified in the notice, not for the entire service. So GitHub might just choose to ignore the notice for React, get sued, and win, all without losing the safe harbor. For less popular repos, I would not be surprised if you could take down any repo literally by submitting a completely bogus notice. But honestly I still don’t know how much leeway - legally - service providers have in applying their own technical/legal expertise when evaluating DMCA notices. I’d appreciate any sources (court decisions, textbooks, whitepapers, descriptions of actual industry practices, etc) on the topic. |
![]() |
| How is this more of a vulnerability than the existence of sites like archive.org is? Isn't it just a fact of the Internet that once you make something public, you can't fully take it back later? |
![]() |
| The title makes it seem more severe than it is. This only applies to GH forks of public repositories (or repositories that become public). Forks mirror the upstream repo's visibility. |
![]() |
| My understanding is that you are correct. If the repo and all of its forks stay private then the only people that would be able to view them are people who have permissions to access those repos. |
![]() |
| Truffle is practically famous for clickbait like this. They have a YouTube channel full of it. Their behavior in the security industry steered us far away from them as a vendor. |
Here is their full response from back then:
> Thanks for the submission! We have reviewed your report and validated your findings. After internally assessing the finding we have determined it is a known low risk issue. We may make this functionality more strict in the future, but don't have anything to announce now. As a result, this is not eligible for reward under the Bug Bounty program.
> GitHub stores the parent repository along with forks in a "repository network". It is a known behavior that objects from one network member are readable via other network members. Blobs and commits are stored together, while refs are stored separately for each fork. This shared storage model is what allows for pull requests between members of the same network. When a repository's visibility changes (Eg. public->private) we remove it from the network to prevent private commits/blobs from being readable via another network member.
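To make the quoted description concrete, here is a toy model of a "repository network": one shared object store, with refs kept separately per member. It is only a sketch of the behavior described in the response, not GitHub's actual storage code; the class `RepoNetwork` and all of its method names are invented for illustration, and it assumes deleting a member drops its refs but never garbage-collects shared objects.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class RepoNetwork:
    """Toy model: objects are shared, refs are per member (hypothetical)."""
    objects: dict = field(default_factory=dict)  # hash -> content, shared by all
    refs: dict = field(default_factory=dict)     # member -> {branch: hash}

    def add_member(self, name: str) -> None:
        self.refs[name] = {}

    def push(self, member: str, branch: str, content: bytes) -> str:
        digest = hashlib.sha1(content).hexdigest()
        self.objects[digest] = content       # lands in the shared store
        self.refs[member][branch] = digest   # only this member's ref moves
        return digest

    def delete_member(self, name: str) -> None:
        # Assumption: only the refs go away; shared objects are never collected.
        del self.refs[name]

    def read(self, via_member: str, digest: str) -> bytes:
        # Any remaining member can read any object if the hash is known.
        if via_member not in self.refs:
            raise KeyError(f"{via_member} is not in the network")
        return self.objects[digest]


network = RepoNetwork()
network.add_member("upstream")
network.add_member("fork")
secret = network.push("fork", "main", b"API_KEY=hunter2")
network.delete_member("fork")

# The fork is gone, but its commit is still readable through the parent.
leaked = network.read("upstream", secret)
```

In this model the fix GitHub describes for visibility changes would amount to moving a member and its objects into a separate network, which is exactly what does not happen on deletion.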