
Original link: https://news.ycombinator.com/item?id=40626014

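Ideally, avoid passwords entirely and choose more secure alternatives: passkeys where supported, SSH keys, client certificates, or social login through a trusted service. Magic links also work, but they carry some risk, particularly when the link is copied rather than clicked, because a keylogger may see it. Second, use two-factor authentication (2FA) wherever possible: preferably Universal 2nd Factor (U2F), then time-based one-time passwords (TOTP), and SMS verification as the last resort. Anyone who is more than a casual user is advised to spend about £60 on a pair of security keys for passkey/U2F use, which significantly strengthens overall protection without a large financial outlay. Keep in mind, however, that even a secure enclave or a hardware key does not guarantee complete safety against attacks that have local access to the machine, because locally stored data and applications carry inherent risk. In addition, make sure applications are sandboxed and get software only from reputable sources. Finally, adopt practices such as using separate computers for different tasks, contributing to developers, avoiding pirated software, and thanking creators, to reduce potential threats.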



Not surprised at all, ComfyUI extensions are just arbitrary python code. The first time I tried ComfyUI extensions I put it in a podman container with GPU passthrough and blocked network access.

Yep, it's super powerful.

I would say that the "more secure way" is to just use ComfyUI without installing any obscure nodes from unknown developers. You can do pretty much anything using just the default nodes and the big node packs.

Why does there seem to be such a disregard for security in deep learning?

There's examples like this post, but also, until recently, almost every deep learning model was literally distributed as a pickle file.
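
To make the pickle point concrete, here is a minimal sketch of why loading an untrusted pickle amounts to running the file author's code. The `Payload` class and the `AllowlistUnpickler` name are made up for illustration, and the "attack" is a harmless arithmetic expression. The allowlisting approach follows the "restricting globals" pattern described in the Python `pickle` documentation; it narrows the problem rather than making untrusted pickles safe.

```python
import io
import pickle


class Payload:
    """Hypothetical stand-in for a hostile object hidden in a 'model' file."""

    def __reduce__(self):
        # pickle records "call this callable with these arguments" and the
        # loader obeys. A harmless expression is used here; an attacker
        # would choose something far less friendly.
        return (eval, ("6 * 7",))


class AllowlistUnpickler(pickle.Unpickler):
    """Refuses every global lookup that is not explicitly allowlisted."""

    # Plain containers and scalars need no global lookups, so the
    # allowlist can start out empty.
    ALLOWED = frozenset()

    def find_class(self, module, name):
        if (module, name) not in self.ALLOWED:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


def restricted_loads(data: bytes):
    """Load a pickle while rejecting references to arbitrary callables."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # the callable chosen by the file's author runs here
print(result)  # 42

try:
    restricted_loads(blob)
except pickle.UnpicklingError as exc:
    print("refused:", exc)
```

Formats that store only tensors and metadata avoid this class of problem by not encoding callables at all, which is presumably why the field has been moving away from pickle.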

From my outsider perspective, it's a field that moves very fast, there seem to be new tools being released every week so:

1) As the developer if you focus on hardening, you might be too late to release.

2) People downloading shiny new libs/files/programs constantly.

3) Influx of people not that versed in the basics of computer security playing around with local LLM models, image generators, etc.

That seems like an almost exact duplicate of the NodeJS/NPM issues?

Those same points (but the NodeJS/NPM version of them) are a lot of why that ecosystem is having security and reputation issues as well.

Isn’t this just one of the milestones that’ll eventually happen? Blind panic due to security always occurs at some point. There must be a ‘law’ defined for this somewhere.

It's not specific to deep learning; practically every industry looks at security as a cost that is just not worth it. When we start throwing the CEO into jail instead of making them pay an 18.5M fine for losing the data of 41 million customers, that's when things will change. Until then, it's just the cost of doing business.

Really? Throw a CEO in jail? This is just as crazy as the whole "throw the supervisor in jail if the worker dies" mantra in construction.

#1 users are responsible to look after their privacy. If you are using applications that don’t allow this - you need to reject the use of those applications.

#2 this needs to start happening in mass numbers. People need to rise up against these crazy corporate tech companies and their bull

I would love to live in a world where everyone did that. But that's (currently) a utopian pipe dream.

I don't know if throwing CEOs in jail is the answer, but neither is putting all the responsibility on people to make tough choices like "give up my privacy or fall out of touch with my friends" or "give up my privacy or give up the chance to get this job".

What about second firewalls ?

Hobbit jokes aside, yes, it pokes holes in the firewall on the machine hosting docker. It generally creates a lot of firewall rules to isolate or permit traffic to/from containers and expose ports.

Your "safest" bet is probably to only expose Docker containers on the localhost interface, and use a reverse proxy (Nginx/Traefik/etc.) to expose services. At least that's how I did it when I last ran Docker a few years ago.

Googled that, thanks for not providing clear references to your claims, and found that docker can crash Windows on boot, but not "brick" it. People are still able to safe boot, run system recovery/restore, or even reinstall Windows if they choose.

Besides, bricking software is impossible, bricking refers to physical devices unable to bootstrap anymore.

Not exactly. A hard brick is what you are referring to, where the hardware has to be repaired or reset by the OEM after corruption.

A soft brick is what is actually meant here, where you can recover fairly easily through software or a reinstall.

Looks like a pretty small project. Only had 40 stars on GitHub before the repo was removed.

Was this the main method of GPT4 and Claude integrations for ComfyUI?

It was an extension for ComfyUI, which has 37k stars on GitHub. The way ComfyUI is commonly used is that a person shares a "workflow" file, which utilizes various obscure extensions (called "custom nodes") and then the people who want to run the workflow on their own computer will install all these obscure custom nodes that have like 40 stars on GitHub or so.

I have not seen a statement from Nullbulge so it's not appropriate to say that they took over the repo.

The author of the repo is claiming that their repo is hacked, but this is an obvious lie, because their very first GitHub commit is the one where they push the malware. Nobody would hack an empty GitHub account.

I don't know if the author of the repo is lying when they say that Nullbulge is behind the attack (perhaps the author is part of Nullbulge, perhaps not).

I wouldn't be so sure no one would hack an idle account. I had my Spotify account taken before I even used it. I think in my case they used my account to pump up other lesser known artists.

Okay, sure. But if we have an account which has never had any legitimate activity on it ever - an account that has only ever been used to push malware - then I don't know if it matters much who is the "rightful owner" of the account. Things would be different if the GitHub account had some legitimate activity before the "hack".

There was also an actively exploited XSS vulnerability on GitHub in recent days.

Doesn't mean that this guy was not a malicious actor, only that one shouldn't be so quick to cast stones without evidence.

The person who created the custom node is the same person who "hacked" it. Whether or not the account is technically owned by some unrelated civilian is not important, because there is no other activity on the account.

Must be script kiddies. You have the opportunity to deploy anything to a machine that almost certainly has a powerful GPU, and choose a key logger that exists in signature databases? Genius.

Telegram and Discord webhooks are 100% signs of an unsophisticated attacker, and they are a very common sight in malware samples. GitHub is full of skiddie "info stealer" projects that use the Telegram API or a Discord webhook to deliver the stolen data. They make no sense to use, since anybody can spam that webhook endpoint. Not 100% sure about Discord, but at least in the case of Telegram anybody can even read and download all the data that has been sent to it.

Something is fishy here.

According to the original report, the “key logger” was in the custom wheels in the requirements.txt, but looking at that repository there have been only two commits, which according to Reddit both had malicious code in them.

Of course, proper discovery would be easier if the GitHub account still existed.
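
As a rough illustration of what "custom wheels in the requirements.txt" looks like from a reviewer's side, the sketch below flags requirement lines that bypass the normal package index: direct URLs, local archive files, and alternate index options. The function name, the patterns, and the sample file are all hypothetical and say nothing about what the removed repository actually contained. A hit is a prompt to look closer, not proof of malice, and a clean result proves nothing.

```python
import re

# Heuristic patterns for requirement lines that do not resolve through
# the default package index. These are illustrative, not exhaustive.
INDEX_OPTION = re.compile(r"^(--index-url|--extra-index-url|--find-links|-i|-f)\b")
DIRECT_URL = re.compile(r"(https?|file|git\+[a-z]+)://", re.IGNORECASE)
LOCAL_ARCHIVE = re.compile(r"\.(whl|tar\.gz|zip)\s*(;.*)?$", re.IGNORECASE)


def flag_suspicious_requirements(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, reason, line) for entries worth a manual look."""
    findings = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split(" #", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("#"):
            continue
        if INDEX_OPTION.match(line):
            findings.append((lineno, "alternate index or link source", line))
        elif DIRECT_URL.search(line):
            findings.append((lineno, "direct URL install", line))
        elif LOCAL_ARCHIVE.search(line):
            findings.append((lineno, "local archive file", line))
    return findings


# Hypothetical requirements file; package names and hosts are placeholders.
sample = """\
# pinned from the index
requests==2.31.0
somepkg @ https://example.invalid/wheels/somepkg-1.0-py3-none-any.whl
./wheels/otherpkg-2.0-py3-none-any.whl
--extra-index-url https://example.invalid/simple
numpy>=1.26  # fine
"""

findings = flag_suspicious_requirements(sample)
for lineno, reason, line in findings:
    print(f"line {lineno}: {reason}: {line}")
```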

That discussion on Reddit really is something else: so much misinformation and pretend knowledge at work. It's as scary as the malware.

I'm afraid a few simple tweaks, especially if the hackers themselves have access to the code LLM to try out their code, will be sufficient to evade detection.

What can be done to stop all this? We need some sort of OS-level layer to validate these things. If we put in a local LLM that checks the bytecode of things being installed or run for security, will that solve all this? My heart goes out to those who must have lost their money due to this.

One basic measure (one part of a solution) would be to split Comfy into two parts: the part that does all the work (running plugins, generating images) should have access to nothing but read-only access to the files it needs, the GPU, and a socket to communicate with the other part.

Well, for one, the keylogger is detected by antivirus programs.

I keep coming across various projects whose executables trigger antivirus programs, and I think that when those triggers happen, "it's fine, don't worry" claims need to be treated with more skepticism.

At the same time, antivirus vendors need to stop being so lazy and using strings and such that are clearly part of an open source program/library for their signatures.

If you compile a benign binary yourself which has no malware, Chrome and Windows Defender will flag it as suspicious.

I was hacking on some open source stuff targeting win32, I posted some binaries on GitHub releases, I try to share with others... People tell me it's flagged as malware. It isn't malware. What do I tell them?

I hear code signing helps the heuristics to not get it flagged, but doesn't remove it.

If people working on said software want the warnings to be taken seriously, they should work on reducing false positives.

"keylogger" may not be the right term here? I'm not familiar with how that term is broadly used for, but my definition of that term is a tool that logs your keypresses. Here, it seems like it was scraping your chrome/firefox data for login cookies?

Honestly, there's quite a lot of malware that goes after those files. I wonder if there's a way to require high privilege for accessing Chrome/Firefox appdata, or to just block it entirely from other apps.

Yeah, you're right, people misuse the term keylogger frequently. This kind of malware is broadly called a "stealer" and usually does not involve keylogging.

Actual keyloggers tend to be rare nowadays due to them being easier to detect and the fact that in general the browser data is a more valuable target.

Ideally, don't use passwords: Passkeys where supported, SSH Keys, client certificates, social login via a service that does support one of these methods.

Magic link emails can also work, but are potentially vulnerable if you copy/pasted it rather than clicking depending on the keylogger's capability and clipboard visibility, although the window for attack is small, it's a much more sophisticated attack that leaves more traces (good sites will reject reuse).

Second best, also use a second factor: U2F ideally, TOTP with the same caveats as magic link emails, and at the bottom of the barrel SMS which is better than nothing but known to be very flawed.

Honestly, if you are anything other than a casual user, and don't have devices with support baked in already, it's crazy not to spend ~£60 on a pair of security keys for passkey/U2F. It's not a lot of money and is just so much more secure.
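
For readers wondering why TOTP shares the replay caveat while U2F does not, a minimal sketch of the TOTP computation may help. It follows RFC 4226 (HOTP) and RFC 6238 (TOTP) with HMAC-SHA-1; the function names and defaults are illustrative choices, not any particular library's API. The code depends only on a shared secret and the current time step, so anything that captures it can replay it until the step expires.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA-1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time-step counter."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


if __name__ == "__main__":
    # Test secret from the RFCs; real secrets come from the enrolment QR code.
    print(totp(b"12345678901234567890"))
```

With the RFC 6238 test secret `12345678901234567890`, `at=59` and `digits=8`, this yields `94287082`, matching the published SHA-1 test vector.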

> Ideally, don't use passwords: Passkeys where supported, SSH Keys, client certificates, social login via a service that does support one of these methods.

If a process has the privileges to run as a keylogger, it can also grab your local SSH private keys and possibly harvest passwords and passkeys from your local password manager vault [1]. The process has local access and, since it is a keylogger, presumably your master password as well. (The complexity depends a bit on the password manager, e.g. IIRC the macOS keychain always requires a round trip through the secure enclave.)

> Honestly, if you are anything other than a casual user, and don't have devices with support baked in already, it's crazy not to spend ~£60 on a pair of security keys for passkey/U2F. It's not a lot of money and is just so much more secure.

100% this. A secure enclave or a hardware key is the only way to keep your key material safe.

Also, app sandboxing should be the default. macOS App Store Apps are sandboxed. Unfortunately, these days the standard is still for applications to have unfettered access to a user's files.

[1] Passkeys can also be on a security key, but e.g. Yubikeys only have a small number of resident key slots and I think passkeys to most people means key material synced through iCloud/1Password/your favorite cloud.

When I talk Passkeys, I definitely mean hardware by default, which is how most websites position it: it's normally described as "set up a passkey for this device" and in practice the vast majority of people using them will be using a fingerprint reader in a laptop or on their phone, because most people don't set up password managers with passkeys.

To me, using software for passkeys is a hack only power users will do, and yes, I see it as a bad idea.

Right now I believe Yubikeys can do 25 passkeys, which is a pretty low limit, but it offers enough to protect your most important accounts, and right now I doubt many people have more than 25 sites they use that support passkeys (of course, hopefully that goes up quickly).

Aside from not using passwords or using 2FA, sandboxing helps.

A VM with GPU passthrough set up would be one example (although this is usually a pain to set up and I expect most people aren't doing it).

As a more user-friendly example, if you install an iOS app (local-model LLM and image generation apps exist), the sandboxing provided by the OS ought to be more than enough to prevent keyloggers, short of 0day exploits.

Are you giving it access to /dev/dri, or doing some fancier sandboxing?

(Would you even need anything fancier? I think /dev/dri is supposed to isolate users.)

I mean, anything with root access can very easily use libevdev to get all keystrokes as well as mouse positions. (It's maybe 10 lines of code to do that).

So, don't run stuff as root. If it needs root access, run it in a virtual machine (personally I use qubes os for this).

Most people should only download software from people they trust (to not be evil and also to be competent).

If you download code off some unknown person's GitHub repo, you'd be stupid not to read it very very carefully!

Not really, and it takes a few minutes because most of these packages (including npm) are small. You don’t have to read the WireGuard codebase because it’s reputable enough, but for obscure or unknown add-ons/package code, it’s on you to double-check, just like reading the ‘readme’.

This is why I refuse to use almost anything on npm. If you have a zero dependency project I'll consider it. If you have a dependency that also has a set of dependencies then I will never use your code.

I haven’t looked at the source code of a single npm package I’ve installed in the past 5 years.

“It takes a few minutes”

Dude my web dev projects have like 1,000s of dependencies. I’m not going to check the source code of every package tailwind requires.
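
The gap between "a few minutes" and "1,000s of dependencies" comes from transitive dependencies: each direct dependency pulls in its own. A small sketch of how that count is computed, using a made-up dependency graph (the package names and the function name are placeholders, not real npm data):

```python
from collections import deque


def transitive_dependencies(graph: dict[str, list[str]], root: str) -> set[str]:
    """Return every package reachable from root, excluding root itself."""
    seen: set[str] = set()
    queue = deque(graph.get(root, []))
    while queue:
        pkg = queue.popleft()
        if pkg in seen or pkg == root:
            continue  # already counted, or a cycle leading back to the root
        seen.add(pkg)
        queue.extend(graph.get(pkg, []))
    return seen


# Hypothetical graph: package -> packages it depends on directly.
graph = {
    "app": ["a", "b"],
    "a": ["c", "d"],
    "b": ["d", "e"],
    "d": ["f"],
    "f": ["a"],  # cycles do occur in real graphs
}

direct = graph["app"]
everything = transitive_dependencies(graph, "app")
print(f"{len(direct)} direct, {len(everything)} total to audit")
```

Even in this toy graph the audit surface is three times the direct dependency list, and every one of those packages runs with the same privileges as the application.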

Even if you did review it, a motivated attacker is not going to have an exfiltrate_user_data(). The xz backdoor exploit was incredibly sophisticated, and one key of the design was sneaking a "." into a single line of a build test script.

A cursory audit of primary dependencies has almost zero chance of catching anything but a brazen exploit.

Yeah. Realistically I think the best course of action is just assume you’re already using a library that can exfiltrate data.

This requires allowlisting egress traffic and possibly even architecting things to prevent any one library from seeing too many things. This approach can be a big pain though and could be difficult to implement practically.
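
A minimal sketch of the allowlisting decision itself, assuming an exact-match list of HTTPS hosts. The host list and the function name are hypothetical. In practice the check has to be enforced outside the process (firewall, proxy, or network namespace), because a malicious library running in the same process can simply skip a check like this one.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: only the hosts this application is known to need.
ALLOWED_HOSTS = frozenset({"pypi.org", "files.pythonhosted.org"})


def egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower().rstrip(".")
    # Exact match on the parsed hostname, so lookalikes such as
    # "pypi.org.evil.example" or a host hidden in the query string fail.
    return parts.scheme == "https" and host in allowed_hosts


for candidate in (
    "https://pypi.org/simple/",
    "https://pypi.org.evil.example/simple/",
    "https://evil.example/?next=pypi.org",
    "http://pypi.org/simple/",
):
    print(candidate, egress_allowed(candidate))
```

Default-deny is the point of the design: anything not listed is refused, so a newly added exfiltration endpoint fails without anyone having to know about it in advance.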

IMO this makes no sense. There's zero chance you will start inspecting all dependencies even in a relatively small application, which nowadays could already pull in a large number of deps.

I don't see how doing any of this manually will help.

Everyone runs code they have not inspected. For example, almost no one has read all of the code in FreeBSD, Linux (kernel), macOS, OpenBSD, or Windows. I also doubt people are reading all of the code in their favorite Linux distribution.

Even inspecting the code is not enough because a lot of security vulnerabilities are not obvious. Basically, security is hard, and often there are not a lot of good solutions.

Here are some tricks I have found which have helped me minimize my risk:

1) Use different machines for different purposes. Basically, you should not use 1 PC (or Mac) for everything. I have one for my finances, one for gaming, and a general-purpose PC. If one gets hacked, the others are still fine.

2) Get software from trustworthy sources. Most of the major software companies are not going to ship malicious code. For open-source software, use software from popular projects which have a good reputation.

3) Ask yourself why is someone providing this software? Is it for money? Are they creating it because they enjoy it? How do they support themselves? For example, Google's business model is building a dossier on people so it can deliver ads they are more likely to click on. When Google gives you something for "free", they will probably use it to track you, or track visitors to your website.

4) Support the people who build the software you use. If it's commercial software, pay for it, do not pirate it. If it's open source, donate time or money to the projects you use. Also, thank the people who work on the software, and ALWAYS treat them with respect.

5) Avoid pirated software, software from "free" porn web sites, etc. People who provide illegal software, or sketchy software are probably willing to put back doors in it.

> For open-source software, use software from popular projects which have a good reputation.

On this topic, how much should a person trust central repositories of well-known operating system distributions (e.g. Arch, Debian)? I know only trusted people can upload to them, and the only time I've ever heard of malware slipping past them was XZ, but I don't know how much care they take.

Unfortunately, no, because the existence of LLMs that can automatically determine code that is suspicious will be offset by the existence of LLMs that can generate malicious code that bypasses the detection abilities of the aforementioned LLMs.

Perhaps we could just call these ALLMs (Adversarial Large Language Models). You’re already dropping the N in GAN, I see no need for the G.

As an end result I think someone clever could make a LLaMA pun for the name of a LLaMA based ALLM.
