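Original link: https://news.ycombinator.com/item?id=40492264

Title: Archiving digital content: prioritizing primary sources over website archives

This thread discusses WWWOFFLE, a project started in 1996 for archiving websites. Despite advances in technology, downloading digital content such as scientific journals and books via torrent may be more valuable, since these are among the materials used to train AI models. Commenters worry that primary sources could become inaccessible and that research could come to rely on AI alone. A wget command for mirroring websites is shared; it is tuned for speed and handles a variety of formats and settings. Users describe their experience mirroring websites, including IP bans and storage methods, and recommend packing mirrored files into zip archives as an efficient backup solution.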

This is why I’ve gotten into the habit of maintaining my own WWW archive of sites I find interesting. Probably have around 1 TiB now, and One Of These Days I’d like to set my network up so it can serve arbitrary sites directly from local archive to revive any site I want.

I have a `wget-mirror` shell function invoking wget with all the trimmings that takes care of 99% of sites. I’ll edit the full command into this comment when I get home if anybody else wants to start doing the same :)

Assume everyone is familiar with this project, dating back to 1996:

https://en.wikipedia.org/wiki/WWWOFFLE

https://ftp.netbsd.org/pub/pkgsrc/distfiles/wwwoffle-2.9j.tg...

The way the www is going, it seems like downloading a copy of libgen, i.e., nonfiction books, and scimag, i.e., academic journals, via torrent, would be more valuable than archiving websites, in general. These primary sources are part of the material used to train so-called "AI" anyway. The problem is that this so-called "AI" also includes all the garbage from the www.

Worst case is eventually these books and journals will again become publicly inaccessible but "AI" will be offered as a bogus substitute; a future where few people will do research using primary materials anymore, they will just submit questions to a remote "AI" server. Truth will be decimated.

Same. I've been scraping PDF'ed magazines, etc. and keeping them locally. In addition to feeding my byte-hoarding tendencies, I like the idea I could be off-grid in my van/RV somewhere and reading a "Popular Electronics" magazine from 1972 on my laptop.

(Oh, never mind YouTube videos that I once added to playlists ... that later disappear leaving only holes in my playlists.)

My problem with this approach is that the stuff I want to look at in 10 yrs time is never the stuff I think of saving right now. In the 2000s there were browser extensions I've forgotten the names of (shelf? slogger?) that would automatically save local copies of every webpage on page load. But I don't think they're around anymore and have no idea how you could achieve similar functionality with dynamic pages anyway.

> But I don't think they're around anymore and have no idea how you could achieve similar functionality with dynamic pages anyway.

It is probably easiest to save the render as a picture and then store text separately for searchability?

How is this a solution? The Archive performs a valuable service. They're collecting way more of the internet than you are (I assume), so when that thing you didn't back up today is not available in 10 yrs it's more likely to be on the Archive.

I donate to The Archive. More people should too.

I don't know why you're treating them as mutually exclusive. Single points of failure are as bad when it comes to organizations as they are with anything else. Internet Archive (the org) could stop existing with the flick of a pen. I don't think “Let Somebody Else Do It” is a healthy attitude to take, and I'm going to keep doing what I'm doing.

Plus for as great of a service as Wayback Machine is, it can be very unpleasant to actually browse. I dislike how it injects its own toolbar into every page (yes I know how to massage the URLs to get the raw page data, but it isn't browsable that way). Have you never encountered sites in Wayback Machine where certain pages were just randomly not archived? Or when you click a link and get a page from years earlier or later than the one you came from? Never encountered a page or an entire domain that was blocked from Wayback Machine? Why do you think I would get started doing something like this in the first place if I didn't find it more fun to browse my own archives than Somebody Else's?

> I’ll edit the full command into this comment when I get home if anybody else wants to start doing the same :)

I would love that. I have a little four-parameter version, but I feel yours is more tried and true.

Missed the edit window, but here's the command I use. Newlines added here for clarity.

  wget-mirror() {
    wget --mirror --convert-links --adjust-extension --page-requisites \
    --no-parent --content-disposition --content-on-error \
    --header="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" \
    --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:129.0) Gecko/20100101 Firefox/129.0" \
    --restrict-file-names="windows,nocontrol" -e robots=off --no-check-certificate \
    --no-hsts --retry-connrefused --retry-on-host-error --reject-regex=".*\/\/\/.*" "$1"
  }

Some notes:

— This command hits servers as fast as possible. Not sorry. I have encountered a very small number of sites-I-care-to-mirror that have any sort of mitigation for this. The only site I'm IP banned from right now is http://elm-chan.org/ and that's just because I haven't cared to power-cycle my ISP box or bother with VPN. If you want to be a better neighbor than me, look into wget's `--wait`/`--waitretry`/`--random-wait`.

— The only part of this I'm actively unhappy with is the fixed version number in my fake User-Agent string. I go in and increment it to whatever version's current every once in a while. I am tempted to try automating it with an additional call to `date` assuming a six-week major-version cadence.

— The `--reject-regex` is a hack to work around lots of CMS I've encountered where it's possible to build up links with an infinite number of path separators, e.g. an `www.example.com///whatever` containing a link to `www.example.com////whatever` containing a link to…

— I am using wget1 aka wget. There is a wget2 project, but last time I looked into it wget2 did not support something I needed. I don't remember what that something was lol

— I have avoided WARC because I usually prefer the ergonomics of having separate files and because WARC seems more focused on use cases where one does multiple archives over time (as is the case for Wayback Machine or a search engine) where my archiving style is more one-and-done. I don't tend to back up sites that are actively changing/maintained.

— However I do like to wrap my mirrored files in a store-only Zip archive when there are a great number of mostly-identical pages, like for web forums. I back up to a ZFS dataset with ZSTD compression, and the space savings can be quite substantial for certain sites. A TAR compresses just as well, but a `zip -0` will have a central directory that makes it much easier to browse later.
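Two of the notes above lend themselves to a short sketch: being a better neighbor with wget's `--wait`/`--random-wait`/`--waitretry`, and guessing the current Firefox major version from the date instead of bumping it by hand. This is only an illustration, not the command posted above. The names `firefox_major_guess` and `wget_mirror_polite` are made up here, the anchor (Firefox 129 around 2024-08-06) and the default four-week cadence are assumptions to adjust (the note above assumes six weeks), and the two-second wait is an arbitrary choice.

```shell
# Assumption: Firefox 129 shipped around 2024-08-06 (epoch 1722902400, UTC).
FF_ANCHOR_MAJOR=129
FF_ANCHOR_EPOCH=1722902400

# firefox_major_guess [NOW_EPOCH] [CADENCE_WEEKS]
# Prints the anchor version plus one for every full release cycle elapsed.
firefox_major_guess() {
  ff_now=${1:-$(date +%s)}
  ff_weeks=${2:-4}   # assumed cadence; pass 6 for a six-week guess
  ff_cycle=$(( ff_weeks * 7 * 86400 ))
  echo $(( FF_ANCHOR_MAJOR + (ff_now - FF_ANCHOR_EPOCH) / ff_cycle ))
}

# Same idea as wget-mirror, trimmed down, but pausing between requests.
wget_mirror_polite() {
  ff_major=$(firefox_major_guess)
  wget --mirror --convert-links --adjust-extension --page-requisites \
    --no-parent \
    --wait=2 --random-wait --waitretry=30 \
    --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:${ff_major}.0) Gecko/20100101 Firefox/${ff_major}.0" \
    "$1"
}
```

With `--random-wait`, wget varies each pause between 0.5 and 1.5 times the `--wait` value, so the requests look a little less mechanical. The remaining flags from the original function can be added back as needed.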

Here is an example of the file usage for http://preserve.mactech.com with separate files vs plain TAR vs DEFLATE Zip archive vs store-only Zip archive. These are all on the same ZSTD-compressed dataset and the DEFLATE example is here to show why one would want store-only when fs-level compression is enabled.

  982M    preserve.mactech.com.deflate.zip
  408M    preserve.mactech.com.store.zip
  410M    preserve.mactech.com.tar
  3.8G    preserve.mactech.com
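A store-only archive like the one in the listing above can be produced with Info-ZIP's `zip -0`. A minimal sketch, assuming the `zip` and `unzip` tools are installed; the function name `pack_store_only` is hypothetical and the `.store.zip` suffix is simply borrowed from the listing.

```shell
# pack_store_only DIR
# Wraps DIR in DIR.store.zip without compressing (-0), leaving the
# compression to the filesystem (e.g. a ZSTD-compressed ZFS dataset).
pack_store_only() {
  pso_dir=${1%/}                      # drop a trailing slash, if any
  zip -0 -r -q "${pso_dir}.store.zip" "$pso_dir"
}

# Example (commented out so that sourcing this file has no side effects):
#   pack_store_only preserve.mactech.com
#   unzip -l preserve.mactech.com.store.zip | less   # browse the file list
```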

Also I lied and don't have a full TiB yet ;)

  [lammy@popola#WWW] zfs list spinthedisc/Backups/WWW
  NAME                      USED  AVAIL     REFER  MOUNTPOINT
  spinthedisc/Backups/WWW   772G   299G      772G  /spinthedisc/Backups/WWW


  [lammy@popola#WWW] zfs get compression spinthedisc/Backups/WWW
  NAME                     PROPERTY     VALUE           SOURCE
  spinthedisc/Backups/WWW  compression  zstd            local



  [lammy@popola#WWW] ls 
  Academic                        DIY                             Medicine                        SA
  Animals                         Doujin                          Military                        Science
  Anime                           Electronics                     most_wanted.txt                 Space
  Appliance                       Fantasy                         Movies                          Sports
  Architecture                    Food                            Music                           Survivalism
  Art                             Games                           Personal                        Theology
  Books                           History                         Philosophy                      too_big_for_old_hdds.txt
  Business                        Hobby                           Photography                     Toys
  Cars                            Humor                           Politics                        Transportation
  Cartoons                        Kids                            Publications                    Travel
  Celebrity                       LGBT                            Radio                           Webcomics
  Communities                     Literature                      Railroad
  Computers                       Media                           README.txt

Some of this could stand to be re-organized. Since I've gotten more into it I've gotten better at anticipating an ideal directory depth/specificity at archive time instead of trying to come back to them later. Like `DIY` (i.e. home improvement) there should go into `Hobby` which did not exist at the time, `SA` (SomethingAwful) should go into `Communities` which did not exist at the time, `Cars` into `Transportation`, etc.

`Personal` is the directory that's been hardest to sort because personal sites are one of my fav things to back up but also one of the hardest things to try and organize when they reflect diverse interests. For now I've settled on a hybrid approach. If a site is geared toward one particular interest or subculture, it gets sorted into a subdirectory of `Personal/`, like `Academics`, `Authors`, `Artists`, `Goth` (loads of '90s goths had web pages for some reason). Sites reflecting The Style At The Time might get sorted into `1990s` for a blinking-construction-GIF Tripod/Angelfire site or `2000s` for an early blog. Sometimes I sort personal sites by generation like `GenX` or `Boomer` (said in a loving way; Boomers did nothing wrong) when they reflect interests more typical of one particular generation.

Maybe save the log automatically? Then check for and report unresolved errors, either at the end of the function or, better, in a separate one so the log can be re-inspected at any time.

I have encountered "GnuTLS: The TLS connection was non-properly terminated. Unable to establish SSL connection." multiple times, and retry options seem to be useless when that happens. Some searches suggest it could be related to tls handshake fragmentation, but nonetheless wget could retry if related options are used. Manual retry seems to download the missing URLs, otherwise mirroring jobs are randomly incomplete.
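One way to follow the suggestion above is to have wget write its log with `-o` and then pull out the URLs whose requests ended in an error, so they can be retried or reviewed later. A rough sketch: the function name `failed_urls` is invented here, and it assumes wget's default English log layout, where each request starts with a `--DATE TIME--  URL` line and failures show up as lines containing `ERROR`, `failed:`, or `Unable to establish SSL connection`.

```shell
# failed_urls LOGFILE
# Prints, once each, every URL whose request was followed by an error line.
failed_urls() {
  awk '
    # A line like "--2024-05-27 10:00:00--  URL" starts a new request.
    /^--[0-9][0-9][0-9][0-9]-/ { url = $3; next }
    # Any of these markers means the current request hit an error.
    /ERROR|failed:|Unable to establish SSL connection/ {
      if (url != "" && !(url in seen)) { seen[url] = 1; print url }
    }
  ' "$1"
}

# Example (commented out; needs network access and a real URL):
#   wget --mirror -o mirror.log "$url"
#   failed_urls mirror.log > retry.txt
#   [ -s retry.txt ] && wget --input-file=retry.txt -o retry.log
```

Note that a URL is listed even if one of wget's own later retries succeeded, so the output is a list of candidates to re-check rather than a definitive list of missing files.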

Why, what's the point in doing such nonsense? Unless it's someone with lots of money, contacts in the dark web, and some historic Barbra Streisand type chip on the shoulder.

DDOS attacks are dirt cheap and can be contracted from large professional sites offering customer support and the works. The largest one taken down had hundreds of thousands of users, and had carried out some 4 million attacks, for prices starting at $14.99/month. [1]

So in other words, anybody can carry out a DDOS for basically no cost. So trying to analyze the purpose, let alone suspects, is probably not going to be fruitful.

[1] - https://wccftech.com/865619-2/

That is exactly the problem. These services are constantly at war with each other and are attacked by competitors. Cloudflare provides DDoS protection to the DDoS providers so they can keep their services online, which directly benefits Cloudflare by DDoS being a bigger problem than if they were all busy attacking each other.

This is a sampling of currently available services and who they use for DDoS protection:

  stresslab.app - Cloudflare
  maxstresser.com - Cloudflare
  sunnystress.com - Cloudflare
  tresser.io - Cloudflare
  ip-stresser.net - Cloudflare
  hardstresser.com - DDoSGuard
  zdstresser.net - Cloudflare
  starkstresser.net - Cloudflare
  stresserhub.org - Cloudflare
  nightmarestresser.net - DDoSGuard

Just for fun head over to Cloudflare's abuse reporting site and try to figure out how to get one of these taken down. https://abuse.cloudflare.com/

DDoSGuard has a reputation for being The Crime CDN, disproportionately serving things like phishing campaigns, black hat forums, piracy sites, etc, so the fact that they are merely the second most popular CDN amongst DDOS providers after Cloudflare speaks volumes.

TIL. That's shocking. I doubt it’s intentional, but “institutions will preserve the problem to which they are the solution.” No need to ascribe to malice that which can be blamed on simple incentives (and of course it's a big problem, things fall through the cracks, etc.).

I find the idea of DDoS providers confusing. If someone tried to operate a service that can be abused easily to cause similar disruption in the physical world, the operation would be taken down quickly and the people behind it would probably end up in prison. But somehow the internet is still a lawless zone where crime is tolerated and everyone is out for themselves.

It used to be very rare for DDoS providers to publicly advertise their services, you kinda had to know a guy who knew a guy. If you put up a website offering this service the Good Guys of the Internet would track you down and get your provider to take you down, or that provider would in turn get disconnected from the internet.

Now they hide behind Cloudflare, who will refuse to turn over any information so that security folks can get them taken down. Unfortunately Cloudflare has grown so large that we can't just block all of it or depeer them like we would any other network that provided services to bad actors.

That kind of vigilante justice is part of the general lawlessness.

Most of the listed domain names are under US jurisdiction. That means the authorities should be able to take them down. If Cloudflare is found to have been knowingly enabling crime, it could face fines, and the CEO and other key people could end up in prison. The Cloudflare services have probably been paid using means that are under US jurisdiction. Those payment accounts can be closed and the people behind them tracked down and potentially charged with crimes.

Or at least that's how things work in the real world. The internet is still apparently too new for the authorities to understand how to deal with it.

The added element of international relations makes it a fair bit trickier than any real-world equivalent. Usually these operate out of places that are not on good terms with the countries they target. Russia and China are the big ones.

It's obvious why a DDOS provider would want to use Cloudflare, but their point is that Cloudflare turns a blind eye to DDOS providers using their services. Actively helping to keep DDOS providers online while also selling DDOS mitigation isn't a good look to say the least.

Cloudflare is a data goldmine set up by people who love fedoras and newspapers. Professional DDOS providers won't ever use Cloudflare and have the skills, metal and (human) network to do everything in-house.

Yeah you're actually worse off using Cloudflare because you can't block attacker IPs anymore, once you're dependent on them to protect you, and they're not very good at protecting. I run an online service that invites hackers to DDOS the server. Cloudflare's servers would usually go down before we did. The only way we could stay online was by switching to GCS and using token buckets to blackhole IPs in the raw prerouting table, which made the hostile packets into mighty Google's problem. Thankfully they don't charge for ingress, so it was about as cheap as Cloudflare too.

While I think there might be valid arguments to support that claim, that blog post hardly qualifies. The author runs a gambling site and while the way Cloudflare handled the situation (according to the author) could certainly be improved, they clearly were affecting other users by "tainting" shared IPs.

And yet if they ponied up the money, that issue of "tainting" shared IPs suddenly goes away. You can bet CloudFlare would graciously give the gambling site as much time as they need to bring their own IP (they went out of their way to link third party sellers of IPs with dubious provenance, after all).

Did you read the blog post? It doesn't include the entire correspondence so it's not clear how explicit Cloudflare was about this but the Enterprise plan they were trying to upsell them includes BYOIP. It's clear to me that Cloudflare insisted they buy the enterprise plan because it includes BYOIP.

So in other words, Cloudflare noticed the author was running a gambling site, they decided that this was negatively impacting the shared IPs and the author would therefore need to upgrade to a plan that included BYOIP because they would need to use that feature to continue using Cloudflare and they likely insisted on prepayment for the annual plan because gambling sites have a reputation for being flaky and prepaying would have demonstrated the liquidity necessary to continue operating the site at that plan.

Again, Cloudflare could have communicated this better (and maybe they did in parts of the correspondence the author didn't share) but this all seems perfectly understandable, especially given how the sales team kept referencing Trust and Safety (implying the alternative is ending the contract for violating the ToS).

The issue of tainting shared IPs would indeed have suddenly gone away had the author brought their own IP (which would have required an Enterprise plan to do while staying on Cloudflare). Instead, the author feigns ignorance, arguing they don't even need the features of the Enterprise plan, and doesn't acknowledge the issue with sharing IPs, while sheepishly mentioning that maybe they're accidentally evading bans of their domain in certain countries by having alternative domains, which they of course don't actually need because most traffic comes from their main domain, yet somehow having these alternative domains is critical to running their business.

What are you even trying to argue here? The author is being deliberately dishonest in how they frame the incident and Cloudflare's motivation is perfectly understandable. The only thing to take offense with is the communication style which we can only judge based on a select few messages the author shows us. We have to rely on their word after they have already demonstrated dishonesty.

IIUC, that's always a good theory for unexplained DDoS. Though, even if they have only profit motivations, I'm a little surprised when they don't seem to let ideology influence their selection of targets for demos.

For the sake of argument (maybe not true), let's say that all techies are aware of archive.org, and consider it beneficial, probably using it themselves.

Why don't they instead demo against a target that will be proof of capability, and one that someone won't pay them to do (no freebies), yet one that they perceive as bad or deserving in some way?

Probably improper to suggest "better" targets here, but I really wonder what's going on when some relative do-gooder gets attacked.

Similarly, ransomware attack on a children's hospital, of all places? Doesn't that get you uninvited to criminal mastermind dinner parties?

As Omar of "The Wire" told us, a man's gotta have a code.

One thing to keep in mind about LockBit ransomware is that it was SaaS (err, RaaS), and there is a good chance the target was picked by an insider there, or at least by some opportunistic hacker not really associated with those who provided the service, besides signing up as an affiliate.

LockBit was so successful partly because they didn’t have to hack anyone themselves. It was basically advertised as “Got SSH or RDP access? Let’s make a bunch of money.”

This attracted hackers who might not trust themselves to do the extortion part safely, as well as people who didn’t actually hack anything but hated their boss and wanted a payday.

Perhaps cruelty is the point.

Perhaps they intentionally attack targets that are generally seen in a positive light, to prove to potential customers that morals are not an issue.

Oh, you want me to DDOS a children's hospital? No problem.

What's the significance of that?

(Googling "Jason Scott TIA" gives me "Dr Jason Scott is a Senior Research Fellow in the Tasmanian Institute of Agriculture" which doesn't explain much to me)

The beauty of acronyms/initialisms that people are too lazy to spell out!

TIA = The Internet Archive (i.e. the victim of the DDoS).

>The user you're responding to is Jason Scott of The Internet Archive

The Tasmanian Institute of Agriculture are well known for their work on biological models of computer security architecture.

I am shocked that any HN reader could be ignorant of this fact. Their director is a (controversial) Turing Award winner.

There are some very bad very shitty people about, just trying to make earth worse.

Npm has been under pretty severe attack for ~6 weeks now. I forget who else.

The scariest thing to me is what we might do in the face of persistent online attacks. If this stuff gets rolled up into western nations rolling back privacy & liberty? That's a theonion.com "bin laden plan to sit back and enjoy collapse" situation. Freak out & let cyber security paranoia reign & destroy free communication & connection.

Are you upset? Can't do nothing about it? It even made a headline or even just a thread on a forum? That's reason enough for some. It could easily be a teenager with no better excuse than not having a fully developed brain and no better reason than liking to ruin things. Having seen how much that happens, I guess it's more likely than a conspiracy or a crime with any rationality behind it.

Seriously? People do this shit for fun. There used to be a program (LOIC) popular on 4chan used for DDoS attacks all the time, it's the origin of the "firin mah lazer" meme.

Cloudflare is not "an extortionist firm". It is a large tech company, where occasionally teams employ shitty sales tactics to meet their numbers, but generally provides a valuable service and acts reasonably ethically.

There are open source tools to mitigate DDoS, but all of them will have some marginal cost to run, and they will all be significantly worse than Cloudflare as they benefit from neither Cloudflare's data moat nor its scale.

> Secondly, considering Cloudflare would MITM all traffic, it would make a good data source for the NSA, thereby violating all user privacy.

This seems like a weak argument. Should we just take down anything widely accessed because it might be used by the NSA? What about AWS?

Plenty of DDoS mitigation firms use open source tech, but that’s only step one of mitigation. Most normal firms will never be able to stop a DDoS attack unless someone else with a lot of resources tanks the attack for them.

Even if you go all out and buy a bunch of huge IP transit links, you are not gonna be able to stop the IXP 800 miles away from getting congested and blocking your customers from accessing your site anyways. You need access to a backbone to route traffic differently to avoid those kinds of issues, which is why DDoS scrubbing services will partner with a T1 ISP to do most of the work.

I have asked this a few times and never gotten an answer beyond "One day they could turn evil." What is the reason Cloudflare is an extortionist firm? I am way more concerned about Amazon than Cloudflare.

Beyond the upselling under duress, I've also seen complaints that Cloudflare protects the client-facing websites of DDoS-as-service operators. This enables them to sell their service, which then creates demand for Cloudflare's service from their targets.

Cloudflare describes that policy as a commitment to content neutrality rather than extortion, and I think that's more or less sincere (since they've protected many other unpopular sites that didn't give them such a benefit, with a few high-profile exceptions). It does work out very conveniently for them, though.

> Cloudflare describes that policy as a commitment to content neutrality rather than extortion

But we know that's not true. Point out problems with a very controversial blogger and they'll cancel your service.

The only universal way IMHO, is to associate a small cost to every internet request. The cost has to be as small as possible, but it has to be there for million/billion/trillion requests to add up the cost and make it uneconomical to continue the attack past some point.

It is the same problem with email spam. What's stopping someone from sending billions of spam mails?

If we suppose that a blockchain exists which is fast enough, cheap enough, and spread out enough across the globe (to mitigate latency), then there is no reason for a TCP packet not to carry with it a small money transaction, on the order of a millionth of a cent. Information gets served back only when the transaction is confirmed.

In that way, any request with no transaction gets discarded, and only requests with a small cost pass through. Suddenly, sending requests one after another with no end in sight starts to cost money, for DDoS attacks and mail spam alike. It is the serving of requests that makes DDoS attacks and mail spam effective.

The problem however, is that no blockchain is fast enough and cheap enough as of today. But there will be one in a handful of years.
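To get a feel for the proposed price, here is the arithmetic at one millionth of a cent per request. The traffic figures (10,000 requests a day for an ordinary user, one million requests per second for an attacker) are made-up illustrative numbers, not measurements, and `cost_cents` is a hypothetical helper.

```shell
# cost_cents REQUESTS PRICE_MICROCENTS
# Total cost in whole cents (integer division, so fractions are dropped).
cost_cents() {
  echo $(( $1 * $2 / 1000000 ))
}

user_day=$(cost_cents 10000 1)                       # ordinary user, one day
attacker_day=$(cost_cents $(( 1000000 * 86400 )) 1)  # 1M requests/s for a day

echo "user: ${user_day} cents/day, attacker: ${attacker_day} cents/day"
```

With these numbers the ordinary user owes a hundredth of a cent per day (which the integer division rounds down to 0) and the attacker owes 86,400 cents, that is, $864 per day.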

Similar systems were proposed in the late '90s/early 2000s (hashcash/micropayments) to combat spam. The big problem isn't a technological one, it's that it presupposes some "sweet spot" price (negligible for legitimate users, yet prohibitive for abusers) that has never been shown to exist in reality.

(also, you're arguably just moving the problem to DDoSing the payment processing / firewall mechanism)
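For readers who have not met hashcash: the sender pays in CPU time by searching for a counter that makes a hash of the message start with a given number of zeros, while the receiver checks the result with a single hash. The sketch below is a simplified illustration, not the real hashcash stamp format. It uses SHA-256 via `sha256sum` (falling back to `shasum -a 256`), counts leading zero hex digits rather than bits, and the names `pow_hash`, `pow_zeros`, `pow_verify`, and `pow_mint` are invented for this example.

```shell
# pow_hash STRING: prints the hex SHA-256 digest of STRING.
pow_hash() {
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
  else
    printf '%s' "$1" | shasum -a 256 | cut -d' ' -f1
  fi
}

# pow_zeros N: prints a string of N zeros, e.g. "00" for N=2.
pow_zeros() {
  printf '%*s' "$1" '' | tr ' ' '0'
}

# pow_verify RESOURCE COUNTER DIGITS: succeeds if the digest of
# "RESOURCE:COUNTER" starts with DIGITS zero hex digits.
pow_verify() {
  case "$(pow_hash "$1:$2")" in
    "$(pow_zeros "$3")"*) return 0 ;;
    *) return 1 ;;
  esac
}

# pow_mint RESOURCE DIGITS: prints the first counter that verifies.
# Expected work is about 16^DIGITS hashes, so keep DIGITS small here.
pow_mint() {
  pow_counter=0
  until pow_verify "$1" "$pow_counter" "$2"; do
    pow_counter=$(( pow_counter + 1 ))
  done
  echo "$pow_counter"
}
```

A sender would run something like `pow_mint user@example.com 2` and attach the counter; the receiver runs `pow_verify user@example.com <counter> 2` before accepting the message. Each extra digit makes minting roughly 16 times more expensive while verification stays a single hash.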

> Similar systems were proposed in the late '90s/early 2000s (hashcash/micropayments) to combat spam.

These ideas have indeed existed for decades.

> The big problem isn't a technological one, it's that it presupposes some "sweet spot" price (negligible for legitimate users, yet prohibitive for abusers) that has never been shown to exist in reality.

Advances in technology, software and hardware, make it easier and easier for that sweet spot to exist. That sweet spot certainly didn't exist in the past, but we are close right now.

One example that I think is useful here is aluminum cans for fizzy drinks. Aluminum, a strong metal compared to cardboard or plastic or glass, is better at withstanding pressurized gases without exploding. The downside is that it's more expensive. Once manufacturing prices had dropped a lot, it became feasible to drink half a liter of liquid and just throw away the metal. Aluminum is still not free, but the small price is worth it. It is a huge waste of energy to smelt all that metal and throw it away after 10 minutes of drinking, but it is economically viable.

One could manufacture Titanium cans, and drink even more fizzy drinks. But that's not economically viable as of today.

> you're arguably just moving the problem to DDoSing the payment processing / firewall mechanism.

Yes, the problem is moved elsewhere; that's the weak link in the scheme I described. The thing is that a flood of transactions still costs money. Blockchains cannot be flooded just with requests, they have to be flooded with transactions. Take a look at the article [1], which outlines some ideas. I don't agree with a lot of things in there, but it states the problem and gives some numbers.

The theory when it comes to blockchain deterring DDoS attacks (and other kinds of attacks) is that there are no bad guys in general, just rational economic actors who use dirty tricks. When a dirty trick starts to cost money, and profit disappears from an attack, then the rational economic actor will stop the attack. A bad guy would resume the attack regardless of profit, but that's one of the axioms of the theory, that there are no bad guys.

[1] https://www.dlnews.com/articles/defi/ddos-attacks-are-an-inc...

This wouldn’t fix anything. Most DDoS attacks today are amplification attacks, e.g. “I send 10 bytes to this unpatched NTP server and as a result it sends 500 kb to this target server”, so in your scheme the costs would not be borne by the attacker.

It's not (just) a case of the software, it's the hardware and the position in the network. These DDoS primarily just saturate your internet pipe: you need to be able to co-ordinate with core ISPs to block the ddos traffic before it concentrates too much.

If your application can take it, drop it in the application. If your load balancers can take it, drop it on your load balancers. Otherwise you have to get your provider to drop it, if they can take it. Worst case they'll drop all traffic meant for you to protect the rest of their network.

How much $$ does archive.org spend on infra and such? How much does one need to endure the most damaging DDOS? I remember seeing from somewhere that Google went through some huge DDOS attack without going down.

Given its benefit to laypeople, I recommend that everyone who uses their services give a small amount once in a while. I already did so, and if not for family issues I'd donate way more.

DDoS attacks are usually volumetric attacks: send more bits than the pipe the website has to the internet can carry.

To combat this you need to buy enough pipes to the internet for your regular internet traffic, as well as an extra 500 Gbps or so. That is a lot of unused bandwidth to be paying for every month. Then once the packets arrive at your datacenter you still need to buy dedicated appliances to scrub out the bad and let the good flow.

Google is constantly under attack, but their normal daily traffic volume (multiple Tbps) is large enough that just the extra capacity they keep on hand to deal with traffic spikes like the World Cup or a popular YouTube video is larger than what most attackers can muster.

Yes, that is part of it. They tell potential clients, "On May 27th, I'm going to take down the Internet Archive". Then they do it, and then go back to their clients and say, "now that you've seen my work, do you want to pay me?".

Making the news is the goal. As long as a few people can verify it was you, word will get around about the person who can take down big targets, and will cite the news articles as part of the proof.

Anyone who doesn't like the availability and accessibility of history and documents.

Lots of people want to rewrite or erase history.

Quoting a story I wrote about this a few years ago:

"Everything you speak, all ideas, all things, all thoughts, they are all of the past. Society and knowledge is a composite of the shadows of former presents.

When people lie or misrepresent knowledge they speak of a past they wish to change.

What if people who have the most to gain from deceit had a tool to actually change the past and make these lies the truth?"

Here it is if you're curious https://kristopolous.medium.com/stephen-hawking-had-a-time-t...

let me go further: the whole of the copyrighted industry

including all media conglomerates (obviously) and all scientific, literary, etc, publishing houses.

also, there's a global war, so it may well be a fog-of-war technique or, like somebody else also mentioned, someone needs something to stay quiet for a little bit as part of some larger operation

The establishment always gets the most advanced technology, attack and defense because they have the big bucks. That's why I never believed that technological advancement promotes individualism, distributed X (whatever X is, money, power, whatever). Eventually it always points to a more centralized world because the elites are able to control more with each technological advancement.

Doubt this is coordinated - more likely a singular (m/b)illionaire wanted a post/photo/video, or several of them, deleted for good, perhaps for suppression of legal evidence, and this was one way of bringing some firepower to a… library. One of the internet's biggest libraries too. Odd.

That's a tough question to answer without devolving into politics, which is off topic for Hacker News.

I think that's also the wrong question to ask. "Who's doing it?" is less interesting than "What's enabling them to succeed?"

> Sorry to say, archive.org is under a ddos attack. The data is not affected, but most services are unavailable. We are working on it & will post updates in comments.

Botnets usually, and sometimes amplification attacks via NTP or DNS. The Chinese government’s Great Firewall also has offensive capabilities known as the Great Cannon, although these are generally used against GitHub because it hosts censorship-circumvention software like VPNs.

Resistance to that kind of simple countermeasure is exactly what distinguishes a DDoS attack from a non-distributed DoS attack. The traffic basically comes from "everywhere". Not literally every IP block and route, but widespread enough that it's difficult to separate from legitimate users without actually processing the traffic (which is what you're trying to avoid by e.g. blocking an IP range).

Yes. This is why it's maddening that most people don't take computer security seriously. Virus infected devices are what give these botnets their scale and wide distribution.

There are many individuals with embarrassing things on the Internet. Perhaps one of the recent university encampment participants wants to get an internship and doesn't want prospective employers to see articles about them screaming and yelling. I think the odds that a major publisher is behind this are slim.
