Vesuvius Challenge 2023 Grand Prize awarded: we can read the first scroll

ImageXav · 2024-02-05T20:13:22

I was ridiculously excited when I first read about this in October (if I remember correctly) last year, when a few of the first results were beginning to pop out. I found the methodology fascinating. First of all the digital unwrapping of the scrolls, then the recognition that crackling in the paper was the sign of ink, and finally putting together a model to detect it, piece by piece. I need to look into the final repository to understand what exactly they did, but they seem to have used a TimeSformer. I'm confused by this choice as I thought it was designed for video. How did they apply this to images? In the end though, what a wonderful day for archeology. These young minds deserve a huge round of applause for what they have achieved.

marcyb5st · 2024-02-06T09:37:59

my understanding is that the scan they did on the scrolls returned the layers themselves. Like so:

```
xxxxxxxxxx
...
xxxxxxxxxx
```

So, by tiling the image on the surface you get data that is size_x * size_y * n_layers. So, it can be seen as a video stream with size_x * size_y * 1 channel * n_layers where the layers replace the temporal dimension.
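
A minimal sketch of that reshaping, in Python with NumPy. The volume is random stand-in data, and the array sizes, the tile size, and the `tile_volume` helper are all hypothetical, picked only for illustration; this is not taken from the winning repository.

```python
import numpy as np

def tile_volume(volume, tile):
    """Cut a (n_layers, size_y, size_x) volume into non-overlapping surface tiles.

    Returns an array of shape (n_tiles, n_layers, 1, tile, tile): a batch of
    single-channel "clips" whose frame axis is the scan depth.
    """
    n_layers, size_y, size_x = volume.shape
    ny, nx = size_y // tile, size_x // tile
    # Drop any ragged border so the surface divides evenly into tiles.
    v = volume[:, : ny * tile, : nx * tile]
    # Split both surface axes into (tile index, offset inside the tile).
    v = v.reshape(n_layers, ny, tile, nx, tile)
    # Bring the tile indices to the front: (ny, nx, n_layers, tile, tile).
    v = v.transpose(1, 3, 0, 2, 4)
    # Flatten the tile grid and add a channel axis of length 1.
    return v.reshape(ny * nx, n_layers, 1, tile, tile)

rng = np.random.default_rng(0)
# Stand-in for a stack of scan layers: 16 layers of 128 x 192 pixels.
volume = rng.random((16, 128, 192), dtype=np.float32)
clips = tile_volume(volume, tile=64)
print(clips.shape)  # (6, 16, 1, 64, 64)
```

Each of the 6 tiles is then a 16-frame, 1-channel, 64 x 64 "video", which is the kind of input a video model expects. The exact axis order a given framework wants may differ.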

namaria · 2024-02-06T10:15:05

They explain it in the methodology section. The scans result in a stack of TIFF images that can be rendered as videos of the scan or as 3D models.

vjk800 · 2024-02-06T10:12:48

I first looked at the results and thought "this is kinda cool". Then I proceeded to read about the whole competition and how thoughtfully it was organized and thought that "this is extremely cool". I wonder what else could be achieved with such a well designed incentive program.

kretaceous · 2024-02-05T16:45:12

When I first came across this project on HN (early last year), I was taken aback by how impossible the project looked and how smart the people working on it were. Despite seeing a few intelligent names behind the project, I subconsciously believed that this would take at least 5-10 years before a breakthrough.

Today I sit with the same amazement, taken aback again, appreciating how ridiculously awesome this is. Congratulations to the winners and everyone involved!

qingcharles · 2024-02-05T19:45:03

So many things that look insane are becoming a reality. You look at those scrolls burnt to a crisp and the idea of reading them is nonsense.

The fact I have a computer writing flowery alt text descriptions of my photos with unnerving accuracy is something I would not have predicted for another 20 years. But, here we are...

Keyframe · 2024-02-05T23:04:27

Right? Imagine trying to explain some of it to one of the ancients - so, you have this quartz sand, see?...

p-e-w · 2024-02-06T04:30:37

That would be like explaining a baking recipe by starting with protons and electrons.

State-of-the-art machine learning architectures aren't actually that complex. Diffusion models and transformers can be explained to a bright high schooler. I'm sure Archimedes and Euclid would have no problem understanding them.

What they might have a problem understanding (or even imagining) is the mind-boggling amount of computation required to make those systems do anything useful. Getting Llama to produce a single token of text takes more calculations than all of humanity did by hand during all of Classical Antiquity.

hnfong · 2024-02-06T04:49:07

I think the quartz sand metaphor was to illustrate how advanced our silicon-based technology has become, not just the ML parts.

Imagine all the stuff... transistors, Turing/Von Neumann machines, lithography, theoretical computer science, OS and compilers, the Internet... and lastly there's modern day machine learning that builds on top of all the above.

The base level stuff isn't exactly protons and electrons, but given the nanometer scale of our chips, it's not that far away from the truth, and we (humanity) have somehow built amazing stuff on top of that.

Vespasian · 2024-02-06T10:16:39

The basic concept of how such a thing is powered (electricity) is so far removed from anything people did in ancient times that it would be hard to get them to understand that neither gods nor magic are involved.

Smart "intellectual" people would certainly be willing to challenge basically everything they assume about nature, but I don't think your run of the mill farmer would be able to do that.

p-e-w · 2024-02-06T03:09:09

It's all about incentives. $1 million is a lot of money. The vast majority of hard problems don't have much brainpower dedicated to them, because the bang/buck ratio doesn't work out. Machine learning, math, and adjacent fields already have many careers that pay very well, so getting top-notch experts to dedicate their attention to what might be a futile endeavor is difficult.

And this isn't only about the monetary value itself, but also the fact that a large cash prize attached to a challenge boosts the prestige of finding a solution. Nobel Prizes come with about a million bucks on top of them, after all.

I'm quite confident that if someone offered $100 million for deciphering the Voynich manuscript or Linear A, we'd have a solution within 3 years.

CityOfThrowaway · 2024-02-06T03:54:00

I'm 90% sure the people that did this project did it because they got nerd sniped by it and got to hang out with nat while earning a reasonable salary

BenFranklin100 · 2024-02-06T03:41:32

Not to be argumentative, but $1M isn’t very much money, certainly not for a project of this scope. It’s a testament to the creativity, competence, and dedication of those involved they’ve gotten this far with such little funding. Hopefully their early success will attract more resources to this very worthy project.

namaria · 2024-02-06T09:36:09

I think most of all this is a testament to just how much raw talent and intellectual potential is locked up in the winner-takes-all dynamics and shortsightedness of the stock market. Imagine the exploits and results in a world where everyone had the baseline resources and opportunities for extra funding for pursuing niche interests.

p-e-w · 2024-02-06T04:16:38

What do you mean by "of this scope"? The winning solution was produced by students and interns who coordinated over the Internet, in less than a year. The problem isn't scope, the problem is attracting lots of bright individuals to work on such a task (for free). And offering a substantial monetary incentive to the winner is probably the best way to do that.

And yes, $1 million is very substantial for an individual. And the cool thing about offering it as a prize (from the point of view of the organizers, that is) is they only have to pay one person or team, although potentially thousands ultimately contribute to the solution, directly or indirectly.

xipho · 2024-02-06T05:33:53

That's not at all what they did. They explicitly made endpoints to dole out the prizes, to incentivize collaboration. That tactical aspect of the whole project and how they set it up is worth highlighting on its own.

fragmede · 2024-02-06T04:11:58

it's $700k divided three ways, too. About $233k is well within FAANG compensation range, but you get to work on such an awesome project.

jobs_throwaway · 2024-02-06T05:43:33

FAANG compensation range but no benefits and 100x the risk

Make it $8m or $12m and FAANG employees can actually start to justify working on it seriously from a money perspective

varjag · 2024-02-06T10:15:44

Most of them are not good enough to make a dent in this task.

ryneandal · 2024-02-06T01:23:55

Herculaneum was one of the highlights of my trip to Italy with the wife. I didn't realize the scope of just how much ash and soil had to be removed for excavation. It was dozens of meters [1]. It's an absolute shame that the site is given a fraction of the attention that Pompeii receives, I thought it was vastly better preserved and truly awe-inspiring [2].

I highly recommend spending a few hours wandering the site, it is an absolute wonder.

1: https://www.icloud.com/photos/#08dJAA5eM9jpbhlEa3fzkl5ng

2: https://www.icloud.com/photos/#076Pof4FziA7WgcI8hZrGZmzg

beautron · 2024-02-06T09:30:29

I enjoyed the attention given to Herculaneum in a computer game called Rome: Pathway to Power (released in 1992). You start the game as a slave who has to escape Herculaneum before Vesuvius erupts. I loved the game as a kid. It's sort of like an isometric immersive sim (with a clunky interface). It got me interested in ancient Rome.

I hope to visit Herculaneum some day.

jtchang · 2024-02-05T16:55:03

“Any sufficiently advanced technology is indistinguishable from magic.”

Absolutely insane the level of wizardry being applied here to turn a lump of blackened, charred scrolls into readable text.

I only have cursory knowledge of machine learning: are some of the techniques used in the article only recently discovered, or have they been around for a while?

Is it due to us having reached an inflection point with these types of algorithms that they have become more popular and thus we are seeing new ways to apply them to old problems?

kortex · 2024-02-06T02:41:15

There has definitely been a virtuous cycle between GP-GPU processing capability, algorithms, libraries and software that use that hardware, and researchers working with those tools.

echelon · 2024-02-06T05:56:26

> Absolutely insane the level of wizardry being applied here to turn a lump of blackened, charred scrolls into readable text.

Imagine what we'll be able to do to brains, dead or alive, in 100 years.

And in 10,000, maybe we'll be reconstructing the light cone. Maybe that's what we are right now. (Not serious, but it's a fun thought experiment.)

xvector · 2024-02-06T06:28:51

This is why I am going for cryopreservation if I ever have the luxury of choosing the way I die.

echelon · 2024-02-06T07:21:02

Well, that could go lots of ways. Maybe some rich trillionaire buys you and spawns you into an endless horror simulation. They might be into torture and get off on it.

("No real humans harmed.")

But if the future can reverse the light cone, nobody is immune to that fate.

Who knows what the future holds. These are just sci-fi flights of fancy.

Hedepig · 2024-02-06T10:17:22

Argh, what has Black Mirror done to our sense of optimism?

(I agree with you).

exe34 · 2024-02-06T08:08:25

I remember learning about ancestor simulations by the Vile Offspring in Accelerando, but reversing the light cone is quite chilling - is there any sci-fi novel dealing with that which you would recommend?

jdminhbg · 2024-02-05T17:30:28

Here is the link to their "master plan" to read all of the excavated scrolls: https://scrollprize.org/master_plan

It looks like there are two main bottlenecks to reading more: the need for manual intervention in segmenting the scanned scrolls, and the cost in scanning new scrolls.

tysam_and · 2024-02-06T05:49:11

Funding is a huge one as well. Funding is the wheel that drives the project (source, have been hanging around the project people for a little while).

If you know anyone that would help chip in for the Phase 2 of the project (scaling up), please let Nat know! (I'm not directly affiliated with the project management team, just pointing to him as a great contact for that.)

riffraff · 2024-02-06T06:20:25

It seems "weird" none of the mega rich has committed a few million dollars for this, it looks like a very good way to build a legacy while benefiting humanity, and e.g. Bezos would probably find a million dollars behind the couch pillows.

ChainOfFools · 2024-02-06T07:07:26

it's almost nauseating to me that every month or so our nation demonstrates it is capable and willing to collectively chip in enough money to turn one random nobody into a near-billionaire, much of which gets promptly vaporized on drugs and tacky status symbol purchases for themselves and maybe some immediate family, when the same money would fund a hundred Vesuvius Challenges a year at several times the scale of this project.

kilroy123 · 2024-02-05T18:33:12

What a refreshingly clear and thought-out plan. This project honestly gives me a lot of hope.

Animats · 2024-02-05T20:59:08

Yes. Now it can be done, but costs too much. Once they get a scanning unit near the scrolls, it will be much cheaper. The data reduction will probably get cheaper, too.

s0rce · 2024-02-06T04:24:30

Scanning unit? Seems like the scanning was done using a synchrotron beamline. Maybe there is a suitable beamline at Elettra. I haven't looked closely why the synchrotron is needed. A Sigray instrument might work here, or even something simpler.

ok_dad · 2024-02-05T21:46:24

As for scanning: $30mm doesn't seem like a ton of money to scan 800 scrolls with untold history and other works, compared to other uses of that amount of money I could name now. Maybe someone will donate that cost and perhaps all the scrolls can be transported at one time or in a few bigger groups to be closer to the particle accelerator. Another million bucks and I bet you could build a climate-controlled container to take them all at once, or something. If I had $30mm I would definitely donate to this cause, it seems like one of the best uses of that kind of money I can think of. That would bypass the need to research and develop a bench top scanner or another solution. You could even crowdfund this!

As for segmentation: get some sort of collective solution going, like SETI@home did, but for people who are bored as hell, instead of them scrolling Reddit or Twitter all day. Maybe do it like a CAPTCHA so you get it done for free? I'd segment for a few hours a month if I had the ability to do so.

This is a cool project that has taken a community to build to this point, why not try and open and expand the collective of humans working to understand the scrolls? Get millions of people involved and you don't need to rely on technological crutches and development, though that is not the worst way to go either.

BurningFrog · 2024-02-05T23:45:11

At $30mm, they'll have billionaire philanthropists lining up around the block to get their name on this!

alach11 · 2024-02-05T16:25:27

This is the coolest thing I've read this year. It reads like science fiction. Who would even imagine it's possible to read text from a 2000-year-old rolled up burnt-crisp paper?

dougmwne · 2024-02-05T18:40:12

It’s a 270-year archaeological and technological culmination. The scrolls were dug up in 1752. It took the collective developments of the Industrial Revolution, the sciences, and all our engineering and manufacturing prowess to discover, preserve and scan the scrolls. Then the final cherry on top: the current AI revolution, which can create inferences and connections that are beyond the human mind to even understand. And out pops 2000-year-old wisdom of the ancients.

jfengel · 2024-02-05T22:58:58

One other thing that was required: the patience not to ruin it.

So much has been lost to well-meaning archaeologists who dug up and threw away things that they didn't think were important. They tried cleaning and preservation techniques on artifacts without testing, sometimes ruining them in the process. They ripped things out of context, and "restored" them based on guesses that were sometimes flagrantly wrong.

Of course they couldn't be expected to know everything that would come in the future, so blame can sometimes perhaps be muted. But it's especially positive that they extricated these particular objects very carefully and just waited for a way to extract information that they could hardly have hoped for.

bee_rider · 2024-02-05T20:07:54

It is a bit fitting that it turns out the scroll is about the relationship between enjoyment and abundance.

It looks like, from what we can gather, the author decides that should something be hard to get, that doesn’t lead to greater enjoyment. But, it seems that the archaeologists have found an awful lot of joy in how “rare” access to these scrolls is!

ssnistfajen · 2024-02-06T05:52:16

I remember first reading about Herculaneum Papyri more than a decade ago, and pondered about them being read one day. After all, research into virtually unwrapping these scrolls had been ongoing since 2007 (https://en.wikipedia.org/wiki/Herculaneum_papyri#Virtual_unr...), but I certainly did not expect it to happen so soon. Exponential technology acceleration once again proves itself true.

linsomniac · 2024-02-06T00:14:24

Speaking of "reading like sci-fi", what's that book where they scan an entire library of books, destructively, by feeding them into a "book chipper" like device that chops the books up into little pieces, vacuums those pieces up and scans the pieces as they flow through, reconstructing the original text by putting the scanned results together like so many jigsaw puzzles? It was a subplot of the book, but I can't for the life of me remember what book it was.

saled · 2024-02-06T01:58:31

FYI it seems ChatGPT could have answered this for you.

> The book you're describing sounds like "Rainbows End" by Vernor Vinge. In this near-future sci-fi novel, set in 2025, one of the subplots involves a project called the "Library Project," where the UCSD (University of California, San Diego) library decides to digitize its entire collection. The process is somewhat as you described: books are destructively scanned by being shredded into tiny pieces, which are then scanned and digitized, with the text being reconstructed from the scans. This process is a part of the broader themes of the book, which include the effects of technology on society and the concept of "wearable computing" and augmented reality. Vernor Vinge, a retired San Diego State University professor of mathematics, computer scientist, and Hugo Award-winning author, is well-known for his works in the science fiction genre, especially for exploring the concept of the technological singularity.

gwern · 2024-02-06T02:23:09

I'm not surprised ChatGPT can answer it - I'm not sure why, but _Rainbows End_ is one of the most commonly-asked about SF books like that. Everyone remembers the book-tornado doing shotgun sequencing, but they can never remember its name or anything else that happens. I guess that's the problem with having a technology whose mental image is so compelling but also mostly disconnected from the rest of the book. (I know I can't tell you much about the rest without rereading the WP entry.)

prezjordan · 2024-02-06T00:24:34

http://www.technovelgy.com/ct/content.asp?Bnum=1109

Rainbows End by Vernor Vinge (ChatGPT helped with the search)

yourapostasy · 2024-02-06T05:42:10

I’m now wavering a bit on my earlier dismissal of people freezing their bodies for an indeterminate future revival. I could probably get into a science fiction story with this premise:

Instead of relying upon machinery, some zillionaire has their body dry frozen and stashed in a lunar south pole crater, with a foundation funding interstellar propulsion research to move the body to the coldest stable points discovered along the way towards the Boomerang Nebula (1 K) and research to revive back from burnt-crisp state.

The foundation incites all sorts of advancements along the way like working out practical fusion and ever more exotic energy generation, AGI, gravity manipulation, Drexlerian nanotech, Dyson swarm, star wisps, self-modifying bodies and so on, in its quixotic quest to fulfill its mandate.

namaria · 2024-02-06T09:50:14

I kinda wanna write a horror story about people freezing their bodies or heads only to be revived in the future bat shit insane from the excruciating experience of existing for several decades in a sort of limbo...

noduerme · 2024-02-06T09:45:26

>> “as too in the case of food, we do not right away believe things that are scarce to be absolutely more pleasant than those which are abundant.” However, is it easier for us naturally to do without things that are plentiful? “Such questions will be considered frequently.”

This reminded me, since they're scarce but also abundant... Has anyone actually eaten these giant waterbugs at Nue in Seattle? Is that like, a reasonable thing to subject a date to?

kilianbutler · 2024-02-06T10:07:25

The rewriting history comment from Nat Friedman is super interesting. It'll be amazing once this data passes into the hands of historians.

bglazer · 2024-02-05T17:15:27

One aspect of archaeology that I really find fascinating is the practice of leaving certain artifacts unexplored. The original discoverers of the scrolls tried to unroll a few, apparently found it was impossible without completely destroying the scroll, and then just left the rest undisturbed. Rather than pushing forward and destroying everything, they left these as a mystery for a future age. Two centuries (!!) later we can finally begin to understand these, with the aid of technology that would be utterly unthinkable to those people who very thoughtfully restrained themselves.

unsupp0rted · 2024-02-05T17:49:33

> Rather than pushing forward and destroying everything

In the early days they wouldn't have accomplished anything by pushing forward, so it doesn't take all that much restraint.

I'm more impressed by people in, say, the 1990s or early 2000s. They might've had a shot but there was still too much risk, so they restrained themselves until it was a safer bet.

gwern · 2024-02-05T18:35:41

Yeah, I can't give the King's men much credit here. They destroyed a lot of scrolls, and it was only because they weren't getting much of anything that they stopped and abandoned excavations or focused on digging out sculptures they could show off (many now in the Getty Museum - great museum, but I did feel a bit melancholy thinking about the scrolls while I was there in 2019).

klyrs · 2024-02-05T20:33:09

On the other hand, we ground up mummies for paint to the point that we ran out and used fresher corpses to meet demand.

It is a bit of a miracle that they were preserved, and not just discarded.

ska · 2024-02-05T21:34:34

Worse than that, lots were ground up (and consumed) for medicines.

UberFly · 2024-02-05T23:39:01

In one of the Futurama episodes, Fry eats one of Farnsworth's mini mummies and Farnsworth is upset because he wanted to eat it. Fry said it tasted like jerky I think.

wayvey · 2024-02-05T21:08:35

Where can I read more about this? Frankly I'm a bit surprised that I've never heard about this considering how shocking it sounds.

klyrs · 2024-02-05T21:11:26

Wild, isn't it? Hands down my favorite historical fact learned in 2023.

https://en.m.wikipedia.org/wiki/Mummy_brown

aruggirello · 2024-02-05T21:31:51

Wild? Wait until you read about people actually eating mummies and corpses

https://www.nationalgeographic.com/history/article/mummy-eat...

boffinAudio · 2024-02-06T09:37:14

The things Futurama made me look up on Wikipedia...

klyrs · 2024-02-05T21:40:30

Wow. I like aged cheese but that is a bridge too far.

topper-123 · 2024-02-05T22:37:23

One aspect of that time period is they absolutely idolized the Romans. A lot of education at the time consisted of learning Latin and at the same time people were well aware that only a fraction of the classical texts had been preserved. I find it very believable that they understood the significance of preserving and potentially unlocking these scrolls.

dmurray · 2024-02-05T17:27:59

An example of the same thing at a macro level:

https://www.smithsonianmag.com/smart-news/archaeologists-reb...

_a_a_a_ · 2024-02-05T18:20:23

Bigger yet by far https://en.wikipedia.org/wiki/Mausoleum_of_the_First_Qin_Emp...

He of the terracotta army. Not excavated yet for fear of damage, but I would so love to know...

cameron_b · 2024-02-06T02:57:49

The feeling you get when you’ve gone into one of those aircraft hangar-size buildings and then you see some of the information they’ve gotten with ground tests (radar, mercury, etc.) is wild. The site is huge.

One of the suppositions is that the main chamber contained a model of his entire kingdom, replete with rivers of mercury.

So yes. Archaeology is a bit destructive, and sometimes the destruction can go both ways. Proceed with caution.

tobinfricke · 2024-02-05T20:10:23

Similarly, there are large sections of Pompeii that remain unexcavated -- left for the future.

Tronno · 2024-02-05T20:43:42

Herculaneum, where these scrolls are from, is 75% unexcavated! And it will likely remain this way for some time, as Naples sits right on top of it.

ffgjgf1 · 2024-02-06T07:26:06

The town of Ercolano sits on top of it. Of course effectively it’s a suburb of Naples these days

junon · 2024-02-06T09:58:05

Wow. I remember this being announced and thought it'd be a while or that it wouldn't be possible. Very happy to be wrong!

countrymile · 2024-02-05T16:57:46

Quite incredible work, with the original breakthrough model being trained on a 1070: https://twitter.com/LukeFarritor/status/1754532281690243339

sangnoir · 2024-02-05T18:24:53

Large Language Models have skewed the perception on the amount of compute required to do useful things with ML.

mNovak · 2024-02-06T03:46:00

The scroll segmentation looks like it'd be a very manageable task for distributed volunteer work too, at least as a kickstart if that's really the bottleneck now. Just 100 or 1000 halfway dedicated volunteers (small by internet standards, for a useful and simple task) could make a big dent. Don't know how many scan layers are in a single scroll, but for instance Project Gutenberg's proofreading network has processed millions of pages, one by one.

sekai · 2024-02-05T15:33:05

> We estimate that the scrolls we have in Naples contain more than 16 megabytes of text. Some members of our papyrology team say that revealing this text will be the greatest revolution in the classics since the Renaissance

Amazing achievement, let's hope the Italian government allows for additional excavation of the villa.

riffraff · 2024-02-05T16:11:40

they likely would, Pompeii and Herculaneum are _still_ being excavated after two centuries, it's not like things are still.

But we have only read 5% of this scroll and there are a ton more already excavated, it will probably take years before we manage to process what we already have.

seydor · 2024-02-05T17:25:32

> it will probably take years

In the direction things are going ... maybe a few months :)

pimlottc · 2024-02-05T18:05:17

There’s more to processing than scanning: it has to be reviewed, transcribed and translated by linguistics experts, and then analyzed and studied by academics and researchers who can put it in context and integrate it with what we currently know about the history, culture, philosophy, etc. of the time.

kristopolous · 2024-02-05T23:27:32

The biggest bottleneck is getting the experts to read it. You need a decade or so of graduate level education and interpreting things takes apparently quite a long time.

Maybe that's another AI application.

digging · 2024-02-06T00:17:39

As much as I'd like to endorse study of the classics, I'm almost certain that AI will be better at interpreting the texts than humans very soon.

ffgjgf1 · 2024-02-06T07:32:01

GPT 4.0 isn’t even remotely close to being useful at all in this case so I have some doubts about ‘very soon’

Digory · 2024-02-05T20:59:44

If you can automate the input, you can probably automate much of the basic analysis (things that would be "revolutionary" to undergrads).

"ChatGPT, give me the highlights of these ancient Greek scrolls ..."

wl · 2024-02-05T17:59:35

The big problem is that the Villa of the Papyri is underneath modern buildings. That doesn't mean that excavation without demolition is impossible (see the Scavi underneath the Basilica of Saint Peter), but it makes things far more difficult.

BlueTemplar · 2024-02-05T21:51:24

If the prospect is very high to multiply by several times the total remaining classical works, I doubt that the money will be particularly hard to find ?

wl · 2024-02-05T22:54:45

David W. Packard (HP heir) has been trying to throw money at doing this for years, so the money isn't as much of an issue as you'd think. The larger issue is that the locals don't want digging underneath their buildings, no matter how careful the excavators are. Also, all the money that would be necessary to excavate has made the project a target for the mafia who wants to get their share.

jobs_throwaway · 2024-02-06T05:48:07

> the money isn't as much of an issue as you'd think. The larger issue is that the locals don't want digging underneath their buildings, no matter how careful the excavators are

This sounds like the money IS a huge issue. How expensive can it be to buy out the locals? We're talking about priceless cultural artifacts

jacquesm · 2024-02-06T07:08:45

> Also, all the money that would be necessary to excavate has made the project a target for the mafia who wants to get their share.

I wonder how far Italy will go once - if - they get rid of the mafia, it is like trying to drive a car with the handbrake on.

dang · 2024-02-05T19:31:01

The Vesuvius Challenge - https://news.ycombinator.com/item?id=35322809 - March 2023 (32 comments)

Vesuvius Challenge - https://news.ycombinator.com/item?id=35169869 - March 2023 (32 comments)

dang · 2024-02-05T19:31:24

From today there's also this article, which maybe goes into more background (I haven't checked):

Can AI Unlock the Secrets of the Ancient World? - https://news.ycombinator.com/item?id=39261465 - Feb 2024 (1 comment)

and this tweet which presumably covers the same ground as OP:

The $700k Vesuvius Challenge prize has been won - https://news.ycombinator.com/item?id=39261933 - Feb 2024 (2 comments)

codeulike · 2024-02-05T18:20:50

Over 2000 years ago a chap called Philodemus sits in the library of a luxurious villa owned by a rich guy who likes collecting art and writing. He writes his thoughts on pleasure, and the relationship between the quantity of something and the pleasure that might derive from it. The scroll goes on the shelves with the others. He writes lots. At some point later the villa is buried by volcanic material from Mt. Vesuvius.

2000 years later we scan the carbonised scrolls with (basically) magic rays and use thinking machines to reconstruct what Philodemus wrote.

I wish we could tell him. Sounds like he was a thinker, he would really appreciate it.

clawoo · 2024-02-05T19:25:57

I have too wild an imagination sometimes. I picture him with a shocked look on his face, similar to what one of the modern 'thinkers' would have if they forgot to clear their browser history and somehow someone restored it in the future.

"You recovered... uh... everything?"

cooper_ganglia · 2024-02-05T20:54:52

Your comment prompted me to go in search of something I'd seen several years ago: something about an advertisement in Pompeii for prostitutes, or something like that. Anyway, I couldn't find exactly what I went in search of, but I did stumble upon this oddly specific, yet interesting, Wikipedia entry:

https://en.wikipedia.org/wiki/Erotic_art_in_Pompeii_and_Herc...

Priapus had it goin' on! Reading the Priapeia for the first time is a treat...

rrr_oh_man · 2024-02-05T21:03:17

> something about an advertisement in Pompeii for prostitutes, or something like that

Maybe something along these lines?

https://en.wikipedia.org/wiki/Lupanar#Graffiti

bee_rider · 2024-02-05T19:59:34

That recovery is much easier though, it is using the device as intended.

And of course, rm just unlinks, doesn’t actually delete, so even going a step further and recovering deleted content is hardly magic.

This is more like if, sometime in the future, they somehow successfully reconstructed a snapshot of our computers’ volatile memory by examining the power supply, or something ridiculous like that.

the8472 · 2024-02-06T09:54:16

> rm just unlinks, doesn’t actually delete,

On HDDs. On SSDs it'll lead to now-unused space getting TRIMmed, which actually erases the blocks. Back to scraping the papyrus.
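
A small sketch of the "rm just unlinks" part, in Python. It assumes a POSIX system (Linux or macOS; on Windows the unlink of an open file would likely fail), and the `read_after_unlink` helper is made up for illustration.

```python
import os
import tempfile

def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink it, then read it back via the open descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                   # removes the directory entry, as rm does
        assert not os.path.exists(path)   # the name is gone
        os.lseek(fd, 0, os.SEEK_SET)      # rewind the still-open descriptor
        return os.read(fd, len(payload))  # the data is still there
    finally:
        os.close(fd)                      # only now can the space be reclaimed

recovered = read_after_unlink(b"notes on pleasure")
print(recovered)
```

This only shows that unlinking removes a name rather than the bytes while something still references the file. Whether the bytes can be recovered after the last reference is dropped depends on the filesystem and the device, which is where TRIM comes in.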

nindalf · 2024-02-05T20:57:00

Big if.

We’ll lose a lot of digital data simply because we won’t have the means to read it. CD-readers aren’t manufactured anymore in volume. It’s easy to imagine society in 40 years not having any CD readers handy but having a bunch of CDs they want to read. Now multiply that by all the funny storage formats we’ve created over the years.

bornfreddy · 2024-02-05T21:19:10

No need for a CD reader if you have a CT scanner and software that converts those ridges into bits. The bigger question is how well preserved those CDs will be.

bee_rider · 2024-02-05T21:20:43

I’m almost certain it is impossible to actually do what I said. But then again, I bet anybody 2000 years ago would say the same of reading scrolls that have been consumed by a volcano!

gwern · 2024-02-06T02:32:41

Epicureans were atomists (https://en.wikipedia.org/wiki/Epicureanism#Physics), so if you explained it to him, he might not be nearly as shocked as most people of the era would be. Epicureans also tended to invoke extremely insubstantial 'images' composed of especially tiny gossamer arrangements of atoms as explanations for things like dreams, and you can see how well that would work as an analogy with using X-rays to look at subtle changes in the arrangement of atoms in charred scrolls. (These are covered in https://en.wikipedia.org/wiki/De_rerum_natura - if you have some time, I highly recommend reading Stallings's rhyming translation. You'll be shocked and admire the rationality & scientificness of Epicurean materialist atomist explanations of the world, even where they get it totally wrong.)

jacquesm · 2024-02-06T07:06:39

It's a testimony to the power of mind equivalent to Einstein's 'Gedankenexperimenten'. The only reason they got some of it totally wrong is because they lacked the scientific apparatus to test their hypothesis, but they were on the right track.

codeulike · 2024-02-05T21:33:40

... and what would he think of us? 2000 years later, thought has led us to amazing advances in technology that would be inconceivable to him. But on the sort of things the people of his day thought about (virtue, happiness, how to live the right way), not a lot of progress has been made. He might be surprised by that. In his time, at the dawn of written thought-for-thought's sake, they might have reasonably expected that soon people might think their way to a golden age of happiness and contentment. But 2000 years later we have learnt that you don't seem to be able to think your way to happiness.

dougmwne · 2024-02-05T22:14:55

I think we have learned not that you can’t think your way to happiness, but that for the sake of the consumer economy and worker productivity it’s best that you don’t.

mkl · 2024-02-06T01:07:08

Ash, not lava.

ChatGTP · 2024-02-05T22:19:48

Maybe in 20 years when resurrection is possible (singularity event) we'll be able to let him know?

ptelomere · 2024-02-06T04:29:56

My God, once in a while, you need to read something like this, reflect, ask the question "can I do better in my work ?", such an inspiring story of technical feat, persistence, and ingenuity.

iandanforth · 2024-02-05T20:16:05

I was recently reading the Greek Myths series by Stephen Fry and he makes a point of how there are so many stories we know of but have been lost from ancient texts. Stories and authors that were famous enough to be mentioned by multiple other authors but which themselves have been lost. This collection of scrolls could contain some of those lost stories and the possibility of that is terribly exciting.

0x_rs · 2024-02-05T22:47:03

>stories we know of but have been lost from ancient texts

So many there's a lengthy list on Wikipedia about it. It's fascinating reading ancients casually referencing works that we otherwise know nothing else about. Without the careful, laborious copying (often imperfect) over the centuries most things would've been lost completely. There's also other works such as maps that did not survive, the Tabula Peutingeriana for example is thought to be a derivative work of one commissioned by Augustus of the known world at the time (to Romans) and of which there's a few mentions in some works by historians at the time.

https://en.wikipedia.org/wiki/Lost_literary_work

xinayder · 2024-02-06T09:43:02

A great example of lost work is that the insight we have into Viking mythology was pretty much documented by a single guy, Snorri Sturluson. What we know about Norse mythology is just a tiny piece of their mythos, as they didn't have the habit of writing down their tales/legends/stories and most of it got lost after they converted to Christianity.

laichzeit0 · 2024-02-06T09:12:47

Suetonius' Lives of Famous Whores is a lost text I've always hoped we would recover at some point.

arbuge · 2024-02-05T22:27:38

And those are just the known unknowns. Besides those, to borrow from Rumsfeld, there are the unknown unknowns.

dekhn · 2024-02-06T01:03:50

I would love to read the Telegony. Homer did such a good job with episodes I and II that I'm really curious how the story ends.

colechristensen · 2024-02-05T21:13:44

Not just works of fiction or mythology but also histories, and works of science and philosophy.

eszed · 2024-02-05T21:30:59

I want Aristotle's treatise on Comedy, and if I'm allowed to be terribly greedy, just one more play by Aeschylus.

This is the most exciting thing in the world to me right now, these scrolls, along with the thought that there might be literally thousands more still in the ground.

CamperBob2 · 2024-02-05T23:00:35

Might want to put on a pair of gloves and a respirator if you find that Aristotle volume. Some people were pretty offended by it, I understand.

dekhn · 2024-02-06T01:00:44

We truly live in a golden age of physical and mathematical discovery. To get here required many thousands of years of technological development.

The developers of the transformer, as a group, should win some sort of significant prize; it has had more impact in a short time than anything I've seen before. Will we find better architectures in the near future?

laichzeit0 · 2024-02-06T04:14:59

Just don’t say something like this out loud near Gary Marcus.

ComputerGuru · 2024-02-05T20:15:24

Better but less technical writeup: https://www.bloomberg.com/features/2024-ai-unlock-ancient-wo...

manyty · 2024-02-06T07:45:50

Netflix documentary soon please!

Big ups for the winners - this is so cool and hopefully can be replicated for deciphering many other lost manuscripts.

jaredhallen · 2024-02-05T22:37:40

I'm sure the original authors didn't expect to be incinerated by a volcano (and I'm also sure that the future legibility of their writings would be the very least of their concerns!), but it really bends the mind to imagine their reaction if they could have known how this would all unfold (unroll? Sorry...)

leoc · 2024-02-05T18:36:31

Apparently the bad news is that the remaining scrolls most likely contain yet more Epicurean philosophy, maybe largely from not-top-rated guys like Philodemus. (Apparently it's possible that the library actually is, or incorporates, Philodemus' personal library.) https://twitter.com/DrFrancisYoung/status/175453630645602754...

dougmwne · 2024-02-05T18:51:29

That’s quite premature, but even if we are looking at a personal library containing only personal writings, you’d be looking at a massive increase of information on the ancient world, like a neural map of a single ancient mind that contained all their experiences and thoughts.

The worst case would be that it was 800 copies of the same scroll waiting to be sold off to other libraries.

Leo_Germond · 2024-02-05T18:58:07

All the 847 chapters of Philodemus fan fiction of MLP (my little Plato)

r_klancer · 2024-02-05T21:35:58

That's not necessarily bad news.

Everyone interested in this story should read Stephen Greenblatt's The Swerve (https://www.pulitzer.org/winners/stephen-greenblatt).

It traces the story of a Renaissance humanist who tracked down and translated the Epicurean philosopher/poet Lucretius' De Rerum Natura, which Greenblatt describes as portraying a strikingly modern way of seeing the world.

In particular Lucretius and the Epicureans denied the existence of supernatural causes, were opposed to religious fear, and posited the ideas of atomism and biological evolution. Of course they're better known for their approach to living life, which Greenblatt shows is more sophisticated than sometimes caricatured, and which he portrays as a breath of fresh air compared to the oppressive moralism and hypocrisy of the Church at the time. (Jefferson and many of the American Founders described themselves as Epicureans.)

He goes on to imply that Epicureanism was influential and widespread in the ancient world but suppressed by the early Church, so that we now know little of it.

Anyway, one of the tantalizing parts of the book is where he describes the carbonized and unreadable Herculaneum scrolls, since they were the private library of a wealthy patron of the Epicureans. I think he thinks being able to read the scrolls will really change our understanding of the ancient world.

And remember: if they hadn't been carbonized, they would have crumbled to dust. That's why we only have the texts that managed to get copied. (Anthony Doerr's Cloud Cuckoo Land is a novel about the survival and 21st century rediscovery of an imaginary Greek play, and ... I'll let you read it yourself - https://www.anthonydoerr.com/books/cloud-cuckoo-land)

(Apologies for any errors above, as basically all I know about this subject is what I read in the book!)

digging · 2024-02-06T00:29:47

> Everyone interested in this story should read Stephen Greenblatt's The Swerve (https://www.pulitzer.org/winners/stephen-greenblatt).

It's interesting reading for a layperson, but as with any other pop-history book, one should read this with a heaping plate of salt at hand. (I'm... not sure what that metaphor actually means or if this is an appropriate way to extend it.)

Things are always more nuanced than can be laid out in a sweeping narrative format and the compression required can lose some critical information, even with the best of intentions. There's also just getting things wrong, which most non-historians do and many historians will do on topics that aren't their expertise.

I'd read this criticism from AskHistorians (not infallible, I know)

https://old.reddit.com/r/AskHistorians/comments/ejfxe5/comme...

riffraff · 2024-02-06T07:33:58

The "grain of salt" reference relates to some antidote, which contained a grain of salt. The reduction to "handle with care" is modern.

So the extended metaphor makes no literal sense according to the Pliny text, but it makes sense according to our interpretation of it, which is what matters.

telotortium · 2024-02-06T08:47:25

I realize citing Wikipedia risks some serious error, but my impression is that by late antiquity (after AD 200), the main philosophical systems in the Roman world were Christianity and Neoplatonism (itself heavily influenced by Christianity) and to a lesser extent Stoicism. Stoicism, Epicureanism, and Middle Platonism were more characteristic of Classical Antiquity (200 BC-200 AD). The Wikipedia page on Epicureanism[0] supports this impression: "By the late third century CE, however, there was little trace of its existence.[7] With growing dominance of Neoplatonism and Peripateticism, and later, Christianity, Epicureanism declined."

[0] https://en.wikipedia.org/wiki/Epicureanism

akprasad · 2024-02-05T22:25:17

You might also be interested in the Charvaka school of ancient India [1], which is a close counterpart to Epicureanism. The Charvaka school was likewise influential and widespread, and it likewise became obscure over time, for reasons I don't know.

[1]: https://en.wikipedia.org/wiki/Charvaka

generationP · 2024-02-05T23:09:20

Probably true, but there are more rooms in the Villa yet to be excavated. What we have is essentially a bookshelf in a larger library. If it is sorted by alphabet, it might be representative, but what if it is sorted by topic?

foota · 2024-02-06T08:06:18

I couldn't help but wonder while reading this how everyone involved would feel if the "last step" involving the papyrologists was automated as well.

newzisforsukas · 2024-02-06T09:07:17

Amazing? As whoever made it would have likely created an incredibly powerful generally useful tool. That would be much more exciting than probabilistically distinguishing layers of ink in soot.

est31 · 2024-02-06T01:56:40

This is great news for preservation efforts as we have a large set of stuff that can't be opened.

I wonder what this means for the maya codices, many of which are in similar shape: https://en.wikipedia.org/wiki/Maya_codices#Other_Maya_codice...

darkwater · 2024-02-06T08:46:06

Even if I really do think this is an amazing effort, I'll spoil the techno-optimist party here by noting that it's a bit sad that this kind of achievement is pulled off only by rich patrons who became millionaires doing another unrelated thing and just want to scratch a personal itch/curiosity. Yeah, patrons and maecenas have been around forever, but I would really prefer that we as a society were mature enough to achieve such things collectively. We did it, at least for a few decades and even if coming from wrong incentives, with space exploration.

imranq · 2024-02-06T04:16:49

The methods are incredible, but it seems like the text is ordinary and nothing to write home about

namaria · 2024-02-06T10:10:16

We don't decipher ancient texts to have our minds blown about the universe but to know more about what life was like back then and to conduct historical research about it.

s0rce · 2024-02-06T04:18:31

Not too surprising, but I like the random ancient texts

https://en.wikipedia.org/wiki/Complaint_tablet_to_Ea-n%C4%81...

f5ve · 2024-02-06T05:58:39

There really is one for everything:

https://xkcd.com/2758/

hoten · 2024-02-06T04:18:42

I dunno, I found it pleasurable.

gmd63 · 2024-02-06T00:11:24

Given that AI is able to hallucinate, how can we be convinced the results are accurate? Did they create a new scroll, burn it, and compare the results to what was actually written?

WorkerBee28474 · 2024-02-06T00:17:24

The ML models were trained without knowledge of the scrolls' language. The models extracted images, and human experts were able to read the images as text. There was no text corpus fed in that could be leaked into the output.

namaria · 2024-02-06T10:11:38

Nobody used generative LLMs at any step of this process...

leeoniya · 2024-02-06T00:16:27

read the article

gmd63 · 2024-02-06T00:21:07

Thanks. From the article, for anyone else skimming who had the same question:

    Technical reproduction. The Vesuvius Challenge Technical Review Team reproduced the winning submissions manually. We made sure to clearly understand every part of the code, and that when we run it independently we get similar output images. Since all code and training data is now open source, you can do the same!
    Multiple submissions of the same area. You might have noticed that all submission images above show the same area of the scroll. This is because we released 3d-mapped papyrus sheets within the CT-scan (“segments”) created by our segmentation team, which were then used by all contestants. The resulting output images — created by different ML models and training labels — have produced extremely similar results. This holds not just for the winners and runner ups, but also for the other submissions that we received.
    Small input/output windows. The ink detection models are not based on Greek letters, optical character recognition (OCR), or language models. Instead, they independently detect tiny spots of ink in the CT scan, the writing appearing later when these are aggregated. As a result, the text appearing in the images is not the imagined output of a machine learning model, but is instead directly tied to the underlying data in the CT scan.
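The "small input/output windows" point can be sketched roughly as below. This is a toy illustration, not the winners' code: `patch_score` is a made-up stand-in for a trained ink detector (here just mean intensity), and the window size, stride, and nested-list volume layout `volume[layer][y][x]` are all assumptions. What it shows is the shape of the process: each window is scored on its own, with no knowledge of letters or language, and an image only emerges when overlapping scores are averaged.

```python
def patch_score(volume, y0, x0, window):
    """Hypothetical stand-in for a trained ink detector: mean intensity in the window."""
    values = [
        layer[y][x]
        for layer in volume
        for y in range(y0, y0 + window)
        for x in range(x0, x0 + window)
    ]
    return sum(values) / len(values)


def predict_ink_map(volume, window=8, stride=4):
    """Score small windows independently, then average overlapping scores per pixel."""
    height, width = len(volume[0]), len(volume[0][0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for y0 in range(0, height - window + 1, stride):
        for x0 in range(0, width - window + 1, stride):
            score = patch_score(volume, y0, x0, window)  # sees only this window
            for y in range(y0, y0 + window):
                for x in range(x0, x0 + window):
                    total[y][x] += score
                    count[y][x] += 1
    return [
        [t / c if c else 0.0 for t, c in zip(total_row, count_row)]
        for total_row, count_row in zip(total, count)
    ]


# Toy "scan": 4 layers of 32x32 with one bright horizontal stroke standing in for ink.
volume = [
    [[1.0 if 8 <= y < 16 and 8 <= x < 24 else 0.0 for x in range(32)] for y in range(32)]
    for _ in range(4)
]
ink_map = predict_ink_map(volume)
print(ink_map[12][16] > ink_map[28][28])
```

Because the scorer never sees more than one small window, it has no way to "imagine" a whole Greek letter; any letter shapes in the aggregated map have to come from the scan itself.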

leeoniya · 2024-02-06T00:26:37

tl;dr: cross-validation between competing submissions

cooper_ganglia · 2024-02-05T21:11:02

This is amazing and very, very exciting!

It is wild to me, though, that if I have an SSD fail it's essentially unrecoverable, but a 2,000-year old, rolled-up, lava-burnt scroll of Papyrus can be read using Technology™! I love to see it!

BlueTemplar · 2024-02-05T21:58:34

Are you sure it wouldn't be recoverable using something on a similar level to what they did (particle accelerator scanning)?

pimlottc · 2024-02-05T18:00:26

This is really cool! Is the output resolution limited by the granularity of the CT scans?

thom · 2024-02-05T20:16:38

So excited to see how much text is recoverable and to what extent this can bolster our collection of Epicurean writings!

low_tech_punk · 2024-02-05T20:11:15

Lack of information retrieval technology ≠ loss of information.

nerdo · 2024-02-05T18:35:58

Incredible effort. Text of scroll is essentially "drink more Ovaltine" which is on theme.

TremendousJudge · 2024-02-05T17:17:54

This is exciting, it's very unusual for new texts to be added to our collection of ancient literature.

m3kw9 · 2024-02-05T22:06:07

Pretty cool engineering to pull this off

jacobwilliamroy · 2024-02-05T21:18:59

I don't know if anyone who worked on this project is going to read this but if you are: good job! This looked like it was really hard.

yieldcrv · 2024-02-05T20:29:20

tl;dr

The general subject of the text is pleasure, which, properly understood, is the highest good in Epicurean philosophy. In these two snippets from two consecutive columns of the scroll, the author is concerned with whether and how the availability of goods, such as food, can affect the pleasure which they provide. Do things that are available in lesser quantities afford more pleasure than those available in abundance? Our author thinks not: “as too in the case of food, we do not right away believe things that are scarce to be absolutely more pleasant than those which are abundant.” However, is it easier for us naturally to do without things that are plentiful? “Such questions will be considered frequently.”

verisimi · 2024-02-06T00:01:13

I still think this methodology can be tested by creating equivalent carbonised scrolls, where you write some specific text and check the result against what you wrote. Ie you run your test artifacts through the scanner and software, and the process should return you your test text that you wrote before carbonising the scroll. At that point you can be assured that the process is not making stuff up from noise.

But, without that process, you can have no comfort that what is occurring is valid. The software might reliably see a letter in some noise, but so? It doesn't mean the letter is actually there... One can't verify the scroll, and one hasn't verified the process.

verisimi · 2024-02-06T07:30:53

I think this suggestion of mine was downvoted last time too, but I don't get why!

One would always want to test stuff in software development, especially if it was fraught and can easily be tested.

Mostly there are no tests to be undertaken in history, hence it's so much hearsay. But here is an opportunity to gain some genuine certainty, in a way that is normally unavailable! The implementors of this method should absolutely test their process!

namaria · 2024-02-06T10:13:27

You're being downvoted because they literally did what you suggest already.

ffgjgf1 · 2024-02-06T07:34:21

AFAIK the scanning process is pretty expensive

laser · 2024-02-05T22:52:19

This is amazing, sci-fi-like-reality. I wonder if they pull off an auto-segmentation breakthrough if the techniques might apply in other areas, like automating neuron mapping (eg. see EyeWire).

lostlogin · 2024-02-05T23:38:42

I was worried it would contain something boring like tax records, but it's even worse than that. Fingers crossed it gets a little more interesting than the average social media post.

glfharris · 2024-02-05T23:45:07

I'm pretty sure there are academics who would bite their own legs off for tax records from this period.

kilroy123 · 2024-02-05T23:58:45

Isn't that what most ancient Roman texts we do have are? Tax records.

ffgjgf1 · 2024-02-06T07:38:40

Do we have any or any significant number of Roman tax receipts? Most Roman/Greek texts that we have had to survive until ~1000 AD. If a text was available back then there is a reasonably good chance that we have it (or rather a medieval copy of it), why would anyone waste time copying tax receipts though?

dylan604 · 2024-02-05T22:08:03

"4 passages of 140 characters each, with at least 85% of characters recoverable"

oh good grief, even back then, we were limited to this value
