Finding a work in translation is harder than you think. Discussing the creation of Zenòdot, a cross-referencing project for books in translation, Ausiàs Tsel outlines the challenges of creating a record of translated works across different catalogues and what is lost when these records do not exist.
Le Petit Prince is one of the most translated books ever published. It exists in well over a hundred languages. But if you tried to find all those translations in a single database, you would fail. The world’s largest commercial ISBN aggregator documents editions in roughly seventy languages. The rest are scattered across national library catalogues in Tokyo, Jerusalem and Oslo, across collaborative databases like Wikidata, and across historical archives that stopped being updated fifteen years ago.
No single source covers more than two-thirds of the picture. This is not a peculiarity of one famous children’s book. It is the normal state of affairs for translated literature worldwide.
Translation discovery is not a content problem. The translations exist. It is an infrastructure problem: the metadata that should connect them is fragmented, incomplete, and unevenly distributed across institutions that were never designed to talk to each other.
A fragmented landscape
There is no global catalogue of translations. The closest attempt was UNESCO’s Index Translationum, launched in 1932 and computerised in 1979. It accumulated over two million entries across some 800 languages. But contributions from national libraries slowed, and the database has not been meaningfully updated since the late 2000s. The longest-running international bibliography of translations is, in practice, a historical archive.
What remains is a patchwork. National libraries document what is published within their borders, not what is translated from their literatures into other languages. Commercial aggregators like ISBNdb hold vast numbers of records, but their language metadata is often missing, incorrect, or ambiguous. Wikidata contains translation data contributed by volunteers, but with significant gaps and a bias toward well-resourced languages. In Zenòdot, an independent cross-referencing project (full disclosure: I am its creator), nine national library catalogues, from Norway to Taiwan, together contribute less than ten per cent of the editions tracked, while commercial aggregators contribute the largest share but still fall short of majority coverage. Each source holds pieces that no other source has: nearly ninety per cent of the ISBN-verified editions in the system appear in only one of its twenty-three sources.
Every database holds something unique. None holds everything.
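The single-source statistic above comes from a simple grouping operation: collect, for each ISBN-verified edition, the set of catalogues that report it, then count how many editions are reported by exactly one. A minimal sketch of that calculation, using invented sample records rather than Zenòdot's actual data (the ISBNs and source names here are illustrative only):

```python
from collections import defaultdict

# Hypothetical sample: each record pairs an ISBN-verified edition
# with one catalogue that reports it. All values are made up.
records = [
    ("978-2-07-040850-4", "BnF"),
    ("978-2-07-040850-4", "ISBNdb"),
    ("978-4-00-115676-0", "NDL"),
    ("978-82-525-8615-2", "NB-Norway"),
    ("978-82-525-8615-2", "Wikidata"),
    ("978-957-588-123-4", "NCL-Taiwan"),
]

# Group the reporting sources per edition.
sources_per_edition = defaultdict(set)
for isbn, source in records:
    sources_per_edition[isbn].add(source)

# Count editions that only a single source knows about.
single = sum(1 for s in sources_per_edition.values() if len(s) == 1)
share = single / len(sources_per_edition)
print(f"{single} of {len(sources_per_edition)} editions "
      f"({share:.0%}) appear in only one source.")
```

The same grouping scales to millions of records; the hard part in practice is not the counting but deciding when two records from different catalogues describe the same edition.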
Visibility and bibliodiversity
If a translation is not in a database, it is functionally invisible. For smaller languages, this is not a technical inconvenience. It is a form of cultural erasure.
Consider what happens when catalogues are connected. Catalan/Valencian, a language spoken by over ten million people, was barely visible in global translation data until its editions were documented across multiple sources. It now ranks eighth by edition count in a twenty-three-source cross-referencing system, ahead of Chinese and Russian. This is not because Catalan/Valencian has more translations than those languages worldwide, but because the sources that document its editions have been linked. The translations had always existed. The metadata had not.
Bengali, Thai and Urdu tell the opposite story: languages with substantial publishing industries that remain near the bottom of global edition counts—not because translations do not exist, but because the institutions that document them have not yet been connected.
The pattern is consistent: what we can see depends on which libraries have been asked, which databases have been queried, which metadata standards have been adopted. Absence from a catalogue is not evidence of absence from the world. It is a signal of unconnected infrastructure.
Cross-referencing as infrastructure
Zenòdot is an independent project that attempts to address this gap by cross-referencing twenty-three sources at present: sixteen national library catalogues, two commercial aggregators, Wikidata, UNESCO’s historical Index, and community contributions. The system is designed to integrate additional catalogues as institutional partnerships develop, and it currently tracks over three million editions across hundreds of languages.
The technical challenges are significant. Each source uses different identifiers. Author names appear in different scripts; the project maintains over 300,000 name aliases to match records across writing systems. Language metadata is inconsistent: what one catalogue labels “Portuguese” another splits into Brazilian and European Portuguese; what one calls “Chinese” another distinguishes as Mandarin, Cantonese, or Classical Chinese. Coverage profiles also vary wildly: Japan’s National Diet Library contributes editions in only one language, but with a granular depth that no global aggregator matches.
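Reconciling those inconsistent language labels typically means mapping each catalogue's free-text label onto a canonical code before records can be compared. A minimal sketch of that step, assuming a small hand-built mapping (this table is illustrative, not Zenòdot's actual one; production systems would use the full ISO 639 or BCP 47 registries with far more variants):

```python
# Illustrative label-to-code mapping; real reconciliation tables
# are much larger and maintained against ISO 639 / BCP 47.
CANONICAL = {
    "portuguese": "pt",
    "brazilian portuguese": "pt-BR",
    "portuguese (brazil)": "pt-BR",
    "portuguese (portugal)": "pt-PT",
    "chinese": "zh",
    "mandarin": "cmn",
    "cantonese": "yue",
    "classical chinese": "lzh",
}

def normalize_language(label: str) -> str:
    """Map a free-text catalogue label to a canonical code,
    falling back to the cleaned label when it is unknown."""
    key = label.strip().lower()
    return CANONICAL.get(key, key)

print(normalize_language("  Brazilian Portuguese "))  # pt-BR
print(normalize_language("Cantonese"))                # yue
```

The fallback matters: an unknown label is preserved rather than discarded, so gaps in the mapping surface as un-normalized values that can be reviewed later instead of silently dropped records.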
These are not problems that a single tool can solve. They are symptoms of an ecosystem where bibliographic institutions have operated in parallel rather than in dialogue. Cross-referencing is not an answer. It is a diagnostic: it reveals how much is lost when each catalogue works alone.
Towards a global translation map
The problem will not be resolved by any single platform. It requires collaborative effort between national libraries, metadata standards bodies, and open knowledge communities. It requires that translation—as a category of bibliographic data—be treated with the same seriousness as authorship, publication date, or subject classification.
Making translations visible is an act of cultural infrastructure. It determines which literatures count as existing, which readers can find what they need, and which languages register in the global record. In a world that regularly celebrates bibliodiversity, the least we can do is build the tools to measure it.
When a single cross-referencing project (twenty-three sources, sixteen national libraries) finds that nearly ninety per cent of ISBN-verified editions appear in only one source, the scale of disconnection becomes difficult to ignore.
The question is not whether translations exist. They do. The question is whether anyone can find them.
The content generated on this blog is for information purposes only. This Article gives the views and opinions of the authors and does not reflect the views and opinions of the Impact of Social Science blog (the blog), nor of the London School of Economics and Political Science. Please review our comments policy if you have any concerns on posting a comment below.
Image credit: ToninT on Shutterstock.