Everything that is wrong in museums starts with wall labels

Original link: https://www.aaronland.info/weblog/2025/11/20/cafeteria/

## AI and Museums: Reassessing Digital Transformation

This talk explores the impact of artificial intelligence on museum collections, arguing that it forces a reassessment of past "digital transformation" efforts. The speaker offers three core beliefs: museums should foster the revisiting of culture; "digital transformation" exposes pre-existing challenges; and professionals must define "digital" and "museum" on their own terms.

The discussion emphasizes how AI exposes long-standing problems in museum practice: incomplete metadata, hard-to-access curatorial knowledge and reliance on outdated systems. Examples include the Cooper Hewitt's open data release revealing gaps in its data, and the realization that old catalog descriptions were in effect the database before computerized databases existed.

Current applications of AI in museums are mostly staff-facing tools: extracting text from images (SFO Museum), generating object descriptions (Art Institute of Chicago) and automating tasks such as translation (LACMA). The speaker cautions against relying on AI to *fix* existing problems, urging institutions to first address fundamentals such as data organization and accessibility.

Ultimately, the talk argues that AI's value lies not in automation but in surfacing systemic challenges. The speaker urges museum professionals to prioritize factual accuracy, retain control over their data and treat AI's limitations critically, advocating a proactive approach rather than passive acceptance. The future, they argue, depends on understanding AI's risks and opportunities while holding to museums' core values.

A Hacker News discussion highlights the critique of museum wall labels as emblematic of a larger problem in the cultural heritage sector: **linguistic indirection**. This refers to needlessly complex or roundabout language, often the product of financial or bureaucratic pressures, which ultimately degrades the visitor experience. The discussion notes that digital transformation is exposing long-standing problems previously masked by these practices; in effect, making information *more* accessible through technology has revealed just how convoluted and unhelpful current approaches are. The original article (at aaronland.info) details these shortcomings, and commenters broadly agree that clear, direct communication is essential to effective museum engagement. The core argument is that museums should prioritize clarity over equivocation when presenting information.

Original article

Recently, I was invited to speak to a Museum Studies class at the University of San Francisco about how machine-learning and artificial intelligence technologies might affect museum collections and, in particular, the existing tools and plumbing museums use to manage those collections. This was the blurb for the talk:

It is difficult to separate the issues and questions that contemporary machine-learning and “artificial intelligence” technologies raise in the context of museums without addressing the long history of past attempts to make collections “digital” and how those efforts have played out organizationally over the years. So many of those issues end up being made manifest in the challenges surrounding the production of wall labels and this talk will discuss some inconvenient truths (and opportunities) about museums and computing that AI surfaces.

I did not have time to write my notes out in long-form before the talk so if this isn't exactly what I said, it is, with the benefit of hindsight, the more polished version of what I was trying to say.

I was asked to speak to you about how the introduction and commercial availability of AI technologies might impact the ways in which museum collections are managed. I am going to take a round-about route to get there. I am going to start with three beliefs — things I hold to be true — so that you can understand where I am coming from on some of these issues, four short stories and a handful of parentheticals; things which may not seem immediately relevant but which I think echo, or shadow, the larger conversation.

The first belief is that the thing which distinguishes entertainment and culture is the act of revisiting.

The distinction is not meant as a value judgement. There is a time and a place for both activities. It is also really important to remember that entertainment becomes culture all the time but it is precisely in the act of revisiting that this transformation occurs.

By extension I believe that the role and the function of museums, and the broader cultural heritage sector, is to foster revisiting. I also happen to think that the collective we are currently doing a terrible job of fulfilling this mission but that is another story for another time.

Seen in this light the role and the function of digital technologies in museums is to serve as a vehicle for making revisiting possible. It is a tool which helps to achieve a larger goal.

The second thing I believe is that, twenty years into a seemingly never-ending process of digital transformation happening in museums, the phrase itself can be defined as follows: Digital transformation is the manifestation through commercialization — which is to say financial means and industrial availability — of tools and processes whose introduction shines a light on issues and challenges which were always present but otherwise able to remain unseen.

That may not seem like an aspirational or motivational definition but it does appear to be the constant, regardless of the many different narratives and yarns that have been spun around the phrase.

The third thing I believe is that if you, as students and emerging museum professionals, are going to say the words digital and museum in the same sentence then the most important thing you can do both for yourselves and your peers is to articulate what those two things — digital and museums — mean in relation to one another on your own terms.

Linguistic indirection is something of a hallmark of the cultural heritage sector and while it may sometimes be necessary for financial or budgetary reasons it is, in most cases, profoundly harmful or at least a counter-productive distraction and a waste of time.

There are a number of equally valid, but very different, reasons to want to marry digital technologies with museums. For some, the role and function of digital technologies is simply to improve production; the time it takes to make something and the kinds of things that are able to be made. For others, digital technologies are only meant to serve as attractors; something to get people in the door and to make an institution, by proxy and novelty, seem relevant to a target audience. For others still, the definition is expansive, encompassing everything I've just described and then layering all the things the internet — a global network of communities, services, databases and audiences — makes possible on top of that.

The point is less how you articulate your own definition of these things so much as that you can.

The reality of the cultural heritage sector is that we have been having conversations at cross-purposes, because of our inability to agree on what the words digital transformation mean, for a very long time. For even longer, though, we have also been using the word digital as a catch-all to talk about completely different things. It should not surprise anyone, then, that we (the collective we) have been turning in circles the entire time or why we now find ourselves chasing our tails as we try to talk about artificial intelligence.

Okay, on to the stories.

The first story is titled 10,000 pieces of paper.

By the time I joined the Digital and Emerging Media team at the Cooper Hewitt Smithsonian Design Museum, in 2012, they had already done an open data release of all the metadata associated with the museum's collection. My role involved building out the website to share the museum's collection, and all that metadata, with the world.

Publishing the collection metadata before the launch of the website had an important consequence: It meant that we couldn't hide the fact that, like most museums, the metadata itself wasn't great. In many cases it was entirely absent. Compounding the issue was the fact that, in 2012, only 7% of the museum's collection had been digitized.

Neither the open data release nor the website caused these things to be true. They had always been true but because they weren't visible they could exist in a state of ambiguity; a state that allowed those issues to be deprioritized and conveniently neglected. This is an example of what I mean by digital transformation shining a light on things which were otherwise hidden — in this case the availability and the reach of publishing platforms like the web.

One day I ran the numbers and determined that the Cooper Hewitt had in its collection 10,000 prints which hadn't been photographed, for which we didn't have any provenance, which often didn't have descriptions but which, from time to time, had been included in exhibitions. In effect we had 10,000 pieces of paper.

I studied at the Nova Scotia College of Art and Design (NSCAD) which was, at the time, the last bastion of conceptual art in North America so I feel pretty confident saying I could have gone to the local office supply store, bought 20 500-page reams of printer paper, stacked them in the middle of the school's gallery space and called it art and a critique of contemporary collecting practices, memory-hoarding and the subjugation of voice and agency.

Because this is a Museum Studies class I am going to venture that at least some of you are, right now, imagining the text of the artist's statement that would accompany this work in your mind. Which raises an interesting question in 2025.

Why is no one talking about artists using chatbots to write their artist's statements? Given everything else chatbots are being used to automate it is hard to believe that they aren't being used to produce these texts as well. For example:

The 20 packages of 500‑page printer paper that now occupy the gallery floor are, at first glance, an almost absurd tableau of mass and blankness. Yet the very fact that I have stacked them—ten thousand sheets, a thousand times more than a single notebook, piled in a deliberate, unadorned heap—speaks to the paradox of our relationship with knowledge.

...

In this installation, the 20 packages of 500‑page printer paper serve as a monument to the past, a critique of the present, and a blank promise for the future. They are a reminder that every sheet we produce carries more than ink—it carries possibility.

I'm just going to leave this one awkwardly in the corner: If we assume that large language models are being used to generate these texts and if those models are able to faithfully and believably parody our long-standing assumptions of what those texts are expected to sound like then does it call into question the entire practice of intellectualizing an artist's work in their unique voice?

But back to 2012.

Around the same time there was a design studio in London, called BERG, that released a thermal printer connected to the internet called Little Printer. Most of the time it would sit quietly on your shelf smiling at you, like you see in this photo. Every once in a while it would get a message from the BERG servers to print out the new edition of a publication that you had subscribed to. These might be things like crossword puzzles, the day's news and weather, hot takes (before they were called hot takes) or even photos from your friends.

Even though Cooper Hewitt had 10,000 objects without any meaningful data we still had another 200,000 objects that, while they might have been missing imagery, usually had title and descriptive information. The descriptions tended to focus on form rather than intent or purpose. Even accounting for past tastes the language of those descriptive texts was often floral to the point of being baroque. To contemporary eyes they were often unusual and silly. We decided to make our own Little Printer publication, called The Hand Drawn Museum, that would select a random object lacking imagery and then print out its title and description along with a little box in which the person at the other end of the printer could draw the object in question.

One day during a staff meeting we were talking about the project amongst ourselves and joking about the language used in the descriptions when one of the curators finally pointed out that these texts were not meant to be narrative. They were purpose-fit in their descriptive nature in order to help staff find and recognize objects in the storage facilities. These texts, many of which had been transcribed from old catalog cards, were in fact the database before computerized databases existed.

This, by the way, is an image produced by a generative AI system that was asked to visualize the contents of one of the airline menus in the SFO Museum Aviation Collection. There's a lot to unpack here.

This is the same menu but re-imagined, by the same generative system, as an economy-class airline meal. Arguably, there's even more to unpack in this picture.

The curator had a point. We had not considered that these texts were not written for a general audience or to explain why an object had been chosen for inclusion in the collection. So the first thing we did was ask: Why are the narrative texts explaining the relevance of objects not stored in the collection management system? The second thing we did was to mandate that, going forward, if nothing else was stored alongside the object record then the curatorial justification text, required as part of the acquisitions process, would be.

And, for a time, the museum did just that. Sometime between that moment and last night, when I checked, they stopped the practice, as evidenced by the image on the left which still displays the copy I added to the code base thirteen years ago for missing justification texts. Two steps forward, one step back, I guess.

The second story is titled The AK-47 story and is one I first started telling at a symposium on digital preservation held at the Smithsonian in 2014. It goes like this:

This is a picture of an AK-47. This rifle is from the Kalashnikov Museum in the Russian city of Izhevsk. It is from the first batch of prototypes that were produced, in 1947, to demonstrate the rifle to the Russian Army in advance of its adoption and wide-spread use by combatants all over the world.

This is a second picture of an AK-47. This is a Chinese-made AK-47 and it is in the collection of the National Museum of American History which, as it happens, also has 18 other AK-47s in its collection. No one will tell me anything about this rifle or why the Smithsonian has it so I like to believe it was presented, as a gift, to Richard Nixon by Mao Zedong when the two met in 1972 to normalize relations between the United States and China.

This is a third picture of an AK-47. It was acquired by the CIA Museum (which is actually a thing) from Osama bin Laden following the raid by US forces on bin Laden's compound in Abbottabad in 2011. I find it nearly impossible not to see echoes of Napoleon's Vendôme Column, on display in the center of Paris, in this object. To quote Wikipedia:

It was modelled after Trajan's Column, to celebrate the victory of Austerlitz; its veneer of 425 spiralling bas-relief bronze plates was made out of cannon taken from the combined armies of Europe, according to his propaganda.

If you think I am just being hyperbolic stop for a moment and imagine what the reaction would have been in the United States had this rifle been acquired by the Kalashnikov Museum in Russia.

This is a fourth picture of an AK-47. After years of careful research and scholarship it has been established, as best we can determine, that this is one of the first rifles given to a child soldier. It will be on display at a soon-to-be-opened museum in the Democratic Republic of Congo focusing on the mineral wars and other resource-based conflicts that have plagued the African continent since colonialism.

By now you might have noticed that this is the same picture of the same AK-47 taken from Wikipedia. There are two other things to note about this last story. The first is that, unlike the other three stories, it is entirely made up. There is not a shred of truth to it no matter how relevant or important you might think such a museum, and its collection, would be. The second thing to note is that, in 2025, almost no one would be surprised to be told that this text was produced by a chat-bot or equivalent generative system (it wasn't).

The reason I told, and continue to tell, the AK-47 story is because the Cooper Hewitt is a design museum and, like all design museums, it basically has all the same things that every other design museum has. People like to argue about its nature and meaning but, certainly when compared to the fine arts, a defining characteristic of design and design objects is reproducibility and, to some degree, the ability to be mass produced.

What distinguishes an Eames chair on display at the Cooper Hewitt from the same chair on display at MoMA or countless other museums in the world? What distinguishes it from the same chair on display, and for sale, at the Herman Miller showroom?

What, if not the stories that the institutions who collect these objects tell about them? Which brings me to the third story: Stories about pants.

In 2014, the Victoria and Albert Museum, in London, launched their Rapid Response Collecting program. It was, to quote their website, aimed at acquiring contemporary objects in response to major moments in recent history that touch the world of design and manufacturing and to explore collecting practices through a shorter lens; measuring and capturing relevancy in the moment rather than fifty or even one hundred years down the road.

One of the first objects they acquired was a pair of trousers produced for a British retailer at a clothing factory in Bangladesh. The Rana Plaza building, where the factory was located, collapsed in April 2013 killing over 1,100 people despite the fact that cracks and other structural flaws in the building had been reported the day before.

I was fortunate enough to get a tour of some of the objects acquired by the Rapid Response program by one of the team's curators while I was in London in 2014. A pair of the pants produced at the factory were on display accompanied by a photograph, like the one in this slide, of the factory's collapse. Because the event was so fresh in people's minds it was easy to see the pants as a kind of embodiment of the real costs, in dollars and human lives, of so-called fast fashion and the labours borne by others to sustain a life of convenience and affordability.

My question for the curator, though, was this: What would, in ten, fifty or a hundred years, distinguish these pants from any of the many weird pants that the Cooper Hewitt has also acquired over the years? How would anyone know why these pants were relevant?

In fairness to the V&A the text that now accompanies the pants they acquired is comprehensive enough to make a good faith effort at addressing these questions unlike, say, these pants from the Cooper Hewitt which I can only guess were someone's riding pants or maybe work pants belonging to the hired help?

I mention this because it points to the larger issue that the contextual materials which surround any given object's inclusion in a collection remain, by and large, conspicuously absent. Or, more specifically, materials available to the public.

Because, in fact, almost every object collected has what is known as a curatorial file. These are the notes, news-clippings, random thoughts and other associative materials compiled in the process of deciding, and then proceeding, to acquire an object. These are the messy histories and not-fully-formed narratives which inform why something is in a museum. These stories rarely see the light of day or, when they do, they are too often laundered through the professional discourse of museology rendering them stale at best and incomprehensible at worst.

I am not alone in thinking that finding ways to make these curatorial files visible is the next big challenge in museum collecting. Not necessarily radically transparently visible but more visible than they are now and in ways where those materials, however tangential, might offer a variety of ways in which to see, to understand, to revisit, an object. Incidentally, these are the very materials that will be necessary if and when we want to apply the machine-learning technique of retrieval-augmented generation to our collections and brings me to the fourth story: Stories without a voice.

Even before, but especially since, the large social media platforms became what they are today there have been a number of efforts to build alternative, like-for-like services free of the Faustian bargains demanded by the big centralized services. I am not going to talk in any detail about those projects except to say that SFO Museum has been experimenting with some of them long enough that we have developed the infrastructure that allows us to create social-messaging accounts for every one of our collection objects, every gallery and terminal at SFO and for those accounts to be able to interact with the rest of the internet.

To put this in perspective, each time I've asked friends who've worked at the big social media platforms if I could create hundreds of thousands of accounts for all the objects in a museum collection they have just laughed at me and then said No. The work we are doing is still very much in the early stages and is decidedly rough around the edges but it affords us a measure of control and agency we've never had from any of the big commercial providers.

And these kinds of materials are absolutely the sort of thing that a museum object with a social messaging account might want to tell people about. So now, at the bottom of every object page on the SFO Museum Aviation Collection website there is a form which, if you are logged in as a staff member, allows you to speak on behalf of that object; to demonstrate that the object, and the collection it is part of, has life in the present.

Coincidentally, earlier this month a new service called BeeBot launched. It is a location-aware, ambient tour guide for the city you find yourself in that describes itself this way:

It’s always on, activates automatically when you put your headphones in, and turns off when you take them out. BeeBot will lower the volume and speak over any music you’re listening to and automatically pause and unpause podcasts when it wants to speak, but it won’t interrupt you during phone calls or video chats ... This won’t be a frequent occurrence, however. Users may get updates from BeeBot a few times each day, but not 10x/day. Updates are pulled from various sources, including live locations and status updates from other BeeBot users, and uses “keywords” you give it about your interests to suggest local places and events for you to check out.

I find this interesting because the application essentially treats the whole city as though it were one big curatorial file and then uses a variety of machine-learning and artificial intelligence techniques to generate stories, or narratives, about the places you are visiting. It is especially interesting when you consider how audio guides are still understood, in museum circles, as the apotheosis of art, culture and technology.

So what, then, is the problem? I think there are two problems. These are not new problems but, as I've stated before, they are problems that were finally forced into the open by our ability to do things like give every object the ability to participate in social messaging networks.

The first is that we don't actually have anything for these objects to say. It's a terrible thing to admit but it's true. Even if we did have things for them to say there are so many objects in our collections that the people taking care of them don't have the time to give them voice.

The second is that if you believe the curatorial file is a vast well of stories to plumb then it's not entirely accurate to claim these objects don't have anything to say. The problem is that the curatorial file is not an actual thing with standards, formats and well-defined practices.

The curatorial file is the short-hand we use to describe the thousands of Word documents or pictures scattered across thousands of computers and external hard drives. It is the napkins and pieces of scrap paper or photo-copied articles hidden in a poorly named folder. It is the oral histories that have only ever been recorded in the big meat-databases we call curators' brains.

The curatorial file, almost without exception, has no relationship with contemporary collections management systems; the systems we use to store and organize our collections.

We have understood the potential of structured data and databases from the moment they became commercially viable in the late 50s and early 60s. Despite that, the potential of these systems has rarely, if ever, meshed with the day-to-day work practice of people working in museums.

I can point to a number of reasons why: Databases, by their nature, can lack the nuance and subtlety that museum-practices demand; Poor interface design that almost never gets addressed because museums have never invested in design staff and because the database-system vendors have no economic imperative to improve things; The desire to keep certain knowledge secret or hidden out of professional jealousy or job security.

The result of which is that almost no one in museums, outside of the registration department which doesn't really have a choice, wants to have anything to do with the database systems that museums employ. The database is always someone else's problem, typically that of low-paid and underappreciated interns tasked with data entry who are given little reason to feel any kind of investment in the work.

The reason for mentioning all of this is that I think there is a belief, or at least a desperate hope, that artificial intelligence technologies will fix all of these problems. That the fix for all of our ills can simply be automated away. You would be forgiven for thinking this if you also happened to believe that any of the materials pumped out by the marketing departments of the big tech companies selling AI services were true. I think there are a few things to consider, though.

The first thing to consider is what exactly is being marketed to people. I think that, when you peel away all the layers, the thing people are being sold is the benevolent butler who, in addition to being smarter than you, will do everything you tell it to without question or grievance. I don't see a conspiracy so much as a coincidence in the fact that this idea has already been a staple of pop culture for at least a decade or more, from Alfred the all-knowing butler in the Batman movies, to the romanticization of all the people waiting hand and foot on their betters in shows like Downton Abbey, to the computerized personal assistant Jarvis in the Iron Man films. In all of these examples the occasional snark and cheekiness that these glorified wait staff display only serves to reinforce their unerring obedience. But stop for a moment and remember the manner in which, in both real life and in recorded history, these sorts of power dynamics have a way of playing themselves out.

The second thing to consider is that if we are going to use other people's machine learning models — which have been trained on data sets lacking any of the nuance and specialized knowledge about our work precisely because, as I've just finished explaining, all of that work has been scattered to the wind — to fill in the gaps in our collection management systems then, for all intents and purposes, we are simply outsourcing all responsibility for our work to an industrialized version of the Hand Drawn Museum and no one should be especially surprised by the quality of what comes out the other end.

I will not claim that there is no role for machine-learning and artificial intelligence systems to play in collections management systems but, for those applications to have meaningful implications, I will claim that there are some basic, necessary preconditions on our part that we can barely meet today, even for the simple stuff.

The title of this talk is Everything that is wrong in museums starts with wall labels which, on the surface, might seem both needlessly provocative and only dubiously related to artificial intelligence but I think it's important to point out that if there is one part of the museum exhibition process which should, and can, be meaningfully automated it is the production of wall labels.

Tombstone data can be easily derived from the few bits of metadata which are stored in collections management systems. Even if the narrative text on a wall label can not (should not) be inferred from tombstone data it can, once written, be stored in the collections management system. In 2025 there shouldn't even be a need to point out that we have been able to automate the production of print-ready files to send to a printer for decades now. These are not technical problems. We have had the tools to do these things for a very long time. And while there may be one, two or even a handful of museums that are doing some of these things, we all know, if we are being honest, that they are still the exception.
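
To make the point concrete, here is a minimal sketch of deriving tombstone text from a collections record. The field names (title, maker, date, medium, credit_line) are hypothetical and every collections management system exports something different, but the exercise is a formatting problem, not a research problem:

```python
# A minimal sketch, not any museum's actual workflow: derive the tombstone
# text for a wall label from a collections-management record. The field
# names used here are hypothetical.

def tombstone(record: dict) -> str:
    lines = [
        record.get("title", "Untitled"),
        ", ".join(filter(None, [record.get("maker"), record.get("date")])),
        record.get("medium", ""),
        record.get("credit_line", ""),
    ]
    # Drop empty fields rather than printing blank lines on the label.
    return "\n".join(line for line in lines if line)

print(tombstone({
    "title": "Side Chair",
    "maker": "Charles and Ray Eames",
    "date": "1950",
    "medium": "Molded fiberglass, steel",
}))
```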

The production of museum wall labels, really all museum-related text, happens in at least half a dozen different word-processing and spreadsheet documents which are all emailed back and forth between different parties, only stopping when a graphic designer literally copy-pastes the text in those documents into a print-layout application. Then, a few days later, they will have to do the same thing all over again because, for reasons only one or two curators understand, the tombstone data itself will need to be tweaked just so in the service of the exhibition at hand.

Finally, as I've just mentioned, none of that work makes its way back into a collection management system. With any luck, it might be stored in some kind of shared drive without any consistent naming conventions or semantic labeling. More likely, it will be lost when the curator in charge of an exhibition leaves the museum and their computer is erased before being given to someone else.

Wall labels, then, are not really the problem. They are the symptom of some broader challenges with the way that museums are organized and the ways in which they get things done. I do not think that machine-learning and AI technologies will actually solve any of these problems but the collective hope and belief that they will is, perhaps, the proverbial light I keep talking about making visible some larger issues we have avoided having to address.

On that cheery note I am now going to switch gears, entirely, and talk about some of the ways in which these technologies are being used to good, or at least interesting, effect in museum collections. This is not a comprehensive list and the examples I will show are mainly focused on museums in the United States.

At SFO Museum we are using recycled hardware (a colleague's old MacBook Pro) to run the same underlying code that powers the Live Text feature on iPhones to extract text from all the images of objects in our collection. This is, in effect, a fancier version of the long-standing optical character recognition techniques used in museums and libraries for years now. The difference here is that the machine-learning techniques employed by Apple to do this work allow for text to be extracted in more places under more conditions: text at an angle, text recognition spanning more fonts and other typographic conditions, text in poor lighting; for example, all the place names listed on this airline bag.
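
For illustration, here is the general shape of that pipeline in Python. To be clear, this is not SFO Museum's actual code, which runs Apple's on-device models; the pytesseract library below is only a freely available stand-in for the OCR step, and the file name is made up:

```python
# Not SFO Museum's actual code: they run Apple's on-device Live Text models.
# pytesseract stands in here as a freely available OCR step, to show the
# shape of the pipeline: image in, candidate search terms out.
from PIL import Image
import pytesseract

def extract_terms(image_path: str) -> set[str]:
    text = pytesseract.image_to_string(Image.open(image_path))
    # Keep distinct words of a reasonable length as candidate search terms.
    return {word.strip(".,;:").lower() for word in text.split() if len(word) > 2}

# These terms would be appended to the object's existing search index entry
# (titles, curatorial descriptions) rather than shown to the public.
terms = extract_terms("airline-bag.jpg")  # hypothetical file name
```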

It's not perfect but we don't need it to be because this text isn't exposed to the public. Rather it is used to supplement the index of searchable terms already populated with the titles and descriptions written by curators and registrars. This allows all the text not otherwise transcribed by museum staff to become searchable. We have, for example, a lot of postcards in our collection so all the text on the backs of those postcards, be it the body of the correspondence or the details listing the persons involved in the production of that postcard, is potentially relevant and discoverable in a way it wasn't before.

We are also very interested in understanding what it means to use on-device models on low-cost hardware that don't require a recurring commercial relationship with cloud-based providers. There is no question that these tools are still governed by someone else's model and their use introduces risks of interpretation and bias. We have made peace with these tools, for the time being anyway, on the grounds that they are designed to only do one thing (extract text) and are easily testable.

Likewise, we are using that same recycled laptop to run another set of machine-learning tools to extract the principal subjects from an image. This is the same code that allows you to create stickers from photographs on your iPhone. We are using this same functionality to extract, or segment, the focal objects in our collection imagery to better determine the dominant colours in those images. If we can remove unnecessary background colours then we can, in turn, improve the search-by-colour functionality on our website.

That is the theory anyway. As you can see from the image on your right it often works pretty well. The image on your left demonstrates some of the risks. Is this section of middle-ground in the image really the dominant subject of the image itself? I'm not sure it is. Nor do I have any way of examining or inspecting the decision that the model used to arrive at this determination let alone any meaningful way to nudge the model's understanding in one direction or another. The scale and the scope of these problems is, in this instance, limited and acceptable but take a moment to start imagining what happens when these kinds of issues, and the inability to do anything about them, start to crop up in every aspect of life.
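
The colour-counting half of this is the easy part. Here is a minimal sketch of it, assuming the segmentation model has already produced an RGBA image whose background pixels have been made transparent; the channel bucketing and file name are illustrative:

```python
# A sketch of the dominant-colour step, assuming a segmentation model has
# already produced an RGBA image whose background pixels are transparent.
from collections import Counter
from PIL import Image

def dominant_colours(path: str, n: int = 5) -> list[tuple[int, int, int]]:
    img = Image.open(path).convert("RGBA")
    # Coarsely bucket each channel so near-identical shades count together,
    # keeping only pixels the segmentation step assigned to the subject.
    subject = [
        (r // 32 * 32, g // 32 * 32, b // 32 * 32)
        for (r, g, b, a) in img.getdata()
        if a > 0
    ]
    return [colour for colour, _ in Counter(subject).most_common(n)]

print(dominant_colours("segmented-object.png"))  # hypothetical file name
```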

More recently we have started to investigate the ability for contemporary hardware (iPads) to run larger and more sophisticated on-device models; both the large language models that are bundled with the Apple operating systems as well as other third-party models which can be copied on to the device itself. We are testing these things in an experimental application to scan wall labels, parse that text back into structured data and then store that structured data in the metadata of photos of objects on display in the terminals. If that sounds a little absurd remember that I just finished telling you how the final texts, both descriptive and narrative, for wall labels rarely make their way back into collection management systems.

Fundamentally, this is a tool meant to speed up and where possible automate the collection of metadata such that it can be associated with other data (photos) that will eventually be used to create our own purpose-fit machine-learning models. There is a lot of work left to do so, right now, our efforts have been concentrated on trying to understand, again, what's possible outside of forking over a monthly fee to OpenAI, Anthropic or any of the other commercial vendors. Our efforts are to understand how we can maintain a measure of privacy and agency in a world which is actively trying to convince us those things are no longer relevant, necessary or germane.
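
A deliberately naive sketch of the parsing step might look like the following. The real experiments use on-device language models rather than regular expressions, and the label layout assumed here (title first, then maker, date and medium in some order) is purely illustrative:

```python
# A naive sketch of "parse wall-label text back into structured data".
# The actual tool uses on-device language models; this heuristic version
# only illustrates the target shape of the output.
import re

def parse_label(text: str) -> dict:
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    record: dict = {}
    if lines:
        record["title"] = lines[0]  # assume the first line is the title
    for line in lines[1:]:
        if "date" not in record and re.search(r"\b(1[89]|20)\d{2}\b", line):
            record["date"] = line   # a line containing a four-digit year
        elif "maker" not in record:
            record["maker"] = line
        else:
            record["medium"] = line
    return record

print(parse_label("Flight Bag\nPan American World Airways\n1965\nVinyl, plastic"))
```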

In a similar vein to the work we are doing with text-extraction the Art Institute of Chicago has started using large language models to produce machine-generated descriptive texts for the objects in their collection. These texts are not exposed to the public but are used to increase the surface area of searchable text that is indexed rather than being the stories which are presented to a human for reading.

The Machines Reading Maps project involving, among others, the British Library, the Turing Institute and the David Rumsey Historical Map Collection, here in the Bay Area, aims to create a generalisable machine learning pipeline to process text on maps. To make all those place names, encoded in all the different typographic conventions of their time, in the hundreds of thousands of maps created over the years searchable and to be able to marry that data with other geospatial attributes.

The Metropolitan Museum of Art has been working to use machine-learning tools to both extract and then recognize and label the works on display in the photographs of exhibition spaces that were produced for past catalogs and publications. Information that you might otherwise expect to have been compiled as part of the production workflow for those publications but which, as we see time and time again, never is.

The National Gallery of Art (NGA) has, for the last few years, been leading the pack when it comes to developing, implementing and understanding the tolerances of using AI and other machine-learning systems in museums. They aren't great about sharing that work in blog posts or other broadcast-based media but they usually post the slides for the work they present at conferences; for example the work they shared at the Museum Computer Network (MCN) conferences in 2024, 2024 and 2025.

NGA's work often targets specific visitor needs or back-of-house concerns rather than collections, per se, but some of the work they've done in that arena has been around developing a multi-pass framework for generating object descriptions. A model is first asked to consider the type of object, or artwork, it is looking at and then to classify it in its own words. That text is then used as part of a second, independent prompt (a set of instructions) used to influence how the descriptive text for that object is generated.
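
In sketch form the multi-pass idea looks something like this. The complete() function is a hypothetical stand-in for whichever model endpoint is in use, and the prompt wording is mine, not NGA's:

```python
# The multi-pass idea in sketch form. complete() is a hypothetical stand-in
# for whichever model endpoint is being used; the prompts are illustrative.

def complete(prompt: str) -> str:
    raise NotImplementedError("call your model of choice here")

def describe(artwork_context: str) -> str:
    # Pass one: ask the model to classify the object in its own words.
    classification = complete(
        f"What type of object or artwork is this? {artwork_context}"
    )
    # Pass two: feed that classification into a second, independent prompt
    # that steers how the descriptive text is generated.
    return complete(
        f"This appears to be {classification}. Write a short, factual "
        "description of it suitable for a search index."
    )
```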

One reason for showing you this example is to demonstrate that, despite what the marketing department will tell you, these systems are still not magical do what I mean not what I say machines. Whether a two (or three) phase workflow is necessary due to technical or financial limitations, this is still, effectively, the same as kicking the machine to get it to work.

The Los Angeles County Museum of Art (LACMA) has recently started using generative systems to create the initial drafts of translations for materials originally written in English, by humans. Those draft translations are then reviewed by professional translators before being included as final publication text. Importantly LACMA is being entirely transparent about the process.

I think this work is a good example of how AI systems can be used to good effect since, as LACMA staff have pointed out, in the past the production demands of printed materials have meant that the content (the written words) for a publication had to be frozen upwards of a year in advance of its release. LACMA's AI-based workflow doesn't make anything happen magically or instantly — things still need to be reviewed by humans in human-time — but it does have a meaningful impact on the amount of time things take thereby allowing more things, previously impractical, to get done. That's not nothing.

The Morton Arboretum developed a staff-facing tool called MAPLE (Morton Arboretum Plant Learning Engine) to automate the generation of detailed responses to user-submitted questions. These responses, reviewed and edited by staff, are created using a large language model which has been supplemented (or augmented) with historical email exchanges, plant database entries, website content, and clinic resources.

Like the NGA work it uses a multi-step process (six steps in this case), is not especially fast, at least when measured against the ruler of contemporary conveniences, is not overly expensive to use and produces remarkably good results. It is also, after some initial testing, not being scaled up to handle a larger volume or to be faster. It is actively used by staff but it is not meant to replace staff. In many ways the project's most significant contribution was expanding the organization's understanding of what AI tools can make possible.
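
Since this pattern, retrieval-augmented generation, keeps coming up, here is a minimal sketch of the retrieval step behind a tool like MAPLE. The embedding library, model name and sample documents are assumptions for illustration, not details of the Morton Arboretum's actual system; generation and the staff-review step described above are omitted:

```python
# A sketch of the retrieval step in a retrieval-augmented workflow. The
# embedding library and model are assumed choices, not MAPLE's actual stack.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
documents = [
    "Plant clinic note: powdery mildew on lilac is common in late summer.",
    "Email, 2019: recommended watering schedule for newly planted oaks.",
]
doc_vecs = model.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    query = model.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ query  # cosine similarity on normalized vectors
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

# The retrieved passages are prepended to the prompt; a staff member still
# reviews and edits whatever the model generates from them.
context = "\n".join(retrieve("Why are my lilac leaves turning white?"))
```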

So what do all of these examples have in common?

First, with the exception of the work done at SFO Museum, the Art Institute of Chicago and Machines Reading Maps, all the other projects have involved a guy named Bruce Wyman. Bruce is an independent consultant who has been doing some really great work to understand what becomes possible with the commercial, services-based AI systems that are now available to the general public and, specifically, museums. When I have spoken to Bruce about this work he is quick to point out that producing, converting and wrangling the source materials consumed by AI tools can often require a greater effort than getting those tools to produce a desired output. So, again, another example of the sector's inability or unwillingness to use collection management systems to their full potential.

The work that Bruce is doing, in Bruce's own words, does not enjoy the kind of public footprint I think it should so it's hard to point you at anything specific and say Read this. In the meantime you can read, or watch, Bruce's contribution to the Oral Histories of Museum Computing project which predates most of his AI-related work but is interesting all the same.

Second, almost none of the projects I've described expose their output unmediated to the public. These are all tools which are staff-facing, or tools whose output is subject to explicit review or whose output has no consequence. Does it really matter if the text extraction tools SFO Museum is using add a misspelling of, let's say, Winnipeg to the search index? No.

Third, and this is important, none of these tools interact directly with the actual collection management systems any of these institutions use. These are the databases we use to store donor information, object locations and insurance valuations and if that data hasn't already been exfiltrated (stolen) using any of the well-known security vulnerabilities inherent in large language models then I guarantee you it is only a matter of time.

Bluntly, there is no sane universe where you should create a direct connection between your collection management system and a third-party AI service and especially not one connected to the internet. There are still real and legitimate uses for those services but to not understand their limits is what negligence towards your collection looks like in 2025.

Really, what all of these examples have in common is that they demonstrate an institution's threshold, or their tolerance, for risk. A guy-on-the-internet I've never met, named David Chisnall, summed it up well recently when he said Machine learning is amazing if ... the value of a correct answer is much higher than the cost of an incorrect answer.

The last thing all of these examples have in common is that none of them are producing interpretive content.

This, by the way, is apparently a thing which happened the other day: A TSA agent let a passenger through security with a meat cleaver, claiming it was not a kind of knife they were trained to recognize. It feels like a cautionary tale about training, pattern-recognition and interpretation generally applicable to all machine-learning systems, all contemporary education systems and probably all of them combined.

The problem of interpretive, or narrative, content for the museum sector in a world where large language models (LLM) and chatbots exist is two or three fold depending on how you're counting.

First, the nature of LLMs means that they will always trend towards an average response; this is governed by the underlying mathematics of LLMs. Second, the nature of many, if not most, museum texts is also to trend towards the average.

We do not invite controversy in wall labels and wall text for political reasons as evidenced, for example, by the Smithsonian's Enola Gay exhibition. That exhibition prompted, no pun intended, a firestorm of political outrage by suggesting in the label text that — even factoring for the terrible wartime calculus of lives lost dropping an atomic bomb versus lives lost in a full-scale assault on the Japanese mainland — perhaps incinerating 100,000 people in the blink of an eye wasn't great.

But the other challenging aspect of wall texts in museums that we don't often talk about is that they are a fundamentally terrible medium in which to convey complex ideas in the same way that social media is an awful way to try and have a meaningful conversation with a person. We take all the wealth of knowledge and expertise of the people who put exhibitions together, who have been researching and studying these materials for years if not decades, and then average it all to 75-word blurbs.

Largely as a function of the available physical space in which to display text we end up prioritizing the mid-point of most ideas and arguments. Which raises an awkward question: If that's our priority and LLMs excel at mid content then maybe we should just let them produce that content. For example:

Untitled (Blue Bear)
Fiberglass, resin, automotive paint
Approximately 12 ft x 6 ft x 6 ft

I want to make sure to point out that almost nothing in this image is real. The physical space itself, the escalators connecting the second and third floors of the International Terminal at SFO, is real but that's about it. There is no 12-foot sculpture inconveniently placed in front of the elevators violating every accessibility law ever written. The sculpture in question, created by an Australian artist, does exist but it doesn't have googly eyes. Or rather it wasn't conceived with googly eyes but then someone vandalized the work by gluing googly eyes to it. I asked a generative system to manipulate an image of the original sculpture to include those googly eyes, then asked a second generative system to create a 3D model from that image, placed the 3D model at SFO using an augmented reality application and then asked a third generative system to create label text about the sculpture derived from this photo.

With its glossy surface, exaggerated features, and wide-eyed expression, Untitled (Blue Bear) invites viewers into a moment of playful confrontation. The bear’s upright stance and oversized proportions evoke both cartoonish whimsy and a subtle sense of unease—its gaze locked downward as if caught mid-thought or mid-surprise. The sculpture’s saturated blue finish, reminiscent of toy plastic or industrial coating, blurs the line between the natural and the synthetic, the wild and the manufactured.

This work belongs to a lineage of contemporary figurative sculpture that explores the emotional resonance of scale, color, and anthropomorphism. It gestures toward pop art, consumer culture, and the aesthetics of public spectacle, while remaining enigmatic in its intent. Is the bear a guardian, a visitor, or a mirror? Its presence is both humorous and haunting—an invitation to pause, reflect, and perhaps smile back.

That large language models are able to accomplish this and somehow make these texts even duller than they already are is, let's be fair, an accomplishment but if these same texts being imitated are the pinnacle of the cultural heritage sector's knowledge and expertise then I think we're in for some bumpy roads in the near future.

And yet, I can't help but wonder whether there's a way to use these systems, instead, to make things up about art. All jokes aside about large language models being non-deterministic and returning something different by design, what would it mean to be presented with a novel, odd-ball interpretation every time you looked at a work of art? Would this be a kind of revisiting, in the way that I keep talking about how the role and function of museums should be to foster revisiting? To test this idea I uploaded a photo of the Wayne Thiebaud painting 18th Street Downgrade, located in Terminal 3, to a large language model with the following prompt:

This image is a painting by the artist Wayne Thiebaud. It is described this way: In the early 1960s, Thiebaud was acclaimed as one of the defining practitioners of Pop Art, and has continued to be considered one of California’s most important and influential artists. In the late 1970s, the artist concentrated on a series of San Francisco cityscapes that became a signature theme. Created from a combination of direct observation and memory, these paintings present a dizzying perspective of San Francisco’s extraordinary topography. Examine the image and make up a far-fetched story about the scene that is depicted. Don't make Wayne Thiebaud the subject of the story.

This is an excerpt of what came back:

Memory Rain is the foundational concept. It’s the rain itself, imbued with the collected experiences, emotions, and psychic residue of everyone who has ever been affected by San Francisco. Every drop carries a fragment of the past – joy, sorrow, trauma, love, loss – all layered upon itself. It's a constant, subtle accumulation.

Echo Rain is the manifestation of Memory Rain. Think of it like this:

Memory Rain is the source. It’s the underlying current of imprinted experience.

Echo Rain is the ripple effect. When the Memory Rain becomes concentrated – perhaps due to a specific location, a certain time of day, or an individual's focus – it generates a visible, tangible effect. This is the “echo” – the solidified remnants of those memories becoming momentarily visible as rain.

I probably didn't help things by going on to ask the model whether it would be possible to make fancy cocktail ice cubes out of memory rain which eventually prompted it to say this:

Elias Thorne, after years of experimentation, perfected a process to capture and solidify memory rain into perfectly formed ice cubes. These weren’t just beautiful; they were potent distillations of San Francisco’s past.

The model went on to outline the complications and social implications of “Echo Cubes”, described as a “luxurious peril”, this way:

Unpredictable Echoes: The emotional resonance of a cube wasn’t consistent. It depended on a multitude of factors – the current emotional state of the consumer, the intensity of the memory rain, and the precise conditions of its creation.

Emotional Overload: Repeated consumption could lead to a state of emotional instability, blurring the lines between the present and the past.

Echo Bleed: In extreme cases, the emotional residue from the cube could bleed into the consumer's psyche, causing long-term psychological damage.

The Hold: Prolonged contact with the ice cube could cause the memory to cling to the drinker, eventually trapping them within a loop of the past.

Echo Cubes quickly became a status symbol, sought after by wealthy patrons and thrill-seekers. However, their dangerous nature led to a shadowy underworld of black market dealers and cautionary tales.

Which honestly sounds a lot like chatbots, and large language models, in 2025.

If ever you needed tangible evidence that the materials a large language model is trained on influence its output I would submit to you that this is it. Seriously, what on earth was the model trained on to yield this kind of answer? Even accounting for the use of the qualifiers far-fetched and fantastical in the initial prompts the answers are so steeped in Barbary Coast mystical woo-woo that the text reads like the science-fiction-and-fantasy-meets-self-help equivalent of the popular misconceptions of San Francisco, held by people who don't live here, that anyone cares about the so-called Painted Ladies or calls the place San Fran.

I should point out that this text was produced by an open-weight model from Google and maybe their commercial offering would do better. But that, in turn, raises an uncomfortable question: Why? The other test I like to apply to LLMs is simply to ask them, without fantastical qualifiers, to Tell me about SFO Museum. I make a point of performing this test using the free and open models because, whether out of economic malice, negligence or necessity, these are likely to be the models that will be most widely deployed in everything that isn't a high-end product. The results are about on-par with the memory rain and echo cubes.

Do these texts tell us then that, going forward, accuracy will be forever metered? Is the availability of authoritative knowledge just one more rent-seeking opportunity for vendors to exploit at the expense of the larger community?

I think this question demands that museums and the broader cultural heritage sector ask themselves whether we have a responsibility, in 2025, to stick to the facts. Both in light of, and in contrast to, my suggestion we use generative systems for fun and whimsy. That there is now, more than ever, a need for us to be fact-anchors, so to speak, in a world absent any other kind of gravity to hold things in place. And how much of that mandate do we forfeit by turning a blind eye to all the problems manifested by generative systems?

One of the things I haven't discussed in this talk is that there is scant effort being made in the cultural heritage sector to engage with generative systems outside of being simple consumers. Maybe one of the big vendors will give us discounted rates for a limited time and we can achieve great things but at the end of the day we will not be in a position to control our own destiny nor will we have done anything to change the power dynamic of surviving on the benevolence of someone else's handouts. We simply don't have the knowledge, skills or means to deploy these systems and little to no collective work is happening right now to change that.

There are some encouraging efforts, like the work of First Nations in New Zealand to create their own large language models, on their own terms and without the need to build city-sized data centers, to preserve and revitalize Māori languages, or the Swiss AI Initiative to develop a large-scale, open, multilingual language model. It is worth noting, though, that both of these projects are more directly associated with what we refer to as civil society organizations rather than cultural heritage organizations and I would urge you to keep that in mind, and perhaps ask why, as you think about the role of generative systems and museum collections.

I am going to wrap up this talk with a list of other people's voices I think you should be aware of when you're thinking about all of this stuff.

  • Karen Hao is a reporter and the author of the book Empire of AI which is, less-importantly, a racy tell-all history of OpenAI to date and, more-importantly, a well-reasoned critique of and rebuttal to the financial, environmental and human costs of the AI and machine-learning carnival we are all being forced to live through right now.
  • Kate Crawford is a researcher who has been studying the Pandora's box opened up by AI for over a decade. She is the author of the book Atlas of AI and she just did a talk at the Long Now Foundation, titled Mapping Empires which is available online and well worth watching.
  • Mia Ridge is the Digital Curator for Western Heritage Collections at the British Library. She is the person who has probably thought longer and harder than anyone else about crowd-sourcing (and what are large language models made of if not a kind of involuntary crowd-sourcing?). Mia has been one of the few people in the cultural heritage sector actively engaged in understanding and evaluating the uses of machine-learning systems and collections from the very beginning.
  • Timnit Gebru is, unfortunately, best known as the AI researcher and engineer at Google who was fired for simply pointing out some of the problems and risks inherent in machine learning systems and that maybe we shouldn't sweep them under the rug. She has since founded the Distributed Artificial Intelligence Research Institute to pursue independent research on the development and deployment of AI systems.
  • Effie Kapsalis. Effie died in 2022 and she did not live to see the ways that AI has impacted the cultural heritage sector. Effie worked most of her career at the Smithsonian where she helped make the organization one of the first participants in the Flickr Commons project, where she was the program officer for the American Women’s History Initiative and where she was the pointy-tip for the Smithsonian's OpenAccess program. Effie's work is worth knowing about on its own merits but also as a way to imagine what she might have said about the issues confronting the cultural heritage sector today.
  • Victoria Song is a staff reporter at the online publication The Verge. I mention her because she has been writing extensively about wearable computing and especially about Meta's augmented reality glasses which are basically cameras strapped to your face for Meta (Facebook) to observe the world through.

Song's work is important because it does a good job of discussing the social and privacy implications of Meta's glasses and because I think it's worth paying attention to these kinds of devices for this reason: The current trend in artificial intelligence and machine learning circles is moving away from large language models to something called world models where the training sets are not the text of old books and online publications but a never-ending live video feed of the world around us. If that combination, of trends and technology, doesn't seem terrifying, or at least unsettling, I would urge you to think through the implications a few more times.

Thank you.
