Exploring AI Hype, Hope, Help, and Harm

This recording is from the Future of Museums Summit held October 29–30, 2024. Explore the transformative potential and practical realities of generative AI in cultural institutions.

Presenters include:

Brett Renfer, Senior Project Manager, Emerging Technologies, The Metropolitan Museum of Art

Bruce Wyman, Consultant, USD Design, MACH Consulting

Liz Neely, Curator of Digital Experience, Georgia O’Keeffe Museum

Robert Stein, Chief Information Officer, National Gallery of Art

Keith Krut, Manager, Analytics & Enterprise Architecture, National Gallery of Art

Transcript

Neal Bilow:

Hello everybody. I’m Neal Bilow, CEO of Terentia. I want to take a moment to express our excitement about being a sponsor for this year’s Future of Museums Summit and specifically the AI Adolescence track. We’re thrilled to be supporting the important work of museums and to be part of this inspiring gathering of leaders who are shaping the future of our industry.

At Terentia, we’re passionate about helping cultural institutions thrive in the digital era. Our innovative solutions, including a cutting-edge digital asset management system and a collections management system, are designed to help museums better engage their audience, streamline operations, and make the most of their collections. Our artificial intelligence tools are designed to provide deeper insights from collections and create new opportunities for storytelling. We are here to help museums harness these advancements in ways that elevate their mission and sustain their impact.

The Future of Museums Summit is a crucial space for dialogue on how museums can continue to evolve in today’s rapidly changing environment. At Terentia, we share your vision for preserving cultural heritage while embracing innovation and technological progress. We believe with the right technology, museums can bring history and culture to life in ways that inspire and educate generations to come. As part of this ongoing commitment, I want to personally invite you to visit our virtual booth during the summit.

And while you’re there, don’t miss the chance to sign up for our upcoming webinar series on responsible AI for Museums. I’ll be joined by two amazing thought leaders, Catherine Devine, the former global head of Microsoft’s Museums and Libraries, and Nik Honeysett from the Balboa Park Online Collaborative, to explore how museums can responsibly leverage AI to enhance their experiences. This will be a three-part webinar series where we go into the insights of how you can use AI and how it’s being done today, along with things that your organizations should be thinking about from a policy standpoint when it comes to the use of AI.

We look forward to engaging with you during the summit and continuing our shared journey to build the future of museums together. Thank you again for the important work you’re all doing. Be sure to stop by our virtual booth and sign up for that webinar series. We’re excited to see you there and keep pushing the boundaries of what’s possible. Have a great summit.

Robert Stein:

Good afternoon, everyone. I am Robert Stein and I’m the chief information officer at the National Gallery of Art. Thank you for joining us this afternoon for our session, which is titled Exploring AI: Hype, Hope, Help and Harm. I’m joined today by a number of great friends and colleagues who are each doing unique and dynamic work with AI in their own organizations and teams.

So I would like to kick off by a little bit of introduction, and Liz, let’s start with you. Would you please introduce yourself to the crowd and say one or two words before we move on to the next panelist?

Liz Neely:

Hi, so happy to be here. I’m Liz Neely. I’m the curator of Digital Experience at the Georgia O’Keeffe Museum here in Santa Fe, New Mexico, and can’t wait to hear more and talk with my colleagues.

Robert Stein:

Thank you, Liz. Brett, could you introduce yourself?

Brett Renfer:

Hi, I am Brett Renfer, senior project manager of Emerging Technologies at The Metropolitan Museum of Art in New York.

Robert Stein:

Wonderful. Keith?

Keith Krut:

Hi, I’m Keith Krut. I lead data strategy at the National Gallery of Art, which includes data science capabilities and these days a lot of AI.

Robert Stein:

Thank you, Keith. And last but not least, Bruce?

Bruce Wyman:

I’m Bruce Wyman. I’m a consultant. I work with a variety of different museums on museum design, technology strategy and development, and exhibits.

Robert Stein:

Awesome. Amy, could we switch to the slides please? Great, thank you.

We’re actually going to start our discussion with a poll right away. We have several polls as part of our session because we want to really engage you in that work. The first one is just getting a sense of where everyone is in their own AI journey, recognizing that not everyone will have the same experience. So we’re really curious about what your current feelings are about how AI may impact the work of museums in the next five years. Do you think it will be generally helpful, generally harmful, or maybe you think it’s just too soon to tell? Simply respond to this poll and vote. We’ll wait just a second while the poll’s results are coming in and then we’ll move on.

But in the meantime, let me ask Liz. Liz, how would you answer this question?

Liz Neely:

I was probably the first one to answer it. I believe it will be generally helpful. But we have to have all of the pieces in place and do it in service of our missions. In service of our mission, I think it’s going to be really helpful.

Robert Stein:

Thank you. Keith, what do you say?

Keith Krut:

I think there’s a lot of opportunity here and we need to be both careful and creative in how we use it.

Robert Stein:

Thank you. Bruce, where do you land on this topic?

Bruce Wyman:

I’m a firm believer in experimentation that until you actually experiment with a bunch of stuff you don’t truly know if it has been helpful or harmful, even though I’m optimistic and believe it’s going to be generally helpful.

Robert Stein:

Great. And Brett, you want the last word?

Brett Renfer:

Barely. I mean, Bruce’s answer was so good. I feel mostly generally optimistic. I think it’s going to be really helpful. But I think as Bruce noted, it’s sort of too soon to tell exactly what it’s going to be helpful for or with.

Robert Stein:

Okay. And it seems like we’ve got some results from our audience. Audience, you’re a bunch of optimists. Half of you think it will be generally helpful, 6% are pretty much convinced the other way around and a bunch of you are in the middle ground. I think I’m probably with you. I think it’s too soon to tell and it has a lot to do with how we choose to engage with AI ourselves.

Okay. Heading back to the slides for a second. For the next phase of our presentation, we want to share with you some of the work that each of our panelists has been doing at their own institutions. So we’re going to do a series of rapid-fire case studies. Each presenter is going to share with you some way in which their own work has engaged with AI. And in the meantime, I would encourage you as audience members to add your questions into the questions section of the Airmeet platform. So ignore what it says on the slide here. Don’t just use the chat; use the questions section. I, as moderator, will be paying attention to those as we go, and we’ll be able to answer some of your questions at the end of the case studies. So be sure to use that up-vote feature because that will really help me select the things that you’re interested in hearing about.

Brett, I think you are the first one up to bat on case studies.

Brett Renfer:

Thanks, Rob. So I’m going to talk about an experience we developed with OpenAI for our Costume Institute show in the spring of this year. This is a very fast overview. The show was called Sleeping Beauties: Reawakening Fashion, and broadly it was about using technology and all the senses to bring these garments, which once they enter the collection are never worn again, back to life.

So the show ended with a wedding dress by the designers Callot Soeurs worn by the early 20th century New York socialite, Natalie Potter, about whom we had a trove of research that we uncovered through the course of the project. So on the right you see a screenshot from the web-based chatbot that we built with OpenAI’s support. We let people ask questions about the dress either directly through typing it in or we offered some sample prompts of what they might enter. I’ll talk a little bit later about what kind of things that we fed into the chatbot to start.

We also included an Easter egg. There’s a public domain image of Natalie Potter actually wearing the dress at her wedding, and so we used OpenAI’s Sora model to reawaken that photo, which was very on theme for us. If you asked what she looked like or said, show me a scene for a wedding, or anything that fit into that pattern, you got this quick video that started with the image on the left, and then she kind of turns towards you and then she walks away from the frame. In the gallery, essentially, the wedding dress is sort of to the right, off screen. We put a dynamic QR code and a screen here on the left along with the photo of Natalie. We had a lot of QR codes in the museum and in the exhibition, so it was important for us to differentiate from that. It’s also quite big so you can scan it from far away.

And then about halfway through the show we also introduced an additional QR code in line. I’ll talk more about this later, but that was super successful, especially building around how people are actually spending time in the museum.

Broadly, what we ended up doing is kind of breaking down the information into different lenses about the dress and then pairing those with different sources of data. So about the dress, about what shoes she wore, how big it was, et cetera, that was really just pulling from our object record. We had a lot about Natalie herself that we compiled from our research. Same thing about the dress’s designers, the Callot Soeurs. We built a mannequin based on a Brancusi sculpture we could talk about. And then we actually made information, I’ll show you this code screenshot information, into a separate text file so we could easily update it over time. We’ll not go through that much of this here, but really just to say you can see a lot on screen. A lot of the work we did was crafting her personality. Yes, they use her a lot, it’s unavoidable, but the rest was about giving all this information up front in a really simple digestible format and putting people onto tracks related to these sort of lenses.

We also built a question-limiting feature, which was kind of interesting. We didn’t end up using it, but as people consider these types of experiences, it’s really important to think about the overall impact on the visitor experience. So we had a chance to change it if it got too crowded within the gallery. We also built the experience to work only on site. So we checked if people were on Wi-Fi or checked their overall GPS coordinates and things like that. If they weren’t on site, we had fallback pages. We actually had a kill switch too; thankfully we did not have to use that. But we created these pages and these different states to accommodate the different journeys there.
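The gating Brett Renfer describes above (an on-site check, an optional question limit, a kill switch, and fallback pages) can be sketched as a small state function. This is an illustrative sketch only, not The Met's implementation: the names `Visit` and `page_state` and every state label are hypothetical, and the on-site check is reduced to a boolean that a real system would derive from Wi-Fi or GPS signals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Visit:
    on_site: bool          # assumed result of a Wi-Fi or GPS check
    questions_asked: int   # questions this visitor has already sent


def page_state(visit: Visit, kill_switch: bool = False,
               question_limit: Optional[int] = None) -> str:
    """Return which page to show; all state names are hypothetical."""
    if kill_switch:
        return "unavailable"        # staff turned the experience off
    if not visit.on_site:
        return "offsite_fallback"   # visitor is not in the building
    if question_limit is not None and visit.questions_asked >= question_limit:
        return "limit_reached"      # optional cap for crowded galleries
    return "chat"
```

Leaving `question_limit` as `None` by default mirrors the transcript: the limit existed but was never switched on.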

And I’m going to briefly go through some data; happy to answer more questions on this at the end. It’s pretty interesting. It was a very well attended show. About 17,000 people used it overall, which is a small portion of the 350-ish thousand people that visited, but there was a lot of engagement time: those that did interact spent almost three minutes, which was pretty exciting. And you can see on the bottom left, we more than doubled our usage of the experience when we put it in line, which was amazing. The other stat to call out on the right is that 70% of people actually typed in their own question, even though we gave them really simple and, we thought, pretty good, straightforward pre-populated ones, which was pretty interesting to see.

And then our last insight: we did some semantic modeling of the responses, so we’d actually capture all the things that people wrote in. You can see these colored clusters. Those are the default questions. Everything else is like this huge, almost random splat. You can even see on the right there’s a category of just miscellaneous, which is kind of frustrating and interesting in and of itself. A lot of the responses are just all over the place. People kind of talked to it like it was a person and asked a lot about her life, and a much smaller portion than we thought actually tried to break the chat, which is pretty interesting.

So a lot to learn. Happy to answer more questions. But I would say overall it was a really positive experience, and we learned a lot from it.

Robert Stein:

Thank you, Brett. Folks, you’re adding great questions into the chat or into the question section. Please keep going. And Liz, let’s hear about your work at the Georgia O’Keeffe Museum.

Liz Neely:

Thanks so much. If you could move to the next slide, Brett, that’d be great. So just in case you don’t know, the Georgia O’Keeffe Museum, as I mentioned before, is in Santa Fe, New Mexico. We have galleries. We have historic homes. But I’m going to skip to the next slide actually because it’s more pertinent here.

So one of the unique things about being part of a single artist museum is that actually people come to us about all things O’Keeffe, not just about our collection, but about all of the works that this artist has done and how it connects to other stories. So this really drives a lot of what we do and what I’m going to talk about is much more at a visioning future experimental stage than what Brett just showed, but very similar to it. So if you can move to the next slide.

Because people want all things O’Keeffe, we have designed quite a few research products and research data products, and most of our focus grouping and most of our input for these has been for other museum professionals, for writers, researchers, scholars. And so what we started thinking about is, with generative AI, could we actually broaden the audience of these things? If you can, go to the next slide.

And a hat tip to my thought partner Dale Kronkright, who some of you may know; he’s our head of conservation. We were really thinking about how might we, through generative AI and the data sets that we’ve put together for all things O’Keeffe, reward curiosity by leveraging human desire for dialogue? So that’s kind of our organizing principle here. To start with that, and again, this is very experimental, we just started with, well, can we get some questions from people who are not academics or museum professionals? This is a very small sample set; we’ll do a bunch more later. You can see, though, that we got zero people that self-identified as academics. So if you could just show the next slide, please. And we got a sense for not only how they say the questions, but also what those realms of questions are. So if you could move to the next slide and actually just skip to the next one.

We did some initial experiments. We don’t have programmers on site, so, much like people in the chat, this is more of a very small operation. So hopefully that can show that you can experiment even without a tech team. So we did Amazon Bedrock and OpenAI with custom GPTs. Next slide. And then Bruce Wyman, thank you, helped us out on some things because ours is structured data in an unstructured world. A lot of generative AI is looking at unstructured data, and as soon as we added in our structured data, it would, for a variety of reasons, lose some things. So we are experimenting with actually marrying the two together. So next slide. What I really want to get into is actually where we’re looking at this for the future. So sorry, for some reason this got blurry, but here are some examples.

So here’s “What blues did Georgia O’Keeffe use?” What this really shows us is that it understands an inquiry using non-expert language, which maybe our collections online or other databases don’t quite understand. So that’s something we want to lean into. It responds according to our instructions most of the time. Here you’ll see that whenever it mentions an artwork, I’ve told it to also say the date of that artwork and the catalogue raisonné number, because there are a lot of works called Untitled or Abstraction. And next slide, please. Great.

And also, it’s augmenting with external resources. So we see here it’s Winsor & Newton, the Museum of Modern Art, so it’s not limited to what we have pulled together. So this also means that we can augment what we have, but we also are looking at those sources as, well, these are also potential partnerships for expanding this. Next slide. “What other artists use similar blues?” Again, this is actually pulling, this allows… If you next slide. This allows for follow-up questions. It allows an individual to follow one’s own path of curiosity. So this is what I’m super interested in. How do we reward curiosity by allowing someone to follow that path that they predetermine? We can’t really do that with our current systems; this is a new kind of way that we can interact with people. Next. And it extends that story beyond our focus, which is also great because we are focused on one thing, these artists are all from another section. But you can see that they are canonical. So there’s an area for improvement that we could actually be more inclusive. Next slide.

What the heck is Cerulean blue? Next. If we could see, this allows for clarifying questions without judgment. I’m not going to feel stupid. I’m not asking a curator. I’m not asking someone that if I’m supposed to know. So it really dives in without judgment and, next, allows for those fewer barriers to follow curiosity through the dialogue.

So I’ll wrap it up with the last slide here and just some tips about this: experiment with external teams. Publishing good digital content has never been more important because we’re adding to that full set and we can identify where there are gaps. Leverage what we already know about engaging users, and determine the safeguards that you need to feel good about diving in. Thank you.

Robert Stein:

Thank you, Liz. Let’s hand over to Keith Krut from the National Gallery of Art.

Keith Krut:

Great. Thanks, Brett. Can we go to the first slide there? So we’ve taken inspiration from NASA’s moon missions and particularly how many invaluable technologies spun off of those initial landings. So we have our moonshots and spinoffs here. I’m going to talk about one of these three. But just at a high level, we’ve been looking at visual description of artworks, understanding visitor feedback and experience through different ways to engage via text and then lastly, the ability to extract data and build essentially graph databases and relationships across our cross-collection subsets. So if anyone’s interested in those other topics, happy to discuss those another time as well.

The visual descriptions work though, just as a bit of framing here. In a nutshell, we developed some guidelines for writing image descriptions here that really help set a standard, and Natanya can post a link in the chat here to where those are available on our website if you’d like to see them. This could be an hour presentation in and of itself. Our lead for this, Lorena Bradford, did really exceptional work in partnership with others, and we’ve been working for several years now to write these by hand and have produced a couple thousand of them across our website now.

But we wanted to see if we could responsibly accelerate the production of those descriptions so that we could better serve this audience. And we thought maybe we could pair that guideline with AI to help generate faster descriptions. And we’ve been working on this in partnership with Brad and Bruce and some others at some other museums as well. Now how we do that, if we go to… Yeah. Thank you, Brett. So we found that there’s a couple of things that make a really critical difference in this process. So instead of just dropping an image into ChatGPT or some other model and saying, please describe this, there were some things that we’ve developed and tested that make a real difference in how well it can produce a description.

One here is that if we have a multi-step approach or multi-agent design, as you might hear, we can really point the AI in the right direction. Remember, these models are really just predicting the next word. So if we’re going to help them along, we want to kind of give them a push in the right direction. So the first thing that the process does is to categorize an image by way of whether it’s a landscape or a portrait, et cetera, and then apply criteria for that specific type of image in generating an initial draft.

But then there’s another agent that steps in here and refines the first description. So it’s reviewing what it produced itself to sort of weed out certain parts that might be problematic or not the right kind of language. That made a huge difference having that multi-step process, and it’s something that we’re happy to share and would love to learn how others are approaching this as well.
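The multi-step approach Keith Krut describes (categorize the image, draft with criteria for that category, then have a second pass refine the draft) can be sketched as follows. This is a hedged illustration, not the National Gallery's code: the `CRITERIA` text, the prompts, and the `describe` function are hypothetical, and `model` stands in for any text-generation call supplied by the caller. A real system would also pass the image itself rather than a string reference.

```python
from typing import Callable, Dict

# Hypothetical per-category criteria; real guidelines would be far richer.
CRITERIA: Dict[str, str] = {
    "portrait": "Describe the sitter, pose, clothing, then background.",
    "landscape": "Describe foreground, middle ground, then distance.",
}
DEFAULT_CRITERIA = "Describe the main subject, then surrounding details."


def describe(image_ref: str, model: Callable[[str], str]) -> Dict[str, str]:
    """Run classify -> draft -> refine, returning every intermediate result."""
    # Step 1: categorize the image so the right criteria can be applied.
    category = model(f"Classify this image as one word: {image_ref}").strip().lower()
    criteria = CRITERIA.get(category, DEFAULT_CRITERIA)
    # Step 2: generate an initial draft using the category-specific criteria.
    draft = model(f"{criteria}\nWrite a visual description of: {image_ref}")
    # Step 3: a second pass reviews and refines the first draft.
    refined = model(f"Revise this description, removing subjective language:\n{draft}")
    return {"category": category, "draft": draft, "refined": refined}
```

Returning the category and the draft alongside the refined text keeps each step inspectable, which matters when human reviewers later judge where a description went wrong.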

But the second area that made a huge difference was the human review. This is something that is depicted a bit in the top right of this page where we wanted to give a real opportunity by which people could actually review the AI produced content both to refine it as well as to generate a research data set that we could use to feed back into our process and continue to improve it. If you click to the next slide here, Brett. What essentially we’re doing is that as an individual generates an edited response in a viewer app that we’ve created that allows them to edit and save that revised version and evaluate it, we’re then using their evaluation in conjunction with some computer review or quantitative metrics that are gauging the grounded-ness or the relevance and coherence of the GenAI output, but also looking for things like vector similarity between the human generated text, the AI generated text, and the human review text. So that gives us a really good opportunity to hone in on what’s working and what isn’t, to then run into a feedback loop that can help us improve our approach at large.
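The vector-similarity comparison mentioned above, between human-written, AI-generated, and human-reviewed text, can be illustrated with a small sketch. It swaps in plain bag-of-words vectors where a production system would most likely use learned text embeddings, so treat it as an assumption-laden stand-in; `bag_of_words`, `cosine`, and `pairwise_similarity` are hypothetical names.

```python
import math
import re
from collections import Counter
from itertools import combinations
from typing import Dict, Tuple


def bag_of_words(text: str) -> Counter:
    """Count lowercase word tokens; a crude stand-in for an embedding."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def pairwise_similarity(texts: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Similarity for every pair of named texts, e.g. human vs. AI vs. reviewed."""
    vectors = {name: bag_of_words(t) for name, t in texts.items()}
    return {(x, y): cosine(vectors[x], vectors[y])
            for x, y in combinations(vectors, 2)}
```

A reviewed text that sits close to the AI draft suggests the reviewer changed little; one that sits closer to the human-written text suggests heavier correction.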

This research is going to continue as models evolve, the descriptions will evolve and so will the process. So really invite any collaboration in this space for anyone who’d like to partner with us on it. I think that’s good for now. If we can go to the next slide.

Robert Stein:

Awesome. Thank you, Keith. Thanks all of you presenters. We’ll have one more case study from Bruce Wyman next. I just want to call out to the audience. You guys are doing a great job of adding some good questions. Thank you so much for your feedback in the chat. That helps us know that we’re connecting with you. So thank you for that. Go and look into questions and up-vote the ones that you want to hear about, we’re heading to that next. But first, Bruce Wyman.

Bruce Wyman:

Thanks. Yeah, I want to cover a couple of different things. I’ve been working with almost everybody that’s on this session and doing different parts behind the scene because part of what I like to do is look into the different ways that we can actually develop this stuff further. What is the next step? And so part of it is I’m a believer in skating to where the puck is going to be, but we don’t know where the puck is going to be, so I skate in many directions at the same time. Brett, next slide.

So one of the things that Keith had just alluded to is the step right now where we have an AI look at itself and try and begin to review stuff and so there’s a natural progression of that. As we’ve generated a number of internally approved descriptions we want to leverage those for the future. And so we’ve been doing what’s called a RAG lookup, which is retrieval augmented generation, in which we have an AI look at the stuff that we’ve done, look at the past, things that we’ve done, learn from those or reference those and then try to see if it wants to influence it. So in this case, we’ve created a multi-step process which begins to look at that and create this as an experimental process. So in this, where we have each step that begins to look at what the artwork is, do a description, compare it to some old stuff, use some fine editing because we like some particular ways to think about texts like if something’s in the distance or the foreground. And then finally, a last attempt to remove subjectivity because it turns out that AI is an enthusiastic first year art history student and we wish to remove that into the end game.
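The retrieval-augmented generation (RAG) lookup Bruce Wyman describes can be sketched in two parts: retrieve the approved descriptions most similar to a new draft, then place them in the prompt as references. This is a minimal sketch under stated assumptions: it uses simple word overlap for retrieval where a real system would more likely use embeddings and a vector store, and `retrieve`, `build_prompt`, and the prompt wording are all hypothetical.

```python
import re
from typing import List


def words(text: str) -> set:
    """Lowercase word set used for a crude similarity measure."""
    return set(re.findall(r"[a-z']+", text.lower()))


def retrieve(query: str, approved: List[str], k: int = 2) -> List[str]:
    """Return the k approved descriptions with the highest word overlap."""
    q = words(query)

    def overlap(doc: str) -> float:
        d = words(doc)
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return sorted(approved, key=overlap, reverse=True)[:k]


def build_prompt(draft: str, approved: List[str], k: int = 2) -> str:
    """Assemble a prompt that shows retrieved examples before the draft."""
    examples = "\n".join(f"- {d}" for d in retrieve(draft, approved, k))
    return ("Approved descriptions for reference:\n"
            f"{examples}\n\n"
            "Revise the draft to match their style:\n"
            f"{draft}")
```

The model never retrains on the approved descriptions here; they are only looked up and shown as context at generation time, which is what distinguishes retrieval from fine-tuning.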

Next slide, Brett. Then we end up creating something like this where we can take all of those initial things and we can reorder them. We can figure out if it makes better sense to do different steps of that further. We get to see what the output is at each step. And then play that all the way through to see do any of these steps actually make a difference? Do they actually help us get where we want to go? Is it important to remove subjectivity earlier in the process or later in the process? And begin to play with that. But I’m a firm believer that experimentation of what we want to do to figure out where we want to go is an important part of that.
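The experiment described above, reordering steps and inspecting the output at each one, can be sketched as a tiny harness. The names `run_pipeline` and `compare_orderings` are hypothetical, and the steps are assumed to be plain text-to-text functions; in practice each step would wrap a model call.

```python
from itertools import permutations
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[str], str]]  # (step name, text -> text function)


def run_pipeline(text: str, steps: List[Step]) -> List[Tuple[str, str]]:
    """Apply steps in order, keeping the output after each one."""
    trace = []
    for name, step in steps:
        text = step(text)
        trace.append((name, text))
    return trace


def compare_orderings(text: str, steps: List[Step]) -> Dict[Tuple[str, ...], str]:
    """Final output for every ordering of a non-empty list of steps."""
    return {
        tuple(name for name, _ in order): run_pipeline(text, list(order))[-1][1]
        for order in permutations(steps)
    }
```

With two toy steps, one that strips the word "stunning" and one that truncates to 12 characters, the two orderings give different results, which is the kind of difference the experiment is meant to surface. Note that the number of orderings grows factorially, so this only suits a handful of steps.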

Next up, Brett.

So the other part of that is to begin to look at all the human slides that we’ve done. I like to go on, let’s call them spirit walks in which we look at and the way to analyze all these different parts of the 1,500 or 2,000 different descriptions that we did. I’ve had AI run through and generate in multiple steps, ways of looking at how do we write these different things? What are the sequence of things that we do? When we use particular phrases how do we go to the next one? Begin to think like, hey, if we begin to look at there’s a pattern of how we write them, how do we begin to teach AI what that pattern begins to look like? It turns out you can go to an absurd level of complexity. This is actually the mid-level of complexity. There’s one that has about three times this amount of stuff that I would never show to the world. But if you like complex images, ping me afterwards and I’m happy to share that with you. Next slide.

So one of the more recent projects has been in collaboration with The Met and a collection of historic exhibit photographs. And part of it is that for their DAMs, they wish to know which times or at what points in the past did the artworks we care about appear in our past exhibits, and we want a way to begin to do that. We have plenty of great images like this. Next slide, please.

And what we really want to do is begin to detect those things that are there. And part of that is looking at different kinds of machine learning models. Again, part of, there’s a lot of experimentation to try and begin to look at stuff. Next slide, please. And so again, create an experimental unit that takes an image of the gallery and there on the left-hand side you can see individual artworks that are detected. Each one of those artworks that are detected then are pulled out as a separate field to query. And we send it to a database. We’ve done a vector database of image embeddings. And yes, I’m going to use a bunch of jargon there. But essentially what it is, is turning every image that The Met has into a way for a computer to be able to recognize it in the distance.

And so we send all of these images of individual artworks and say, hey, does this look like anything that you know? And we get a set of responses. It comes up with a set of qualified responses of these look more likely, these look less likely, some that I’m not actually sure of. But then, next slide, please, we do end up with an actual set of refined results. So as we begin to look at how we improve that, that works pretty well. But as you look at an image like this, it’s easy to see the pictures that stand out. But we also end up with a lot of things that look like images to the computer system as well. If you look at the bottom of the wall there, you can see individual squares that for a while we ended up detecting that sort of stuff. And so it took us a while to evolve, how do we actually see artworks as opposed to things that are just square, things that look different? Next slide, please.
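The lookup step, where each detected artwork is compared against stored image embeddings and the results come back as more or less likely, can be sketched like this. It is an illustration under stated assumptions: embeddings are taken to be unit-length vectors (so the dot product equals cosine similarity), the catalog is a plain dictionary rather than a real vector database, and the `likely` and `possible` thresholds are hypothetical values.

```python
from typing import Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_matches(query: Sequence[float],
                 catalog: Dict[str, Sequence[float]],
                 likely: float = 0.9,
                 possible: float = 0.7) -> List[Tuple[str, float, str]]:
    """Score every catalog embedding against the query, best match first."""
    scored = sorted(((dot(query, vec), object_id)
                     for object_id, vec in catalog.items()), reverse=True)
    results = []
    for score, object_id in scored:
        if score >= likely:
            label = "likely"
        elif score >= possible:
            label = "possible"
        else:
            label = "unlikely"
        results.append((object_id, score, label))
    return results
```

A dedicated vector database does the same comparison with an index, so it does not have to score every stored image for each query.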

And so we do some training. We try and say, hey, can we take a bunch of these images and actually begin to outline all the different things that are artworks in those images? And we have the AI begin to learn what those look like. And so on the left-hand side, you can see it trying to go through, let me turn all the things that you’ve outlined upside down, just spin it around, reshape it. And again, then I try and take that and identify things that are on the right-hand side. That’s the way you validate that it actually ends up learning something. Next slide, please.

And so when we run that through, we can start to have it identify individual things. The thing that’s interesting is that you’ll see a bunch of things that are misidentified, and that ends up being okay because what we’re doing is just trying to find things that we want to look up against a database. So even if it thinks that there’s a statue that’s actually a picture of a person in a painting. Next slide, please. The painting still comes back with the right thing from the next slide, Brett, there we go. So look on the left-hand side there, you can see there’s a statue that’s in that painting, but it’s okay. That painting gets identified. We actually do get to see the actual artwork, and it’ll throw away the things that are inside other things and so we can do some calibration against that sort of stuff.
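Throwing away "the things that are inside other things" can be sketched as a filter over detection bounding boxes: any box fully contained in another box is dropped, so a statue detected inside a painting does not become its own query. The box format `(x1, y1, x2, y2)` and the function names are assumptions for illustration, not the project's actual code.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies entirely within outer (edges may touch)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def drop_nested(boxes: List[Box]) -> List[Box]:
    """Keep only boxes that are not inside another box.

    Exact duplicates contain each other, so only the first copy is kept.
    """
    kept = []
    for i, box in enumerate(boxes):
        nested = any(
            j != i and contains(other, box) and (other != box or j < i)
            for j, other in enumerate(boxes)
        )
        if not nested:
            kept.append(box)
    return kept
```

This strict containment test is deliberately simple; a real pipeline might instead use an overlap ratio so that a box poking slightly outside its parent is still treated as nested.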

But in essence, we are trying to figure out what are the ways that would be able to look at these things and use machine learning to do that? Brett, I think that’s it. Oh, yes. Obligatory screenshot model. Hey, there’s code and anybody who’s doing development, you can actually get AI to help you write the code. And there you can see on the right-hand side I have a chat window open helping me to treat parts of the individual code. And then the next slide, a shout-out to The Met that has actually made all of their material online available through open access and freely available for anyone to use. Thanks.

Robert Stein:

Awesome. Thank you everybody. Presenters, great job. You guys did awesome covering a lot of ground there. And audience, great job, you guys added some good comments and questions. We’re going to switch to that now. Amy, would you mind bringing us all back to video on the screen please?

Excellent. Hi, folks. So I’m going to try to summarize some of the questions that we’ve seen from the audience. And I’m going to start with one for Brett that received a lot of up-votes, and it is really about a little more of what kinds of documents and historical resources did you feed into OpenAI to make Natalie’s personality work?

Brett Renfer:

Sure. There’s two parts to this question. So for one, I saw one other question around her tone of voice and pieces of that nature. We know that she had a radio show, we tried really hard to actually find her real voice and we worked with her family directly as well. Could not find that. So we really just spent a lot of time trying to make a character that felt real, but not like a cliche. Like when we said, hey, you’re a 1920s New York socialite, just by default it was ridiculous and un-fun. And so we spent a lot of time in that kind of part of the process.

And then the second one on the data point, you’ll hear this a lot from people in general, we have to remember the rules of copyright and how you might write a book and things like that do not disappear when you’re working with AI. So there were some primary source documents that are out of copyright. So we were able to transcribe those and upload those directly. The other ones, and of course things that we own the copyright to, like the object record, and we wrote an essay on her for our book, all those things we could put directly into AI. Everything else was original synthesis of the research that we put in. So in the same way you couldn’t just copy and paste things from Wikipedia and put them in your book, we had to follow that process.

So a lot of that was going to some of the primary sources and, again, writing these new articles, and there’s a few pieces we could put in directly. We did have to transcribe them ourselves. I hope that answers the question.

Robert Stein:

That’s awesome. Thank you, Brett, this is a question that came in for Liz from Jessica Katz. Jessica asks, could you elaborate a little bit more on the types of safeguards or, quote, sidekick features that might be most helpful when incorporating generative AI into public facing tools? How do you balance accuracy and openness in responses?

Liz Neely:

Yeah, and I know we had a lot of talk about bias in the chat, so this is very related to that. And I think that in the experiment… I think there are a few ways that we can do it, and this is something that I kind of want to build a bigger evaluation project around. Hopefully we get a grant and do that. But I think that what we’re thinking about doing is what is the balance between what you tighten in on the data that we give and what do you allow outside, and do you allow all outside in there?

And then what we really are trying to think about is how do we… And I’ll give a shout-out to Kate Haley Goldman, who has been a thought partner on this so far, is how do we think about evaluating what comes in and then iterating, finding those gaps, finding where there is actually extreme bias, and is it because there’s not enough data out there or because it’s not looking at the right thing or because the prompt isn’t there, and is there a difference between how it’s evaluated by experts and by what is useful to the public? So more to come on that.

But I also think in the future it is really also what I’d like to see is AI checking AI so that it would check it against it in real time as we put out more real information. And I’m sure technical people out there would have better answers, but I think that these are the things that I’m thinking about when we are testing for bias, when we’re trying to, what are the stories that we want to put out there, for accuracy. So I think there are a lot of papers out there that show when we evaluate there’s accuracy. Like yes, those artists that were shown, those all white male artists did use that color of blue, but then there’s another column in that evaluation. But is that the full story? Are we just reinforcing?

So I think there’s a lot more work and people are doing evaluation on these things. I also, as one side note though, I want to say that our museums get it wrong too. So I think that definitely thinking about what are the standards that you want to hold up to, but also not putting ourselves on a pedestal, that these things are already happening.

Robert Stein:

Awesome. Thank you, Liz. Next question is for Keith from Caroline Marsh. Hi, Caroline. Caroline’s on our team here at the National Gallery, I’m pretty sure. Caroline asks, Keith, are these image descriptions already up on the website and when they’re used on the website, is there any indication that they’re AI generated? Keith, there was another question too that asked a little bit about the human review process. So if you could touch on those, that’d be awesome.

Keith Krut:

You bet. Yeah, I saw a third one too about why you even have human review. So I’ll try to touch on all three here. Caroline, thanks. The image descriptions, in short, there’s a first batch of them, about 100, that are moving into the website now. So I want to be clear that we are in training wheels mode on this. We’re early on in the process and we’re trying to be very deliberate in how we proceed now so that we can build something sustainable and responsible as we go.

And that human review is important in that first hundred because it’s so new. I mean, really, it’s hard to predict what some of these models are going to produce. So having a chance to check what’s there, for one thing, make sure that we’re avoiding anything that’s obviously problematic in what’s generated. But as importantly, we don’t want to compromise on the quality for this audience. So we want to make sure that we’re providing outstanding descriptions. And in addition, just objectively speaking, the image descriptions are supposed to describe, not interpret. And the AI loves to interpret. You’ll see that a lot in what it produces. So we want to try and make sure that we have a safety net to catch anything that might otherwise go into the site that shouldn’t.

So the human review is important for responsibly producing content as well as for generating research content for us to move forward in the right way. Now on the labeling issue, this is a really good one too, about transparency of when AI has been involved. We’re thinking through a few kinds of labels now, and one would say this has been generated by AI pending further review. So this would be if we were ever to push anything to the site that is exclusively AI generated without human review. We’re not there yet, but we know that we need to have that kind of caveat ready.

The other might say, this is drafted by AI with human review, or this is written with the assistance of AI. We want to make sure that we’re providing the right level of transparency so that folks know what it is that we’re looking at in the site. And this goes beyond the end user review, but also how you even store this information in our systems. We don’t want to just take an AI generated description and drop it into a DAM or our museum collection data without acknowledging that it wasn’t actually written independently. So we’re thinking through a lot of these now. I think, again, these are similar problems that we’re all facing, so the more that we think through them together, the better.

Robert Stein:

Good. Thank you, Keith. Nice answer to the question. Much appreciated. Thank you. Bruce, there’s a question from Sylvina that you are probably well suited to answer. Sylvina asks, what skills, expertise or resources do you need in-house to build these kinds of experiences? Bruce has done an awful lot of hands-on development himself and has worked with all three of the museums on the panel today. So Bruce, I thought this is a good one to toss your way.

Bruce, I’m not hearing you. You might be on mute.

Bruce Wyman:

Yeah, I’m muted so that way you didn’t hear me cough in the background.

Robert Stein:

[inaudible 00:38:24]-

Bruce Wyman:

The most important skill that you want is-

Robert Stein:

[inaudible 00:38:26]

Bruce Wyman:

… curiosity. Yeah, right. No, no. It’s curiosity, right? I’m not a developer by any stretch. I’m a gentleman coder at best. And although I’ve run teams and have a lot of experience in what I want to try and get out of it. What I’ve done over the last year and a half is I’m almost always working with any of these AI models to say, hey, how do I do this sort of thing? And in some cases, I ask it to generate versions of the code for me, and I know how to then do that in a coding environment, debug, look at the output, and I just kind of keep going through this iterative process. And even when I get errors, I give it to the AI. Sometimes I understand them. Sometimes I do not. I could not pretend to be able to do some of the stuff I do without some of the AI help.

So that’s probably the most important thing is just a willingness to stumble around and be curious and then have an idea of what you’re trying to do in the first place from a human perspective. And I think that gets you well on your way. And then talk to people like us.

Robert Stein:

Yeah, I will say-

Bruce Wyman:

[inaudible 00:39:22]-

Robert Stein:

… that’s maybe my favorite part of the museum sector is that everyone here and pretty much everyone else is willing to share with you all of the things that we’re learning and the things that we’re getting right and sometimes getting wrong. So I think I speak for the panelists that we’d love to hear from you.

Now we’re going to move on, and I don’t want to forget or neglect that there were a bunch of good questions from the audience about how do you learn more about AI? How do you grow your own skills in this area? And funny enough, that’s a great question because it happens to be the next section of our presentation. And don’t worry, all you policy nerds, there’s policy talk yet to come. But for the next section, let’s focus a little bit on the individual uses of AI, because a lot of engaging with this new technology that seems to be changing every time you look is really thinking about how you can integrate it into your work on a practical and daily basis. That is trickier than it sounds, but luckily, we have some panelists who have experience with this.

While we’re doing that, Amy, I’d like you to bring up our next poll if you could. So the poll is going to really try to get a sense of where you are in terms of using AI tools for your own work at your museum. We recognize that that can really be different for so many of you. So the different answers are, yep, I’m using AI for work on a daily basis, nope, or I’m not really sure that’s even allowed at my institution. And a third kind of open category for those of you who are still playing around with AI but really haven’t figured out what it’s good for. I’m kind of interested in Liz’s reaction to this initial question because I know Liz to be a great experimenter.

How’s it work for you, Liz?

Liz Neely:

Well, and I think someone brought this up in the chat earlier. I think it is a really good idea generator, whether it’s me alone or me with a group of when we’re thinking of ideas for naming of something or a direction to go with wording, just putting something in there. And we often don’t go with its suggestion, but it gives a lot of ideas and generates things.

But I mean, most of what we’re playing with is also just really as we are making the digital catalogue raisonné, thinking about how data goes in there and what that data is, is that we’ve actually been using it a lot, experimenting just with seeing how it’s interpreting that information and how I can learn what information is in those unstructured documents because then that really tells us where do we need to structure, where do we not need to structure and the importance of that information.

Robert Stein:

Yeah, thank you. Keith, you and I both work at a federal institution and federal agencies are known for being restrictive sometimes. Are you finding ways to play around with AI?

Keith Krut:

I am, yeah. And it’s interesting connecting across the federal agencies on innovations and creative uses there as well. It is fun working at the National Gallery because we get to be part of these two overlapping communities of the federal government and the museum community. And I think here there are some considerations that just help us play with it in a way that we can feel safe about. For instance, creating opportunities to operate in what we call a walled garden. So using tools like those in Azure OpenAI that don’t have any risk of leakage outside of information that we’re using or in piloting Enterprise OpenAI so that we have a ChatGPT instance that also doesn’t have the risk of leaking outside.

So sometimes that’s just day-to-day productivity of drafting content or testing different kinds of concepts. But coding is also a very big part of this year too, and I think one of the things that we’re going to be looking at is how do you make that as efficient as possible for coders so that it’s not a constant copy and paste game and where it can be integrated into development environments. But so much more to experiment with here as well.

Robert Stein:

Awesome. Well said. Can we go back to the poll results for a second, please? I just wanted to highlight how evenly split this is, right? I think it’s really unusual, in that when you ask a question in a poll you almost always see that one of the answers captures everyone. But this kind of caught me off guard a little bit that yeah, we’re almost a third, a third, a third between, yes, I use it, no, I don’t and three, I’m still playing around. So that’s a really good marker for where we’re at in the field and why we’re still trying to figure out how it’s going to integrate and impact museums.

Along with that goes our next poll question. So Amy, if you could bring up the poll. Yeah, that’s right, the one that’s talking about using AI for work. So I want to just recognize that it can feel somewhat subversive using AI if you’ve not gotten clear answers from your institution yet. So we were curious about anonymously whether if you do use AI for work, do you actually talk about it with your colleagues, with your boss? So the first answer is, yep, our museum actively talks about how AI can be used for work. Two is I talk to my close colleagues about this, but my boss doesn’t really know. Or maybe you do use AI and it’s a well-kept secret; you don’t really want anybody to know about that.

While we’re getting those answers coming in, Brett, I want to put you in the hot seat. Where do you think most folks at The Met fall on this scale?

Brett Renfer:

That’s a great question. I mean, we are actively bringing people into the fold of number one. So we’ve been working on, I know this is not to jump ahead, but I’m a part of our policy working group, and we’ve been working on that in parallel with doing large all staff at the director level of what are the opportunities and challenges faced by AI and presented with AI. Two, doing pretty large opportunity and consulting advisory groups, like 50-plus-people Teams meetings. And then three, smaller breakouts to have discussions and get input on our policy. So we are almost a 2,000 person org plus 1,000 volunteers.

So I don’t think we have 100% coverage, but we are really actively trying to use all these different types of forums, both to kind of share what we think are good opportunities and guardrails but also get people’s inputs and answer questions. I mean people, because AI is a very broad term, we had one person be like, oh, I used a chatbot five years ago on Expedia to book a flight and it booked me the wrong flight, and I’ll never use AI again. And you’re like, wow, there’s a lot of things to unpack there. But people have those experiences which are super valid. So I think that’s the thing to think about too.

Robert Stein:

That’s great. Amy, could we see the poll results on that last question? Okay, well, you know what? That’s encouraging. Considering that you were all a third, a third, a third on the previous example, I sort of anticipated that this one might be split too. But it seems like most of you are actively talking about how AI can be used for work. So that’s great.

Bruce, how does this strike you? What do you think’s up with this compared to the prior poll, which saw things more evenly split?

Bruce Wyman:

I think it’s still a case of there’s a lot of places that are doing experimentation just beginning to get their feet wet because the thing is, there’s always a concern, and I just wrote this in the chat, is that AI is going to replace functions, and I think that’s wrong, right? It’s an amplifier of the things that you do. And so everybody’s trying to say, hey, does this help with the stuff that I need to do? And Liz gave the example earlier of either thinking of different names for stuff or generating a bunch of stuff. And that’s where I find it most commonly useful for a lot of stuff is just generate 20 ideas about this thing. It still requires a human then to qualify, is it good?

So it doesn’t surprise me that a lot of museums are actively thinking about that because I think that’s the right thing to do before you decide, oh no, this is horrible for us, it is anathema to everything that we do, and we must wait for it to die. Or it’s an opportunity that you leverage and go into the future.

Robert Stein:

That’s great. Thank you. Amy, let’s go back to just the faces for the moment as we move on to the last two topics of our conversation. The first exciting topic is that of policy. I think as everyone is talking about AI, not everyone’s sure they’ve figured out exactly what to do with it. Only a few of you are keeping it a secret, but I believe that pretty much all of our institutions are trying to figure out what to do with it. Luckily, a couple of us have been thinking about policy and what that looks like. No big surprise that maybe the federal government has been thinking about policy.

Keith, would you mind sharing with people a little bit about what’s in the National Gallery of Art’s AI acceptable use policy?

Keith Krut:

Sure. And I’ll preamble that by saying the first version of this is already outdated. It was outdated within several weeks of the first draft. So keeping these as living policies is critical. I’d also preamble it by saying that our first draft was many pages, and our executive team looked at it and said, are you kidding me? You think people are going to read and understand this? We need to have a simple way to communicate this. So the policy and training need to go hand in hand. And I’m encouraged by how often I’m seeing that message right now that AI literacy needs to be a key part of how we implement this in our organizations.

Now, the policy that we have drafted and that’s currently communicated says a few things. For one thing, there’s good examples of what to do and what not to do. So it tries to translate it into some specific day-to-day activities. Some of the themes that are recurrent there are around, please don’t use any non-public information in public tools. So if there’s anything that would not ordinarily be something you’d want to show up in The Washington Post or somewhere else that we don’t want to take any risk of entering that into something public and showing up later, as has happened in some other cases we’ve read about in the papers.

Other aspects of this get to the kinds of labeling and rights information and so on and trying to provide some guidance in that regard. And we’re really trying to steer people as well towards using enterprise level tools. So that it’s a message of sort of experiment and learn, but also do that in a controlled environment. I’m sure I’m missing some parts of that. Rob, any others that come to mind?

Robert Stein:

Well, I kind of am curious, Brett and Liz, whether the O’Keeffe Museum or The Metropolitan have landed on an AI policy for your teams’ work as well?

Brett Renfer:

Yeah, we have, and ours is coming out soon internally, which is great. Based a lot on NGA’s policy, of course, as leaders in this space. I mean, I just wanted to plus one the thing that Keith said, a lot of it is thinking about what are the baselines, what are the things that are particular to the institution and then really synthesizing that use case. So we ended up making this gradient actually of acceptable to unacceptable that also has a gradient of disclosing versus non-disclosing. We’re kind of like, you don’t have to say this was written with autocorrect and put that in your email. However, if you’re pumping out a translation directly into something, which right now we don’t do, you have to reveal that. So I think that’s been a lot of it, is working through all these use cases and then setting up a structure.

I mean, I think Keith talked about this too, of we’re going to do a quarterly review. But also really starting, where are the issue spaces? And for us it is that idea of data guidelines. Saying we are going to have a better policy for audience facing things, but they’re kind of like ad hoc and we might surprise people, but The Met is not the fastest at putting out cutting-edge sorts of experiences. So usually there’s enough time built in that we can work through those things. However, the internal use is the one that we’re really working on putting those guardrails on, because those are things where, because we’re so big, mistakes can happen and go out.

So it’s been great, but a lot of it’s just been starting small with a small group and then growing and growing, growing the number of people who have eyes on it and feedback and are working through the details.

Robert Stein:

Yeah. Thanks, Brett.

Liz Neely:

[inaudible 00:52:46] a much smaller place, we are just much less formal, and so there just hasn’t been as much work around it. It’s just that we probably should do more. But definitely I send out reminders just saying, hey, if you put something in here… Because we do know that everyone is using it in different ways or a lot of people are using it in different ways, they’ve talked to me about it. So just making sure that if you’re using, again, as you mentioned, a public tool, it’s not one where you’re buying a team account where presumably it does not get shared with the models. I don’t know if I believe it. But yeah, we’re just very much more informal and just reminding people what’s actually happening.

Robert Stein:

Right. I do want to kind of add, sometimes in policy land it’s worth stating the obvious. So we have some lines in our policy that just say, don’t use AI if you think it’s going to hurt people or harm property. It seems trivial, but if you think about security systems or fire alarms or door locks. So don’t do that. Another obvious one that you wouldn’t think you need to say, but we should really say it, is don’t use AI if you believe it will impact the rights of people. And so all of us, I think, struggle with hiring and finding great people. I’m going to recommend that you not use AI to screen applicant data. We all know that AI is based on content that it scrapes from the internet, and that that content is biased. And in the land of AI, not the wisest, but the loudest voice wins. So that is not a good way to hire staff. So if you think that AI is going to impact rights, then don’t use it.

I think the idea of gradients is useful. Not every museum will end up in exactly the same place. And so, one technique I’ve used for our executive team is giving them a set of sliders. So where it is using AI in the public, what’s your level of comfort? Really uncomfortable to very comfortable. Unsupervised AI, very uncomfortable to very comfortable. And if nothing else, that technique of using sliders gives everybody a vote and opens a conversation. It’s been very effective. AI is a perfect place to use this technique.

One additional thing that is specific to us, and different museums will have different outputs, is that we’ve decided that we’re not going to use AI to make images that purport to be a work of art. For us, for our values, where we stand as a national gallery, we’re not convinced that the AI models are doing the right thing ethically by artists. And we think the risk of fakes and determining authenticity is problematic right now. So we may change that in the future, but for right now we’re not doing that. But I think it’s fair to leave space that your museum might decide differently.

Let’s pivot to the last thing. A lot of questions on how do you learn, how do you stay up to date with these things that are changing? Let’s, in our last two minutes, just do a quick roundabout. Bruce, how do you follow things? How do you stay up to date?

Bruce Wyman:

There’s this massive fire hose that I kind of put my face in front of every day. There are a variety of different news services. There’s a bunch of different websites, it’s articles, I have a bunch of RSS feeds to kind of go through different things. And then it really does go back to curiosity. I find oftentimes there’s a topic that I’m actually curious about and I end up going chasing that thread and that’s the way that I actually find out stuff. There’s invariably this wonderful thing that one thing leads to the next, and that’s how I ended up doing a lot of discovery. And then also talking to peers, I mean, ends up being genuinely helpful. I think it was from Brett that I actually came up with the original model I was going to end up using for doing the image detection stuff. So it ends up being useful. It turns out we’re all useful to each other, against all odds.

Robert Stein:

Amazing.

Bruce Wyman:

It is. It’s weird.

Robert Stein:

How about Keith?

Keith Krut:

Yeah. Always will second curiosity, Bruce. I love that point. Whether it’s in what you’re reading or who you’re talking to. I mean, I’d say there’s a question that I’ll have, say for instance, who’s thinking through training on different local languages and diversifying the languages that models are trained on, I wish someone was looking at this. And then I’ll go to a forum where I hear about somebody and I talk to that somebody, they point me to somebody else who’s doing a base of research around that subject. So just being able to find those connections is often the product of conversations.

But I would say with the reading side of it as well, that I’d really recommend both sort of short form and long form reading. There’s a lot of great distilleries, so to speak, where you can go to TLDR or Medium, et cetera and get the latest news, but it’s also really helpful to read the longer books that are happening that offer a chance for deeper reflection and some of the harder problems. And the kind of groups that are organizing around this right now are a great opportunity as well for all of us. AI4LAM is doing some really great things right now, they have an awesome literally called an Awesome AI Resource Page. And then events like this one, MCN, et cetera. So great to make those connections.

Robert Stein:

Awesome. Thank you. Brett and Liz, for the sake of time, I’m going to wrap us up. Thanks to the panelists, thank you to AAM for giving us this forum. And most of all, thanks to the audience for asking great questions. I’m going to take the liberty of saying for the panel that you can find us all on LinkedIn, we’d be happy to answer your questions and connect with you. So I hope you’re having a great afternoon. Thanks a lot.

Liz Neely:

Thank you.
