Museums, Cataloging & Content Infrastructure: An Interview with Kenneth Hamma

by David Green, Knowledge Culture

Ken Hamma is a digital pioneer in the global museum community. A classics scholar, Hamma joined the Getty Trust in 1987 as Associate Curator of Antiquities for the Getty Museum. He has since had a number of roles there, including Assistant Director for Collections Information at the Getty Museum, Senior Advisor to the President for Information Policy and his current position, Executive Director for Digital Policy and Initiatives at the Getty Trust.

David Green: Ken, you are in a good position to describe the evolution of digital initiatives at the Getty Trust as you’ve moved through its structure. How have digital initiatives been defined at the Getty and how are they faring at the institutional level as a whole, as the stakes and benefits of full involvement appear to be getting higher?
Ken Hamma: Being or becoming digital as short-hand for the thousands of changes institutions like this go through as they adopt new information and communication technologies has long been discussed at the Getty from the point of view of the technology. And it did once seem that applying technology was merely doing the same things with different tools when, in fact, we were starting to embark upon completely new opportunities. It also once seemed that the technology would be the most expensive part. Now we’ve learned it’s not. It’s content, development and maintenance, staff training, and change management that are the expensive bits.

About 1990 it seemed to me (without realizing the impact it would cause) that it was the Getty’s mission that would and should largely drive investments in becoming digital. That it would require someone from the program side of the house to take more than a passing interest in it. I know that sounds impossibly obvious, but it wasn’t nearly so twenty years ago when computers were seen by many as merely expensive typewriters and the potential of the network wasn’t seen yet at all. Needless to say, the interim has been one long learning curve with risks taken, mistakes made, and both successes and failures along the way. Now, we’ve just got to the point at the Getty where–with a modicum of good will–we can think across all programs with some shared sense of value for the future. We now have a working document outlining the scope and some of the issues for digital policy development at the institution that would cover things like the stewardship and the dissemination of scholarship, digital preservation and funding similar activities elsewhere. Within this scope, we’ll be considering our priorities, the costs and risks involved, and some specific issues such as intellectual property and scholarship, partnerships and what kind of leadership role there might be for the Getty.

Do you see the Getty, or some other entity, managing to lead a project that might pull museums together on some of these issues?
There’s only a certain amount that can be done from inside one institution, and there are some fundamental changes that can’t be made that way and probably need to be made. One of the big problems about technology is its cost. For so many institutions it’s still just too expensive and too difficult. There’s a very high entry barrier–software license and maintenance fees as well as technology staff, infrastructure development and professional services–in short, the full cost of owning technology. The result isn’t just a management problem for museums but an opportunity cost. We’re falling behind as a community by not fully participating in the online information environment.

There was a technology survey in 2004 of museums and libraries that pointed out that although small museums and public libraries had made dramatic progress since 2001, they still lagged behind their larger counterparts.[1] While almost two-thirds of museums reported having some technology funding in the previous year, 60% said current funding did not meet technology needs and 66% had insufficiently skilled staff to support all their technology activities. This problem is complicated by a gap between museums’ community responsibilities and the interests of the commercial museum software providers–notably the vendors’ complete disinterest in creating solutions for contributing to aggregate image collections. There was a similar gap between library missions and OPAC (Online Public Access Catalog) software until OCLC grew to fill that gap in the 1980s.

Can you imagine any kind of a blue-sky solution to this?
Well, imagine a foundation, for example, that took it upon itself to develop and license collection-management and collection-cataloging software as open source applications for institutional and individual collectors. It might manage the software as an integrated suite of web applications along with centralized data storage and other required infrastructure at a single point for the whole museum community. This would allow centralized infrastructure and services costs to be distributed across a large number of participating institutions rather than being repeated, as is the case today, at every institution. Museums could have the benefits of good cataloging and collection management at a level greater than most currently enjoy and at a cost less than probably any individual currently supports.

Managing this as a nonprofit service model could create cataloging and collection management opportunities that are not just faster, better and cheaper, but also imbued with a broader vision for what collecting institutions can do, both individually and as a community in a digital environment. If we could do this by providing open source applications as well as web services, it would build value for the community rather than secure market advantage for a software vendor. A service model like this could also assume much of the burden of dealing with highly variable to non-existent data contributions that have plagued previous attempts to aggregate art museum data. And I think it could do it by supplying consistent metadata largely by enabling more easily accessible and better cataloging tools.[2] This problem of aggregating museum data has a relatively long history and its persistence suggests that though current schemes are certainly more successful, what the community needs is a more systemic approach. One of the problems is that there just isn’t a lot of good museum data out there to be aggregated. So talking about what it would be like to have aggregated repositories other than those that are hugely expensive and highly managed (like ARTstor), it’s unlikely to happen anytime soon. There’s not enough there there to aggregate with good results.

Cataloging seems to be the key to this future, as far as museums’ resources are concerned. Would this scenario be a first step in producing some good common cataloging?
Well, yes. It’s not enough to say to institutions, “You have to be standards-compliant, you have to use thesauri, you have to use standards, you have to do this and do that.” There are a lot of institutions that aren’t doing anything and aren’t going to do things that are more expensive and time-consuming. So it’s not going to help to say that collection managers should be doing this. They’re just not going to do it unless it’s easier and cheaper, or unless there’s an obvious payoff, and there isn’t one of those in the short term.

So such a project, if it were ever undertaken, would be about providing infrastructure, about providing tools?
Yes, as well as thinking about how we maintain those tools and how we provide services. Because most cultural heritage institutions don’t have IT departments and probably never will, how can we think about sharing what’s usually thought of as internal infrastructure? I mean, choose a small museum with a staff of three; you can’t say ‘you can’t have a finance guy because you need IT,’ or ‘you can’t have a director because you need to do cataloging.’ That’s just not going to happen.

There’s a related model that you have been working on that provides a technical solution both to cataloging and to distribution. If I’m right, it’s not about creating a single aggregated resource but rather about enabling others to create a range of different sources of aggregated content, all using metadata harvesting.
Yes, it’s still in its formative stages, but the essential idea is to put together a system that is lightweight, easily implemented by small institutions, doesn’t require huge cataloging overhead and that supports resource discovery. A problem today is that if you wanted to ask for, say, an online list of all Italian paintings west of the Mississippi, that presupposes that all collections with an Italian painting are participating. But we’re so far from that. It’s the rich and well-funded that continue to be visible and the others are largely invisible. So can we come up with a protocol and a data set that would allow for easy resource discovery that would have a low bar for cataloging and metadata production for unique works?

In this project, we’ve gone through a few rounds now, using the recently developed CDWA Lite as the data standard, mapping that to the Dublin Core in the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Dublin Core, as we’ve all learned, is a bit too generic so we’ve applied some domain knowledge to it and have additionally included URL references for images. We’ve collaborated with ARTstor and have done a harvesting round with them. Getty’s paintings collection is in ARTstor not because we wrote it all to DVD and mailed it to New York, but because ARTstor harvested it from our servers. Just imagine we get to the point where all collections can be at least CDWA-Litely cataloged–say just nine fields for resource discovery. Then these can be made available through an exchange protocol like OAI-PMH and then interested parties such as an ARTstor (who might even host an OAI server so not every collecting institution has to do that) could harvest them. If we could get that far and we imagine that other aggregators like OCLC might aggregate the metadata even if they didn’t want the images, it could be completely open. The network would support collection access sharing and harvesting that would be limited only by the extent of the network. Any institution (or private collector) could make works available to the network so any aggregator could collect them. A slide librarian at a college, with desktop harvesting tools, could search, discover and gather high-quality images and metadata for educational use by the teachers in that school. Or perhaps intermediate aggregators would do this with value-added services like organizing image sets for Art 101 at a cost that might suggest a different end-user model.
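
As a rough illustration of the harvesting exchange described above, the following Python sketch builds an OAI-PMH `ListRecords` request and reads Dublin Core fields out of a response. It is not the Getty's or ARTstor's implementation. The base URL, the set name `paintings`, and the sample record are all hypothetical, and carrying the image URL in `dc:identifier` is an assumption made for the example. The verb, the `metadataPrefix` parameter and the XML namespaces are the ones defined by OAI-PMH 2.0 and Dublin Core. To stay self-contained, the sketch parses a canned response instead of contacting a server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return one dict per harvested record, mapping DC element name to values."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:  # deleted records carry a header but no metadata
            continue
        fields = {}
        for el in dc:
            name = el.tag.split("}", 1)[-1]  # strip the "{namespace}" prefix
            fields.setdefault(name, []).append((el.text or "").strip())
        records.append(fields)
    return records


# Hypothetical response from a hypothetical museum repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:museum.example.org:0001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example Painting</dc:title>
          <dc:creator>Unknown Italian painter</dc:creator>
          <dc:type>Painting</dc:type>
          <dc:identifier>http://museum.example.org/images/0001.jpg</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://museum.example.org/oai", set_spec="paintings"))
    for record in parse_records(SAMPLE):
        print(record["title"], record["identifier"])
```

A real harvester would also follow the `resumptionToken` that OAI-PMH uses to page through large result sets, which this sketch leaves out.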

How far away is this from happening?
The protocol exists and will likely very shortly be improved with the availability of OAI-ORE. The data set exists but is still under discussion. That will hopefully be concluded in the coming months. And the data standards exist, along with cross-collection guides on using them, like CCO (Cataloging Cultural Objects). The tools should not be too hard to create. The problem again is the institutional one, the usual one when we’re talking about content. Most museums are not willing to enter into such an open environment because they will want to know who is harvesting their collection. It’s the reaction that’s usually summed up by “we’re not sure we can let our images out.” These are those expected nineteenth-century attitudes about protecting content along with the late twentieth-century attitudes that have been foisted on the museum community about “the great digital potential”–generating revenue based on that content as long as they control it and don’t make it accessible. How sad.

The recent NSF/JISC Cyberscholarship Report[3] discusses the importance of content as infrastructure, and how any cyberscholarship in a particular discipline is grounded until that part of cyberinfrastructure is in place. Museums are clearly far behind in creating any such content infrastructure out of their resources. What will it take to get museums to contribute more actively to such an image content infrastructure? Is there a museum organization that could coordinate this or will it take a larger coordinating structure? Will museums be able to do this together or will they need some outside stimulus?
If it isn’t simply a matter of waiting for the next generation, I don’t really know. It would really be helpful if there were, for example, a museum association in this country that had been thoughtfully bringing these issues to the attention of the museum community, but that hasn’t been true for the last twenty years. And museums are different from the library community with respect to content-as-cyberinfrastructure in that they’re always dealing with unique works. This changes two things: first, one museum can’t substitute a work in the content infrastructure for another one (the way in which a library can supply a book that another library cannot); and, secondly, for these unique works there’s a much greater sense of them as property (“it’s mine”). This, in a traditional mindset, raises the antenna for wanting to be a gatekeeper, not just to the work but even to information about it. You can see this in museums talking about revenue based on images of the works in their collections, or the need for museums to be watching over “the proper use” (whatever that is) of their images. Not that we don’t need to be mindful of many things like appropriate use of works under copyright. So there is still the sense that there’s got to be something (financial) gained from these objects that are “mine,” whereas most of these collections are supported by public dollars and there must be some public responsibility to make them freely available.

You’ve talked elsewhere about the “gatekeeper” mentality among many museum professionals, perhaps especially curators. How do you imagine the forward trajectory of this? How will this gatekeeper mentality play out?
Yes, it’s been very frustrating, but I think it is changing. Even over the past few years I think there’s been significant change in how people think about their gatekeeper role. Today–different from ten years ago–I would say curators are less and less gatekeepers, and directors are being caught off-guard by curators proposing greater openness of the sort that will take advantage of network potential. The Victoria & Albert Museum, the Metropolitan Museum and others are now making images available royalty-free for academic publishing.[4] And along with these changes there is a change in the tenor of the discussion. We want to keep the conversation going as much as possible in hopes that we can move toward a world where objects, especially those in the public domain, can become more fluid in this environment. Many of the attitudes toward intellectual property can be summed up in focusing more on maintaining appropriate attribution for work rather than asserting “ownership,” rather than saying, “it’s mine, you have to pay me for it.” If we’re honest we have to admit that there’s really not a lot of money in the whole system around these kinds of resources. In fact, the real value of these items lies in their availability, their availability for various audiences but especially for continued scholarship and creativity.

That’s a good point. Not too long ago the Yale art historian Robert Nelson said in an interview here that whatever is available online is what will be used, what will create the new canon. He made the analogy to JSTOR. In teaching he notices that the articles he cites that are in JSTOR are the ones that get read; the others don’t.
Yes, that’s absolutely true. And it will take one museum or one major collecting institution to have the imagination to see that and to see that, in addition to people coming into the gallery for a curated exhibition, this other experience of network availability and use has extraordinary value. And if there were two or three really big collections available, literally available as high-quality public domain images, not licensed in any way, one could imagine there would be significant change in attitudes pretty quickly.

You’ve described the open quality of the digital environment as threatening to many in institutions. Could you elaborate a little on that?
The extent to which the opportunities bundled here for realizing mission in non-profits are perceived as threats derives largely from confusing traditional practice with the purpose of the institution. The perception of threats, I think, clearly has been decreasing over the last few years as we become more comfortable with changes (perhaps this is due to generational shift, I don’t know). It is decreasing also as we continue with wide-ranging discussions about those traditional practices, which were well suited to business two decades ago but act as inappropriately blunt instruments in the digital environment. These include, for example, the use of copyright to hold public domain hostage in collecting institutions; notions of “appropriate” cataloging, especially for large-volume collections, that are more suited to a slower-paced physical access than they are to the fluidity of a digital environment; and assumptions that place-based mission continues alone or would be in some way diminished by generous and less mediated online access.

In your ACLS testimony back in 2004 on the challenges for creating and adopting cyberinfrastructure, you argue that the most important work for us all ahead is not the technology or data structures but the social element: the human and institutional infrastructure. Is this the weakest link in the chain?
I’m not sure that I would still describe institutions and people as the weakest link, but rather as the least developed relative to technology and the opportunities it brings. This too seems to have changed since the start of the work of the ACLS Commission. We can do plenty with the technology we now have on hand but we’ve frequently lacked the vision or will to do so. One of the most startling examples of this became visible several years ago when the Getty Foundation (the Grant Program) was awarding grants under the Electronic Cataloging Initiative. Many Los Angeles institutions received planning and implementation grants with varied results. One of the most successful would have been predicted by no one other, I suppose, than the hard-working and ingenious staff who are employed there: namely, the Pacific Asia Museum. Greater than average success from an institution with, to all appearances, less capacity and fewer resources than other participants was not based on access to better software or on an IT manager who would only accept a platinum support package. It was based on the will and the imagination of staff and the institution.

So would you cite that museum as one that is successfully redefining itself for a digital world?
Yes. You know, there are lots of museums that are doing really good work, but it’s going to take time and the results will show up eventually. If all the effort over the next ten years or so is informed by more open attitudes about making content more available–seeing content as cyberinfrastructure–then it will be all the better. It really is a question of attitude in institutions and a willingness to see opportunities. Almost never believe “we haven’t got the money to do it.” In scholarly communication there are millions of dollars going into print publications that, for example, have a print run of several hundred, for heaven’s sake. You just need to take money out of that system and put it into a much more efficient online publication or collection access system.

It’s about attitude and a willingness to invest effort. The Pacific Asia Museum is a good example. It doesn’t have the budget of the other large institutions in LA and yet it was among the most successful in taking advantage of this opportunity from the Getty’s electronic cataloging initiative. They were very clear about the fact that they wanted to create a digital surrogate of everything in their collection, do some decent cataloging and documentation and make it available. That just sounds so perfectly obvious. But that there are so many institutions that don’t seem to get something so basic, that don’t understand some aspect of that, is just completely astounding to me.

NOTES

[1] Status of Technology and Digitization in the Nation’s Museums and Libraries (Washington, DC: Institute of Museum and Library Services, 2006), http://www.imls.gov/publications/TechDig05/index.htm.

[2] Recent aggregating efforts include ARTstor and, in recent history, AMICO, both of which look back to the Getty’s Museum Educational Site Licensing project and the earliest attempt to coordinate art museum data at the point of cataloging in Museum Prototype software from the Getty Art History Information Program.

[3] William Y. Arms and Ronald L. Larsen, The Future Of Scholarly Communication: Building The Infrastructure For Cyberscholarship, report of a workshop held in Phoenix, Arizona, April 17-19, 2007, http://www.sis.pitt.edu/~repwkshop/NSF-JISC-report.pdf.

[4] See Martin Bailey, “V&A to scrap academic reproduction fees,” The Art Newspaper 175 (Nov 30, 2006), http://www.theartnewspaper.com/article01.asp?id=525; The Metropolitan Museum, “Metropolitan Museum and ARTstor Announce Pioneering Initiative to Provide Digital Images to Scholars at No Charge,” press release, March 12, 2007; and Sarah Blick, “A New Movement to Scrap Copyright Fees for Scholarly Reproduction of Images? Hooray for the V & A!,” Peregrinations 2, no. 2 (2007), http://peregrinations.kenyon.edu/vol2-2/Discoveries/Blick.pdf.