From Data to Wisdom: Humanities Research and Online Content

by Michael Lesk, Rutgers University

1. Introduction
President Clinton’s 1998 State of the Union Address called for “an America where every child can stretch a hand across a keyboard and reach every book ever written, every painting ever painted, every symphony ever composed.”[1] If that dream is realized, we would have the resources for all humanistic research online. What difference would that make for the humanities?

The widespread availability of online data has already changed scholarship in many fields, and in scientific research the entire paradigm is changing, with experiments now being conducted before rather than after hypotheses are proposed, simply because of the massive amounts of available data. But what is likely to happen in the humanities? So far, much of the work on “cyberinfrastructure” in the humanities has been about accumulating data. This is an essential part of the process, to be sure, but one would like to see more research on the resulting files. At the moment, we have a library that accumulates more books than are read. T. S. Eliot wrote “Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?”[2] Modern computer scientists have tended to turn this into a four-stage process, from data to information to knowledge to wisdom. We are mostly still at the stage of having only data.

Important humanities projects already use textual databases, including authorship studies dating back more than forty years. Soon, however, nearly every published work will be searchable online. Not only will this solve the problem of locating sources but, more importantly, it will give us the ability in textual studies to automate queries about, for example, the spread of stylistic techniques, the links between authors, and the references to cultural icons in books. Today, we are making the twenty-first-century equivalent of scholarly editions, with the ability also to conduct research on them.

Simply locating important physical items is a major gain from the collection of digital data. Art historians are unlikely to write significant essays based on thumbnail images, but they can use thumbnails to decide which museums to visit. Humanists are more likely than natural scientists to want to see a specific original object (medical researchers reading about a toxicity study in mice are unlikely to ask to see the actual rodents). Humanists’ need to consult manuscripts, paintings, or a composer’s autograph scores will remain, and this reinforces the value of online library catalogs and archival descriptions, even if the objects themselves are not available remotely over the Web.

The new communications technologies, especially the spread of broadband Internet service, will improve our ability to do both interdisciplinary and international research. On my campus, the music library, the art library, and the library with the best collection of literature are in three different buildings, about two miles apart. On my screen, they are all together. Similarly, I can easily question colleagues worldwide, whatever their specialty; people are not necessarily less accessible because they are not in my building. Davidson also argues for the importance of international cooperation in building digital resources, pointing to two particularly successful projects: the International Dunhuang Project, which reunites in virtual space manuscripts and paintings from Dunhuang (many now in London, Paris or St. Petersburg), making them easier to view and study; and the Law in Slavery and Abolition project, which brings together abolition materials from many countries.[3]

However, the physical location of humanities infrastructure has yet to be settled. Should each research institution have a repository that covers all subjects of interest to it? Should there be different data centers for each subject, or should there be a few large national centers? It is probably easier to get political support if many institutions are able to participate in the creation of cyberinfrastructure, even if that causes some duplicate work. Even with the wide disparity of size among North American universities, the low cost of computer hardware and the ease of distributing software platforms such as Fedora or DSpace mean that almost everyone can have their own specialty. In the sciences, for example, Eckerd College (an institution with 1,750 students) maintains a database of dolphins seen off the east coast of the United States, while in literature many campuses have claimed their specialty–whether it be Thackeray at Penn State, Twain at Berkeley, Tennyson at San Francisco State, or Thoreau at Santa Barbara. In the UK, the Arts and Humanities Research Council has recently suggested that data services now be provided by university repositories, rather than by the somewhat centralized Arts & Humanities Data Service.

One of the biggest problems still to be solved is that of scholarly access to copyright-protected materials. Regrettably, much of the discussion of cyberinfrastructure in the humanities gets bogged down in copyright law. The question is not, as it is in the sciences, how to manage the data or what to do with it, but what one is allowed to do with it. Broadly speaking, literature and textual study are less affected by this concern than music and film scholarship, since most texts are in the public domain (first published in the U.S. before 1923). However, nearly all recorded sound, movies and films remain under copyright, as does most photography and a considerable amount of twentieth-century literature, painting and sculpture. And even in literature, most modern commentary is published under copyright restrictions.

Since data mining routines require open access to material, it is not clear how we will reconcile the financial interests of rights holders with the scholarly interests of researchers. The entertainment industry is of course trying to establish principles for the use of intellectual property online, which may wind up controlling what researchers can do. In many cases, however, rights holders such as museums are willing to provide special access to scholarly users. The Metropolitan Museum of Art, for example, has greatly simplified the process of getting reproduction-quality images of many objects in their collection.

2. Changes in the Natural Sciences
In the traditional paradigm of scientific research, a researcher poses a hypothesis, designs an experiment to test it, carries out the experiment, evaluates the results, and then decides whether the hypothesis is valid. Running the experiment is typically the slowest step in this process. Today, the experiment, or data collection, is often not conducted at all, since in many scientific areas there are now enormous collections of online information already available. Consider astronomy as an example. A researcher wanting to discover the comparative spectral characteristics of particular stars used to have to sign up for time at Kitt Peak or some other observatory, wait until the assigned two-week slot came around, go to the telescope, and take observations, hoping for good weather those nights. But now there are more than forty terabytes of sky images online, and archives such as the Sloan Digital Sky Survey or the Two Micron All Sky Survey[4] may well provide the necessary information.

Molecular biology was the first scientific area to be transformed this way. Much genomic research is now done entirely by comparing information from different databases, for example looking at the sequences of DNA base pairs (or the structures of biomolecules) from various species. This is not only faster than research using traditional “wet chemistry,” but it encourages a different kind of research, focused on searching out evolutionary similarities rather than the characteristics of single species. Among the best-known archives are the Protein Data Bank[5] and the genome data archives, such as the National Library of Medicine’s GenBank. The same is true of other sciences as well: high-energy physics, climatology, and seismology all have large and growing data archives.
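To make the flavor of such database-driven comparison concrete, the following minimal Python sketch scores the similarity of two short DNA fragments. The fragments are invented for illustration; real genomic work runs alignment tools such as BLAST against archives like GenBank rather than computing a simple similarity ratio.

```python
from difflib import SequenceMatcher

# Hypothetical DNA fragments from two species; real sequences would be
# fetched from a database such as GenBank.
fragment_a = "ATGGTGCACCTGACTCCTGAGGAGAAG"
fragment_b = "ATGGTGCACCTGACTGATGCTGAGAAG"

# Crude similarity score based on longest matching subsequences;
# a stand-in for proper sequence alignment (e.g., BLAST).
score = SequenceMatcher(None, fragment_a, fragment_b).ratio()
print(f"similarity: {score:.2f}")  # 1.0 would mean identical fragments
```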

In addition to the use of research data by the science that gathered them, data-oriented research is also being conducted in the archives themselves, from the creation and evaluation of new interfaces for access and retrieval, to methods for data mining, to experiments on using data in education–and the results may be applicable to other scholarly areas. For example, visualization techniques for displaying charts of temperatures, such as spreadsheet chart generators, can also display the use of words in documents or the ages of buildings, while more complex 3-D visualization software can show chemical molecules or sculptures.

3. Data in the Social Sciences
Social scientists have been gathering large quantities of data for years, through public surveys, economic monitoring and other methods. The Inter-university Consortium for Political and Social Research (ICPSR), established in 1962, is the world’s largest archive of digital social science data, maintaining more than a terabyte of information from surveys. “Quantitative history” also got its start in the 1960s, exploiting economic data, genealogical data, polling data and more, yielding important insights and results (such as, to take just one example, the evocation of the life of textile workers by Clubb[6]).

Social scientists also exploit data collected in the natural sciences. For example, geographic information systems are widely applicable in both areas. In archeology and ethnographic studies, GIS data is fundamental, and of course it is relevant in history as well.[7] We can make far better maps of battles from Marathon to Waterloo than any participant (or even a 19th-century expert such as Major-General Stanley) could have.

One particular social science application, “collaborative filtering,” based on work by Hill, makes automatic predictions about the interests of a user by collecting information about the preferences of many users; it is now a familiar feature of commercial sites.[8] This shares the social effects of citation indexing, but it is applicable in wider areas, since it does not depend on the use of formal citations. If databases were centralized and they tracked usage, it would be technically possible to suggest resources to scholars, even while maintaining anonymity.
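A minimal sketch in Python may make the idea concrete: predict a user’s rating of an unseen item as a similarity-weighted average of other users’ ratings. The users, titles, and ratings below are invented, and this is the generic user-based technique rather than Hill’s particular system.

```python
from math import sqrt

# Toy preference data: user -> {item: rating}. All names are hypothetical.
ratings = {
    "alice": {"Hamlet": 5, "Moby-Dick": 3, "Ulysses": 4},
    "bob":   {"Hamlet": 4, "Moby-Dick": 2, "Walden": 5},
    "carol": {"Moby-Dick": 5, "Walden": 4, "Ulysses": 2},
}

def similarity(a, b):
    """Cosine similarity over the items both users have rated."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    dot = sum(ratings[a][i] * ratings[b][i] for i in common)
    na = sqrt(sum(ratings[a][i] ** 2 for i in common))
    nb = sqrt(sum(ratings[b][i] ** 2 for i in common))
    return dot / (na * nb)

def predict(user, item):
    """Weight each other user's rating of the item by their similarity."""
    scores = [(similarity(user, other), r[item])
              for other, r in ratings.items()
              if other != user and item in r]
    total = sum(s for s, _ in scores)
    return sum(s * r for s, r in scores) / total if total else None

print(predict("alice", "Walden"))  # ~4.5, leaning toward bob's taste
```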

4. Humanities: Beyond Data Accumulation
The rate of accumulation of digital data in the humanities has been increasing rapidly. Digital conversion of text dates from the 1950s, when scholars realized it was much easier to make concordances on computers than by hand. In the 1970s, Project Gutenberg, in the process of converting all major works of English and American literature to machine-readable form, shifted to scanning text rather than keying it into computers. The Trésor de la Langue Française was also busily providing digital text for most works of classical French literature (now 2,600 texts available through the ARTFL Project), and the Thesaurus Linguae Graecae digitized most literary texts written in Greek from Homer to the fall of Byzantium. With the advent of the Million Book Project, Google Books, Amazon’s “Search Inside”, the Open Content Alliance and Microsoft’s books.live.com, an enormous fraction of the printed literature is now searchable and much of it is actually readable. If the number of books published in English is estimated at around 20 million, of which about five million are out of copyright, more than one million appear to be searchable now.[9] Scanning technology continues to evolve at breakneck speed. Until 2005, the best book scanners (such as the Minolta PS3000) still moved a scanning element across a page. With the rapid increase in the resolution of digital cameras, an entire page can now be digitized at once with adequate resolution for reading or optical character recognition.

Archiving, preserving and ensuring permanent availability of this data has become of vital importance, and digital archives are becoming essential to the future scholarly enterprise. Not only must an archive be able to keep items indefinitely and find them again, it must convince its users that it will be able to do this. Scholars are not likely to be happy to deposit work in an archive if they do not believe that it is permanent.[10] A NITLE survey reported that faculty expect to become increasingly dependent on electronic resources and less on their institution’s library; it also found that libraries are less aware of this trend than their patrons are.[11]

It may be that the focus of study will shift as new projects become possible using scanned works. It is now trivial to count how many times a letter or word occurs in an author’s works, but it is no easier than before to discuss the sources of an author’s inspiration or the relationship of somebody’s texts to contemporary culture. It is not clear whether projects that involve automatic analysis of humanistic materials will become as important as scientific data analysis. Brockman argues that what humanists do is read; for him, what matters is how people find their material: once scholars have a document, they won’t make any other machine-readable use of it. Similarly, Unsworth’s overview of the humanities scholars of the future focuses on the importance of searching, especially in heterogeneous bodies of media and materials.[12] Again, we would hope that in the future people will not just use online resources to find the traditional materials, but will actually analyze the materials themselves.
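Just how trivial such counting has become is worth seeing. A few lines of Python tabulate word frequencies in a plain-text e-text; the filename is hypothetical, standing in for any public-domain download such as a Project Gutenberg file.

```python
import re
from collections import Counter

# Read a plain-text e-text and split it into lowercase words.
with open("moby_dick.txt", encoding="utf-8") as f:
    words = re.findall(r"[a-z']+", f.read().lower())

freq = Counter(words)
print(freq["whale"])         # occurrences of one word
print(freq.most_common(10))  # the text's most frequent words
```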

5. Imagery: Paintings, Drawings and Photographs
Museums were initially very concerned about the misuse of their images online, and the scanning of cultural heritage visual works lagged behind text conversion. The pace has quickly picked up, however, and many cultural heritage institutions have taken the lead. Examples now include the websites of the Museum of Fine Arts in Boston (330,000 images), the Musée du Quai Branly (100,000 images), the Fine Arts Museums of San Francisco (82,000 images), and the Victoria and Albert Museum (43,000 images). More broadly, of course, there are the 9 million digital objects in the “American Memory” site of the Library of Congress and the 550,000 images of photographs, prints and documents in the New York Public Library’s “Digital Gallery.” The Mellon Foundation’s ARTstor project contains over 500,000 images, available for educational licensing, covering a very wide variety of artists, periods, and cultures. A larger fraction of photographs than of paintings is in copyright, and some very large agencies (Corbis and Getty Images, for instance) hold enormous photographic resources.

These visual resources are now being used increasingly by faculty in many other disciplines. Trends to watch in the area of digital investigation of the visual arts include:

  • Investigating the characteristics of paintings using digital techniques: Maître presented mathematical methods for analyzing the light reflected from paintings, in order to understand which pigments were used; Hardeberg showed how to analyze a painter’s palette, for example studying the range of colors in paintings by La Tour, Goya and Corot; and Lombardi investigated automatic analysis of style.[13]
  • Using digital images for restoration purposes: Pappas investigated digitally restoring faded colors in paintings, and Giakoumis patching cracks. Although today such studies are typically done when paintings are originally scanned, in the future one can expect that research into the materials will be separated from their digitization, as has already happened in science and in literature.[14]
  • Annotating paintings and photographs: Wang showed how to use machine-learning software to decide how to label different images: images such as terracotta warriors or roof tiles were recognized with accuracies ranging from 82% to 98%, with fewer than 10 training examples for each, holding out hope for content-based image search for art materials (a toy sketch of this kind of labeling follows this list).[15]
  • Conducting authorship studies: Li and Wang examined Chinese paintings, extracting stroke and wash information useful for distinguishing different artists from classical to modern times; Lyu examined Brueghel drawings in order to determine features that would identify authorship.[16]
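For readers curious what machine annotation looks like in code, the sketch below is a deliberately simplified stand-in, not Wang’s ALIP system (which uses far more sophisticated statistical models): it labels an image with the category of its nearest neighbor among a handful of training examples, comparing color histograms. It requires NumPy and Pillow, and all file names are hypothetical.

```python
import numpy as np
from PIL import Image

def histogram(path, bins=8):
    """Normalized RGB color histogram used as a crude image feature."""
    rgb = np.asarray(Image.open(path).convert("RGB")).reshape(-1, 3)
    hist, _ = np.histogramdd(rgb, bins=(bins,) * 3, range=[(0, 256)] * 3)
    return hist.ravel() / hist.sum()

# A handful of labeled examples per category (file names are invented).
training = [("warrior1.jpg", "terracotta warrior"),
            ("warrior2.jpg", "terracotta warrior"),
            ("tile1.jpg", "roof tile"),
            ("tile2.jpg", "roof tile")]
features = [(histogram(path), label) for path, label in training]

def annotate(path):
    """Assign the label of the nearest training image in feature space."""
    v = histogram(path)
    return min(features, key=lambda fl: np.linalg.norm(v - fl[0]))[1]

print(annotate("unknown.jpg"))
```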

6. Music and Moving Images
Scholars working with music and with moving images are seriously hampered by copyright issues. In the United States, virtually all recorded music is protected by copyright, and the recorded music industry is vigilant in enforcing copyright law, with no effective understanding or agreement about the practical implementation of “fair use.” Articles about music will be available from online journals, catalogs will help locate relevant materials, and performance reviews will be relatively easy to find in cities with newspapers online. However, listening to performances will not be so easy, as performances available for sale online still tend to comprise a narrow range of currently popular music.

Scores published before 1923 in the United States are out of copyright, and sites such as the Lester Levy Collection at Johns Hopkins provide a large variety of old songs in sheet format. To help amateur enthusiasts, some websites have converted classical music to MIDI format, and musical OCR software can assist in converting scores to MIDI. Online music search and retrieval is being actively developed, although it lags behind text retrieval.[17] Music scholarship is thus more likely to be based on traditional cataloging, references from other scholars, and personal experience. Library resources are still critical: the music industry’s insistence on maintaining copyright control does not mean that it is keeping old material available for sale.[18]

Automatic musical analysis also lags behind textual analysis. Bill Birmingham developed methods for extracting themes from complex scores, but we generally do not have the kind of authorship studies that exist for text or art.[19] Nettheim’s summary of the situation concludes that “few applications of statistics in musicology have so far been fully convincing.”[20] When music scholarship is about general cultural context, it will of course benefit from the web resources created for textual material.

For film scholars, the availability of DVD recordings has made it enormously easier to watch films that were very rarely seen when only theatrical showings were possible. The money spent on buying DVDs is roughly comparable to that spent on books (about $24 billion per year), but few libraries purchase DVDs at anything like the rate at which they purchase books. Similarly, the state of preservation of film and video for future study is behind that for printed material. About half the movies made before 1950, for example, are lost, as is nearly all early television. The Vanderbilt Television News Archive is a notable exception, providing a searchable collection of the main network newscasts back to 1968.

With NTSC-quality digitization notably less accurate than film (even HDTV quality is below that of 35mm film), film scholars may feel that even collections of current DVDs are not suitable for in-depth study, while for television programs DVDs offer better quality than what was broadcast. For successful commercial films, conversion is being done by the movie companies, but for material with no likely market, it is unclear how it will be preserved. Film libraries at places like USC, UCLA, NYU, the American Film Institute, and the Museum of the Moving Image, to mention just a few, are attempting to address this. In some cases, commercial companies are cooperating with these non-profit organizations to support scholarly research into cinema. We can hope that in the future, when the commercial value of old movies is better understood, more research use will be allowed in circumstances where the film owners realize there is little financial risk.

7. Sculpture, Architecture, and Urban Design
With the development of a variety of techniques for 3-D scanning, it is now possible to model three-dimensional forms in space. One example is Marc Levoy’s imaging of Michelangelo’s David. Using a laser rangefinder, he imaged the sculpture to an accuracy of 1/4 mm, enabling scholars to tell which chisel Michelangelo had used on which parts of the stone.[21]

While contemporary buildings are designed using 3-D CAD programs, earlier ones can be represented as 3-D models (using software such as Photomodeler or Realviz). Virtual reconstructions can also show buildings at different times in their history, and can even represent buildings that no longer exist (the accuracy of the reconstruction depending on the level of documentary detail available). In Germany, the group “Memo 38” completed a 3-D reconstruction of the destroyed synagogue at Wiesbaden, aided by drawings, photographs, memories and architectural expertise; the project has since been extended by Koob and others at Darmstadt to include eleven other synagogues.[22]

Other reconstructions include architect Dennis Holloway’s images of Native American structures from the Southwest, re-created from drawings, foundation measurements and surviving elements; the images have a tremendous impact on tribal artists, who can “see” complete buildings that their culture once used, instead of just the ruins that remain. Entire modern cities (such as Los Angeles and Kyoto) have been recreated, with applications ranging from tourism to emergency response, while the most ambitious historical reconstruction is Rome Reborn, a network of sites that will ultimately include models of 7,000 buildings. Jacobson, looking at applications for 3-D archaeological models, remarks that many users will benefit from seeing an object rather than reading about it, and even more from being able to walk through it, drawing on their memory of spaces and motion. The use of virtual reality models of historical cities or sites can thus be expected to increase public interest in, and support of, cultural heritage.[23]

8. Born-Digital Creations
Artists are now using computers in a wide variety of creative ways, extending over music, dance, text, imagery, and interactive software games. The analysis of this material by scholars has barely started, and it is not clear who will both be willing to collect it and actually have legal permission to do so–libraries, museums, or the artists themselves.[24] Worse yet, this kind of material is likely to depend on the details of the computer software used to make it and show it; even the increases in processor speed that we see every few months may change the emotional impact of a work as the speed of the display changes. Preserving and accessing the material may depend on software from companies that go out of business or discontinue support. One dramatic example of the problems posed by digital preservation is the BBC’s digital Domesday book. To celebrate the 900th anniversary of William the Conqueror’s 1086 survey of England, the BBC built its 20th-century version on an analog laser-disc technology that disappeared from the marketplace almost immediately, along with the BBC Micro computer that ran the software to operate the player. As a result, the digital Domesday rapidly became unavailable, while the 1086 version was still readable. The BBC has recovered the original material and built an equivalent website, but it faces legal difficulties in making it generally available.[25]

9. The Future: The Economics and Politics of Access
Many economic, legal and political problems surrounding digital resources and scholarly access remain unsolved. We need long-term financial support for the building and preservation of digital resources. Although some universities have received large donations for computational humanities research, most notably the University of Virginia, it is more common for this work to be supported by research grants, which do not provide indefinite funding for long-term storage and support. The National Science Foundation has recently announced a competition for digital archiving support, which is most welcome as a possible source of funding. We also have library cooperatives, such as LOCKSS, that try to leverage individual library contributions into a large shared system. The Mellon Foundation stands out for its support of long-term archiving and of research into new business models. However, we still do not know what kind of business model will support sustainable digital resources. Should repositories be organized by university or by discipline? Should there be a few very large ones, or many repositories with small specialties? Should funding come from endowments, pay-per-use, subscriptions, or something else? Fortunately, one cause for optimism is the steadily declining cost of disk space and computer equipment: if you can afford to keep something this year, the cost of keeping those bits around in five years will be less.
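A back-of-the-envelope calculation shows why this matters. Under the (hypothetical) assumption that the cost of storing a terabyte for a year halves every two years, the lifetime cost of preservation converges to a finite sum, so a one-time endowment can fund storage indefinitely; the dollar figures below are invented.

```python
# Sketch: if annual storage cost decays geometrically, total cost converges.
cost_per_tb_year = 100.0  # hypothetical cost today, in dollars
halving_years = 2.0       # assumed price-halving interval

r = 0.5 ** (1 / halving_years)          # annual cost multiplier (~0.71)
endowment = cost_per_tb_year / (1 - r)  # sum of the infinite geometric series
print(f"~${endowment:.0f} per TB endows storage indefinitely")
```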

What level of access can be provided for material still under copyright? Our Cultural Commonwealth, the cyberinfrastructure report of the American Council of Learned Societies, urges that all content be freely available under open access. It does not, however, put forward any specific goal for content, nor does it make recommendations for dealing with the tough intellectual property issues that restrict online use of nearly all recorded music, film and video, and large portions of photography.

In an early survey of archaeologists’ use of data, Condron found widespread interest in mapping data and site records, but great confusion over whether access should be open and free or whether there should be fees to offset production costs. This leads to the publication-related question of whether the presence of online materials may inhibit traditional paper publication. If faculty members can only get tenure via traditional publications, they will tend to resist anything that might cause a decline in traditional publishing. Meanwhile, tenure evaluation committees are starting to ask for citation counts and journal impact factors, and Stevan Harnad has shown that online publications are now generally cited more than works that appear only on paper. A faculty member in the future may be more anxious to have a website that is highly visited and linked to than a journal paper. Overall, though, the situation is currently so confused that we do not understand whether data archives should charge those who put things in, those who take things out, both, or neither.[26]

Even in scientific areas where copyright is less relevant, we still find political or ethical issues affecting the availability of infrastructure data. Should the scientist who first collects some information have special privileges in using it? Different areas have different ethical rules. Anyone publishing a paper that reports measuring a protein structure is expected to deposit the structure in the Protein Data Bank immediately, while astronomers have a convention of two years’ private use, despite the potentially enormous commercial importance of some protein structures compared with the complete absence of commercial applications for cosmology. And of course, the Dead Sea Scrolls were kept secret for decades. Humanities communities will also have to work out whether any rights attach to scholars as well as to the original creators or publishers of works, and how long these should last.

Most important in the long run will be the development of better techniques for analyzing and using the data accumulated in humanities repositories. Scholars can find works, they can view works, at least as surrogates, and they can exchange information with other scholars. Repositories thus make traditional research easier. But will they enable new and significant kinds of research? We would like to see more authorship studies, critical evaluation, annotation, and the like. Today, computers can count; they can read a little, see a little, hear a little, and feel a little. But as yet they do not read, see, hear, or feel at the levels needed to provide insights for humanities scholars.

NOTES

[1] W. J. Clinton, State of the Union Address (1998), http://clinton4.nara.gov/textonly/WH/SOTU98/address.html.

[2] T.S. Eliot, Complete Poems and Plays 1909-1950 (New York: Harcourt, 1952), 96.

[3] C. N. Davidson, “Data Mining, Collaboration, and Institutional Infrastructure for Transforming Research and Teaching in the Human Sciences and Beyond,” CTWatch Quarterly 3, no. 2 (2007), http://www.ctwatch.org/quarterly/articles/2007/05/data-mining-collaboration-and-institutional-infrastructure/.

[4] M. Skrutskie, The Two Micron All Sky Survey at IPAC (2007), http://www.ipac.caltech.edu/2mass/.

[5] See http://www.pdb.org or http://www.rcsb.org and links therein. Protein Data Bank project leaders are Dr. Helen Berman (Rutgers) and Dr. Philip Bourne (UCSD).

[6] J. M. Clubb, Erik W. Austin, and Gordon W. Kirk, Jr., The Process of Historical Inquiry: Everyday Lives of Working Americans (Columbia University Press, 1989).

[7] D. Rumsey, “Tales from the Vault: Historical Maps Online,” Common-place 3, no. 4 (2003), http://common-place.dreamhost.com//vol-03/no-04/tales/index.shtml, http://purl.oclc.org/coordinates/b3.htm.

[8] See W. Hill, Larry Stead, Mark Rosenstein and George Furnas, “Recommending and evaluating choices in a virtual community of use,” Human Factors in Computing Systems, in CHI ’95 Conference Proceedings, (ACM, 1995), 194-201.

[9] J. Unsworth, et al., “Supporting Digital Scholarship,” in SDS Final Report (University of Virginia IATH, 2003), http://www3.iath.virginia.edu/sds/SDS_AR_2003.html.

[10] This has been estimated by comparing the titles found in a large library of known size with those in the various online systems. This statistic is somewhat misleading since the most important and frequently used books appear in major libraries the most often and are likely to be more quickly entered into the major scanning efforts. For example, Prescott’s History of the Conquest of Mexico can be searched in every one of the four big projects mentioned above.

[11] R. Schonfeld and Kevin M. Guthrie, “The Changing Information Services Needs of Faculty,” EDUCAUSE Review 42, no. 4 (2007), 8-9.

[12] W. Brockman, Laura Neumann, Carole Palmer, and Tonyia Tidline, Scholarly Work in the Humanities and the Evolving Information Environment (Washington, D.C.: Digital Library Federation and Council on Library and Information Resources, 2001), http://www.clir.org/PUBS/reports/pub104/pub104.pdf; J. Unsworth, The Scholar in the Digital Library, IATH (Charlottesville, 2000), http://www.iath.virginia.edu/~jmu2m/sdl.html.

[13] H. Maître, Francis Schmitt, Jean-Pierre Crettez, Yifeng Wu and John Yngve Hardeberg, “Spectrophotometric image analysis of fine art paintings,” Proc. IS & T and SID 4th Color Imaging Conf. (Scottsdale, AZ, 1996), 50-53; J.Y. Hardeberg, Jean-Pierre Crettez, and Francis Schmitt, “Computer Aided Image Acquisition and Colorimetric Analysis of Paintings,” Visual Resources: an International Journal of Documentation 20, no. 1 (2004), 67-84; T. Lombardi, “The Classification of Style in Fine-Art Painting,” PhD diss., Pace University (2005), http://csis.pace.edu/~lombardi/professional/dthesis.html.

[14] M. Pappas and Ioannis Pitas, “Digital Color Restoration of Old Paintings,” IEEE Trans. on Image Processing 9, no. 2 (2000), 291-294; I. Giakoumis and Ioannis Pitas, “Digital Restoration of Painting Cracks,” Proceedings of the 1998 IEEE International Symposium on Circuits and Systems 4 (1998), 269-272.

[15] J. Z. Wang, Jia Li, and Ching-chih Chen, “Machine Annotation for Digital Imagery of Historical Materials using the ALIP System,” in Proc. DELOS-NSF Workshop on Multimedia in Digital Libraries (Crete, 2003).

[16] J. Li and James Z. Wang, “Studying Digital Imagery of Ancient Paintings by Mixtures of Stochastic Models,” IEEE Transactions on Image Processing 13, no. 3 (2004), 340-353; S. Lyu, Daniel Rockmore, and Hany Farid, “A digital technique for art authentication,” Proc. Nat. Acad. Sci. 101, no. 49 (2004), 17006-17010.

[17] J.S. Downie, “Music Information Retrieval,” Annual Review of Information Science and Technology 37 (2003), 295-340.

[18] T. Brooks, “Survey of Reissues of U.S. Recordings” (sponsored by the Council on Library and Information Resources and the Library of Congress, Washington, D.C., 2005), http://www.clir.org/PUBS/reports/pub133/pub133.pdf.

[19] W. Birmingham, Bryan Pardo, Colin Meek, and Jonah Shifrin, “The MusArt Music-Retrieval System,” D-Lib Magazine 8, no. 2 (2002).

[20] Nigel Nettheim, “A Bibliography of Statistical Applications in Musicology,” Musicology Australia 20 (1997), 94-106.

[21] M. Levoy, K. Pulli, B. Curless, S. Rusinkiewicz, D. Koller, L. Pereira, M. Ginzton, S. Anderson, J. Davis, J. Ginsberg, J. Shade, and D. Fulk, “The Digital Michelangelo Project: 3D Scanning of Large Statues,” Computer Graphics, SIGGRAPH 2000 Proceedings (2000), 131-144.

[22] G. Kerscher, “Architecture Digitalized” (talk, Section 23D of CIHA conference, London, England, September 3-8, 2000), http://www.unites.uqam.ca/AHWA/Meetings/2000.CIHA/Kerscher.html; M. Koob, Synagogues in Germany–A Virtual Reconstruction, http://www.cad.architektur.tu-darmstadt.de/synagogen/inter/start_de.html (accessed November 9, 2007).

[23] D. Holloway, “Native American Virtual Reality Archeology,” Virtual Reality in Archeology, ed. Juan Barcelo (London: Archeo Press, 2000);  B. Jepson, Urban Simulation Team at UCLA, http://www.ust.ucla.edu/ustweb/PDFs/USTprojects.PDF (Note videos and models at http://www.ust.ucla.edu/ustweb/projects.html, accessed November 9, 2007); Y. Keiji, “Virtual Kyoto through 4D-GIS and Virtual Reality,” Ritsumeikan 2, no. 1 (Winter 2006), http://www.ritsumei.ac.jp/eng/newsletter/winter2006/gis.shtml (Note link to the actual 3-D models); J. Jacobson and Jane Vadnal, “Multimedia in Three Dimensions for Archaeology,” Proceedings of the SCI’99/ISAS’99 Conference (1999), http://www.planetjeff.net/IndexDownloads/sci-isas-99.pdf.

[24] J. Lewis, “Conserving Pixels, Bits, and Bytes,” Artinfo (August 2, 2007), http://www.artinfo.com/articles/story/25439/conserving_pixels_bits_and_bytes?page=1.

[25] J. Darlington, Andy Finney and Adrian Pearce, “Domesday Redux: The rescue of the BBC Domesday Project videodiscs,” Ariadne 36 (2003), http://www.ariadne.ac.uk/issue36/tna/; A. Charlesworth, Legal issues arising from the work aiming to preserve elements of the interactive multimedia work entitled “The BBC Domesday Project” (2002), http://www.si.umich.edu/CAMILEON/reports/IPRreport.doc.

[26] F. Condron, Julian Richards, Damien Robinson and Alicia Wise, Strategies for Digital Data (York: Archaeology Data Service, 1999); C. Hajjem, Y. Gingras, T. Brody, L. Carr and S. Harnad, “Open Access to Research Increases Citation Impact” (paper published by the Electronics and Computer Science Eprints Service, Univ. of Southampton, 2005), http://eprints.ecs.soton.ac.uk/11687/.