We are seeing the emergence of a new, ‘fourth paradigm’ for scientific research involving the acquisition, management and analysis of vast quantities of scientific data. This ‘data deluge’ is already affecting many fields of science, most notably biology, with high-throughput gene sequencing technologies; astronomy, with new, large-scale, high-resolution sky surveys; particle physics, with the startup of the Large Hadron Collider; environmental science, with both new satellite surveys and new deployments of extensive sensor networks; and oceanography, with the deployment of underwater observatories. This revolution will not be confined to the physical sciences but will also transform large parts of the humanities and social sciences as more and more of their primary research data is born digital. This new paradigm of data-intensive scientific discovery will have profound implications for how researchers ‘publish’ their results and for scholarly communication in general. The details of what will need to be preserved, and of how this will be accomplished, in order to create an academically valid record of research for the future are only now beginning to emerge. What is clear, however, is that research libraries have the opportunity to play a leading role in this ongoing revolution in digital scholarship. Repositories for both text and data are certain to play an important role in this new world, and specialists in semantics, curation and archiving will need to work with the different research communities to meet those communities' needs. Relevant projects and key collaborations recently undertaken by Microsoft Research will be highlighted, as will other Microsoft efforts related to interoperability and digital preservation.
by Patricia Manson
One irony of the information age is that keeping information has become more complex than it was in the past. We not only have to save physical media and electronic files; we also need to make sure that they remain compatible with the hardware and software of the future. Moreover, as the volumes of information, the diversity of formats and the types of digital object increase, digital preservation becomes a more pervasive issue, and one that cannot be handled by current approaches, which rely heavily on human intervention. Research is needed on making preservation systems more intelligent.
For the research community, the challenge is also to build new cross-disciplinary teams that bring together computer science, library and archival science, and business. We need to ensure that future technology solutions for preservation are well founded and grounded in an understanding of what knowledge from the past and from today we need to keep for the future.