SafeArchive
Overview
The Data-PASS partners, with Library of Congress National Digital Information Infrastructure and Preservation Program (NDIIPP) funding, have built a storage platform for policy-driven distributed replication of digital holdings. The partners have received funding from the Institute of Museum and Library Services to develop this prototype into a self-contained system that can be installed, used, and maintained by institutional staff without technical expertise. This will result in a set of open source tools that can easily be used by libraries, museums, and archives that wish to collaborate in replicating their own content.
How it will work

This prototype system is built around a core LOCKSS network. Participating institutions will expose content through the OAI-PMH protocol or through the Dataverse Network (DVN) digital library system. Institutions participating in the network will then choose which of their own and which of the other partners' content to replicate by creating policies (or rules), which will be formalized in a machine-readable schema. Since partners vary in size and technology, the policy commitments will be able to scale to the participants' resources.
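As an illustration of the first step, exposing content for harvesting, the sketch below builds a standard OAI-PMH `ListRecords` request URL. The `verb`, `metadataPrefix`, and `set` parameters come from the OAI-PMH protocol itself; the base URL, the set name `public`, and the helper name `list_records_url` are hypothetical and are not taken from SafeArchive.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    # "verb" and "metadataPrefix" are required by the protocol for ListRecords.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # "set" is optional and restricts the harvest to one named set.
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and set name, for illustration only.
url = list_records_url("https://archive.example.org/oai", set_spec="public")
print(url)
```

A harvesting peer would fetch this URL and follow any resumption tokens in the response; that part is omitted here.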
The complete public holdings of each partner, including metadata, data, documentation, and legal agreements, are replicated by the network. Replicated copies are geographically and institutionally distributed, which guards against technical and organizational preservation failures.
Version 1.2 will introduce automatic provisioning. When new collections are added to the preservation network, the system will provide a way to automatically identify collaborating peers with the required resources and to initiate regular harvesting by those peers. Previous versions of the replicated content will be maintained as well.
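One way to picture automatic provisioning is as a selection problem: given a new collection, pick peers that have enough free storage and are spread across locations. The sketch below is a simple greedy approach under assumed inputs. The `Peer` fields, the one-peer-per-region rule, and the function name `select_peers` are hypothetical and do not describe the actual SafeArchive algorithm.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    region: str      # assumed stand-in for geographic or institutional location
    free_gb: float   # storage the peer has offered to the network


def select_peers(peers, collection_gb, copies_needed):
    """Greedily pick peers with enough space, at most one per region."""
    chosen = []
    used_regions = set()
    # Consider the peers with the most free space first.
    for peer in sorted(peers, key=lambda p: p.free_gb, reverse=True):
        if len(chosen) == copies_needed:
            break
        if peer.free_gb >= collection_gb and peer.region not in used_regions:
            chosen.append(peer)
            used_regions.add(peer.region)
    if len(chosen) < copies_needed:
        raise ValueError("not enough eligible peers for this collection")
    return chosen


peers = [
    Peer("a", "east", 500),
    Peer("b", "east", 400),
    Peer("c", "west", 300),
    Peer("d", "south", 50),
]
print([p.name for p in select_peers(peers, collection_gb=100, copies_needed=2)])
```

With these sample peers, a third copy cannot be placed, because the only remaining region has too little free space; the function raises an error instead of silently under-replicating.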
Content in the LOCKSS network will be audited regularly to demonstrate conformance with preservation requirements. While each partner will be trusted to hold others' public content and to not disseminate content improperly, no partner will be trusted to have "super-user" rights. Trust will be verified through automated audits of trusted repository requirements, which will provide the reliability of a top-down replication system with the resilience of a peer-to-peer model.
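A minimal sketch of such an audit, assuming a policy is reduced to two numbers (a minimum count of verified copies and a minimum count of distinct regions), might look like the following. The dictionary keys and the function name `audit_collection` are hypothetical; the real policy schema and audit reports are defined by the SafeArchive tools.

```python
def audit_collection(policy, replicas):
    """Return a list of policy violations; an empty list means compliant."""
    problems = []
    # Only replicas whose content has been verified count toward the policy.
    verified = [r for r in replicas if r["verified"]]
    if len(verified) < policy["min_copies"]:
        problems.append(
            f"only {len(verified)} verified copies, "
            f"policy requires {policy['min_copies']}"
        )
    regions = {r["region"] for r in verified}
    if len(regions) < policy["min_regions"]:
        problems.append(
            f"copies span {len(regions)} regions, "
            f"policy requires {policy['min_regions']}"
        )
    return problems


# Hypothetical policy and observed replica state.
policy = {"min_copies": 3, "min_regions": 2}
replicas = [
    {"host": "a", "region": "east", "verified": True},
    {"host": "b", "region": "east", "verified": True},
    {"host": "c", "region": "west", "verified": False},
]
for problem in audit_collection(policy, replicas):
    print(problem)
```

Because the check reads only the observed state of each replica, any partner can run it without holding special rights over the others.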
Version 1.1
Version 1.1 has been released, including:
- Extensions to the Dataverse Network System to allow any dataverse owner to expose selected content for replication harvesting.
- SafeArchive core system, supporting installation, policy configuration, and monitoring of LOCKSS networks.
- Grouping of archival collections and hosts for policy
Planned Releases -- Spring 2012
- Version 1.2 of the SafeArchive system, supporting auto-reconfiguration of LOCKSS networks to reflect changes to the policy schema.
- Online courses and written guides.
More information
For a more detailed overview see:
Altman, M., & Crabtree, J. (2011). Using the SafeArchive System: TRAC-Based Auditing of LOCKSS. Archiving 2011. Society for Imaging Science and Technology.
Altman, M., Beecher, B., Crabtree, J., Andreev, L., Bachman, E., Buchbinder, A., Burling, S., King, P., & Maynard, M. (2009). A Prototype Platform for Policy-Based Archival Replication. Against the Grain 21(2).
For current code and documentation, see: http://safearchive.sourceforge.net/.
For other presentations and publications, see: http://www.icpsr.umich.edu/DATAPASS/presentations.jsp.






