“USC Digital Repository – now open for time immemorial,” reads this advertisement from the hosts of Screening the Future 2012. But how can they prove that they will deliver? Trust is a recurring theme in digital preservation. Masterclass chair Raivo Ruusalepp called it “dispositional” trust: we will not be around to see whether the archive delivers on its promises. He quoted Kevin Ashley (of the UK Digital Curation Centre) as saying: “Trust is never a certainty.”
By Inge Angevaare
Ah well, nothing in life is certain. But repositories and digital archives can build both self-confidence (Are we doing the right things?) and trustworthiness (for their clients and funders) by embracing industry standards and by engaging in audits that make their policies and workflows transparent to anyone depending on their services.
OAIS: a reference model and vocabulary
The best-known general standard in our digital preservation book is, of course, the Reference Model for an Open Archival Information System (OAIS), developed by the Consultative Committee for Space Data Systems (CCSDS) in 2002 (for reference: Lavoie’s introductory article on OAIS). Bruce Ambacher was himself involved in designing the model. About his experience, Ambacher said, “Just try talking to space scientists as an archivist.” In order to be able to even talk to each other, a common glossary had to be established, and that common vocabulary across disciplines and domains is still one of the great contributions of the OAIS model. Another is the information model developed in the OAIS standard, which sets out what metadata you need for which purpose.
But OAIS is not a standard for quality of service, nor something you can use for an audit. The CCSDS plan to add an ISO standard for audit and certification to the reference model was finally fulfilled in February 2012, with the adoption of ISO16363, which is based on earlier work by OCLC and RLG known as TRAC (Trustworthy Repositories Audit & Certification). The text of the standard is also available without the ISO cover.
Audit and certification: ISO16363
To give you an idea of what ISO16363 is all about, Ambacher gave an example metric:
3.1.2.1 The repository shall have an appropriate succession plan, contingency plans, and/or escrow arrangements in place in case the repository ceases to operate or the governing or funding institution substantially changes its scope.
Supporting Text: This is necessary in order to preserve the information content entrusted to the repository by handing it on to another custodian in the case that the repository ceases to operate.
Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement: Written and credible succession and contingency plan(s); explicit and specific statement documenting the intent to ensure continuity of the repository, and the steps taken and to be taken to ensure continuity; escrow of critical code, software, and metadata sufficient to enable reconstitution of the repository and its content in the event of repository failure; escrow and/or reserve funds set aside for contingencies; explicit agreements with successor organizations documenting the measures to be taken to ensure the complete and formal transfer of responsibility for the repository’s digital content and related assets, and granting the requisite rights necessary to ensure continuity of the content and repository services.
Discussion: A repository’s failure threatens the long-term sustainability of a repository’s information content. It is not sufficient for the repository to have an informal plan or policy regarding where its data goes should a failure occur. A formal plan with identified procedures needs to be in place.
A three-step road map
Does this sound like rather a lot to take on? Ruusalepp explained that the Europeans thought of that and developed a three-tiered European Framework for Audit and Certification of Digital Repositories. In three steps an organization can “grow” towards formal certification as a trusted repository:
- Step 1 is a self-assessment based on the Data Seal of Approval. This is a very lightweight instrument with 16 criteria that outline the basics of good data management. It was developed for research data archives, but may be applicable elsewhere. You assess your own organization according to the 16 criteria and submit the evaluation to peer review by the DSA organizers.
- Step 2, once you have that down, is auditing yourself against the much more extensive ISO16363 requirements.
- Step 3 is official certification by means of a third-party audit based on ISO16363.
Once we have a standard and metrics, the next question is: Who is qualified to carry out an audit? The US PTAB (Primary Trustworthy Digital Repository Authorisation Body) is working on a standard (ISO16919) for auditing bodies, which is expected in 2013. Meanwhile, the European APARSEN project has carried out a number of test audits, with two purposes: to test the metrics of the standard and to train prospective auditors. The APARSEN project has published a report on the test audits (see also the post with test team leader David Giaretta’s first impressions).
Issues with regard to audit and certification
Apart from the question about the auditors, there are other issues that need to be worked on. For example: can you really express everything in metrics? Ruusalepp stressed that any audit is but a snapshot; trust is much more than that, it is a relationship.
Standards for audiovisual material
James Snyder listed the standards the Library of Congress is using for AV material:
- JPEG2000 lossless for moving image
- LTFS, Linear Tape File System
- MXF, Material eXchange Format (SMPTE 377); expected to be succeeded by AXF, Archive eXchange Format, which is under development and will accommodate very large UHDTV files, which may take up 25 (!) LTO-5 tapes per 15 minutes.
- Metadata: XML schema is standard; PBCore for audiovisual.
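To get a feel for the "25 LTO-5 tapes per 15 minutes" figure, here is a rough back-of-the-envelope sketch. It is not from Snyder's talk: it assumes the figure refers to 15 minutes of footage, that each cartridge is filled to the LTO-5 native (uncompressed) capacity of 1.5 TB, and that decimal units apply. The function names are illustrative only, and the results are upper bounds under those assumptions.

```python
# Back-of-the-envelope estimate. The constants below are assumptions
# made for illustration, not figures given in the talk.
LTO5_NATIVE_TB = 1.5     # native (uncompressed) capacity of one LTO-5 cartridge
TAPES_PER_SEGMENT = 25   # tape count quoted in the list above
SEGMENT_MINUTES = 15     # assumed to mean 15 minutes of footage


def storage_per_hour_tb(tapes=TAPES_PER_SEGMENT,
                        minutes=SEGMENT_MINUTES,
                        tape_tb=LTO5_NATIVE_TB):
    """Terabytes per hour of footage if every tape is filled completely."""
    return tapes * tape_tb * (60 / minutes)


def sustained_rate_gbit_s(tapes=TAPES_PER_SEGMENT,
                          minutes=SEGMENT_MINUTES,
                          tape_tb=LTO5_NATIVE_TB):
    """Average data rate in Gbit/s implied by the same figures."""
    total_bits = tapes * tape_tb * 1e12 * 8   # TB -> bytes -> bits
    return total_bits / (minutes * 60) / 1e9  # bits per second -> Gbit/s


if __name__ == "__main__":
    print(f"{storage_per_hour_tb():.0f} TB per hour")
    print(f"{sustained_rate_gbit_s():.1f} Gbit/s sustained")
```

Under these assumptions, one hour of such material would need about 150 TB, or an average of roughly 333 Gbit/s, which helps explain the exclamation mark.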
Snyder went on to give a detailed account of how the Library of Congress is making its work as transparent as possible (watch the forthcoming video for full details). Some of his more notable remarks:
- It is important to view the entire system and the entire lifecycle.
- Use standards and make sure you collect documentation about the standards.
- “Resist the bright and shiny things, do your homework.”
- With regard to standards: “We cannot and should not stop people from being creative.”
- Plan your preservation activity early, get in touch with the producers.
- “Born-digital content gives me a heart-ache and keeps me up at night.” (Prescription: keep the originals; create additional “evergreen” files; wrap them together if possible).
- Any conversion of the files may cause file damage, so do not convert more than is necessary.
- When budgets are slashed, there is always the temptation to postpone a (media) migration. That is very risky.
- “Never build black boxes.”
- “Preferably we use off-the-shelf products now; if need be, we may tweak them.”
- “If someone says: ‘We’ll do this in five years,’ that is a code word for ‘never’.”
- On dealing with uncertainty: “The job is not impossible, but you have to get your head around it, and take the first step.”
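Snyder's warnings about file damage and risky media migrations point to a common preservation practice that he did not spell out in these notes: fixity checking. The sketch below is an illustration under that assumption, not the Library of Congress workflow. It records a SHA-256 checksum for every file before a media migration and compares the checksums afterwards. The function names are hypothetical. It applies to bit-for-bit copies only; a format conversion changes the bytes by design and needs other checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(before, after):
    """Return (missing, changed) relative paths after a migration."""
    missing = sorted(set(before) - set(after))
    changed = sorted(
        name for name in before
        if name in after and before[name] != after[name]
    )
    return missing, changed
```

A typical use would be to call `build_manifest` on the source medium, copy the files, call it again on the target, and pass both results to `compare_manifests`. Two empty lists mean every file arrived unchanged.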