Digital document preservation
By Yogesh Mathur
22 September 2018

What is digital document preservation
Digital document preservation is a process by which digital data is preserved in digital form in order to ensure the usability, durability and intellectual integrity of the information contained therein.

Need for digital preservation
- Vast amount of ‘born-digital’ data, especially in science and engineering
- Physical deterioration: medium is vulnerable to deterioration and catastrophic loss
- Digital obsolescence: digital technology is on a fast track

Digital preservation: issues
- Data is maintained in the repository without being damaged, lost or maliciously altered;
- data can be found, extracted and served to a user;
- data can be interpreted and understood by the user; and
- the above can be achieved in the long term.

Digital preservation process
- Organisational
- Managerial
- Technical

Organisational issues
- Digital preservation policy
- Justification for preservation
- Organisational and financial commitment
- Preservation of authentic resources and quality control
- Metadata creation
- High-level identification of roles and responsibilities
- Training and education

Managerial issues
- Preservation planning
- Developing strategy
- Taking sole responsibility for preservation
- Dealing with IT staff or external preservation service providers

Methods of digital preservation
- Bitstream copying
- Durable, persistent media
- Standards
- Migration
- Emulation
- Encapsulation
- Preservation metadata

Challenges of digital preservation
- Technology obsolescence
- Absence of established standards, protocols, and proven methods for preserving digital information
- Technological or economic feasibility of operating on a mass scale

Case study 1: E-mail preservation

Electronic mail
- Now ubiquitous in many business contexts
- A mixture of records and other stuff
- High-risk if not managed properly: loss of accountability, efficiency, public credibility, organisational memory, etc.
- There may also be legal and financial consequences
- An obvious candidate for the records management approach

Some specific challenges of e-mail
- Inappropriate content, for example: spam, personal messages, illegal content
- Wide range of attachment types – some will pose preservation challenges of their own
- Unclear responsibilities:
  - Users can be reluctant to ‘manage’ incoming mail
  - E-mail seen as personal domain, not as organisational property
  - ... this can have consequences …

Approaches to managing e-mail
- Develop specific policies for managing e-mail within an organisation
- Produce guidance for creators (and others)
- Identify the chain of custody through the lifecycle
- Involve everyone concerned, e.g. creators, managers, records managers, IT staff, etc.

E-mail preservation
- Appraisal
  - Determining what content needs to be preserved
  - Destruction of transient/unnecessary e-mails
- Save e-mail records independently of the e-mail client
- Check that content is complete: message body, headers and attachments
- Consider authenticity requirements
- Ingest into an organisational EDRMS or repository
- Make decisions on appropriate preservation strategies for content and attachments
  - Selecting a standard format?
  - Significant properties?
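
The completeness check on message body, headers and attachments could be sketched roughly as follows. This is a minimal illustration, not a prescribed workflow: it assumes each e-mail has already been saved outside the mail client as raw RFC 5322 bytes (for example an .eml file). The names check_email_record, build_sample and REQUIRED_HEADERS are hypothetical, and the list of required headers is an assumed local policy choice. The SHA-256 digest is included only as one possible fixity value to record at ingest.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage

# Assumed policy choice: headers a record must carry to count as complete.
REQUIRED_HEADERS = ("From", "To", "Date", "Subject", "Message-ID")


def check_email_record(raw: bytes) -> dict:
    """Summarise the completeness of one e-mail saved as raw RFC 5322 bytes."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Headers that are absent from the saved message.
    missing = [name for name in REQUIRED_HEADERS if msg[name] is None]
    # The main body part, preferring plain text over HTML.
    body = msg.get_body(preferencelist=("plain", "html"))
    # File names of all attachments found in the message.
    attachments = [
        part.get_filename() or "(unnamed)" for part in msg.iter_attachments()
    ]
    return {
        "missing_headers": missing,
        "has_body": body is not None,
        "attachments": attachments,
        # Digest of the exact bytes saved, usable later as a fixity check.
        "sha256": hashlib.sha256(raw).hexdigest(),
    }


def build_sample() -> bytes:
    """Build a small made-up message with one attachment, for demonstration."""
    msg = EmailMessage()
    msg["From"] = "creator@example.org"
    msg["To"] = "records@example.org"
    msg["Date"] = "Sat, 22 Sep 2018 10:00:00 +0000"
    msg["Subject"] = "Quarterly report"
    msg["Message-ID"] = "<sample-1@example.org>"
    msg.set_content("Report attached.")
    msg.add_attachment(
        b"%PDF-1.4 sample",
        maintype="application",
        subtype="pdf",
        filename="report.pdf",
    )
    return msg.as_bytes()


if __name__ == "__main__":
    print(check_email_record(build_sample()))
```

A report with a non-empty missing_headers list, no body, or fewer attachments than expected would flag the record for review before ingest into the EDRMS or repository. Authenticity requirements and format decisions for the attachments themselves are outside the scope of this sketch.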