A Matter of Trust

9. Preserving the digital evidence base for measuring the Sustainable Development Goals

Adrian Brown

The role of good recordkeeping in providing evidence for measuring the Sustainable Development Goals (SDGs) has been discussed in depth in Chapter 1 of this volume, ‘Records as evidence for measuring sustainable development in Africa’, as has the fact that data, documents and other forms of recorded information are increasingly digital in form. The need for digital preservation will only continue to grow, an issue that cannot be avoided if SDGs are to be met. Furthermore, the SDG agenda requires that these records remain accessible to 2030 and beyond, which, in the digital world, can only be achieved through continuous active management.

Digital preservation is not just a technical issue. It also requires an ecosystem of organisations, policies and standards, resources and people. Demonstrating whether or not SDGs have been achieved requires the management of the evidence base, which in turn is dependent on the application of digital preservation methodologies. The significance of developing digital preservation capacity for maintaining reliable SDG measurements over time, as a fundamental component of the SDG agenda, cannot be overstated.

The main challenges of collecting and preserving information specified by the SDGs in ways that will allow it to be combined and compared are discussed further in David Giaretta’s chapter on ‘Preserving and using digitally encoded information as a foundation for achieving the Sustainable Development Goals’ (Chapter 10 in this volume).

This chapter considers the practical implications for developing digital preservation capabilities. It begins by considering the component elements of such a capability, and it examines how the concept of maturity models can be used to help organisations define models for digital preservation that are appropriate to their needs, as the chapters by Shepherd and McLeod and McDonald also emphasise. It then looks at the variety of models available for delivering a digital preservation service and concludes with a summary of the operational implications.

Elements of a digital preservation capability

There are many definitions of digital preservation, but that provided by the Digital Preservation Coalition is a good starting point:

Digital Preservation refers to the series of managed activities necessary to ensure continued access to digital materials for as long as necessary. [It]…refers to all of the actions required to maintain access to digital materials beyond the limits of media failure or technological and organisational change. Those materials may be records created during the day-to-day business of an organisation; “born-digital” materials created for a specific purpose (e.g. teaching resources); or the products of digitisation projects. This [definition] specifically excludes the potential use of digital technology to preserve the original artefacts through digitisation.1

In an archival context, we can refine this definition to encompass the concept of authenticity which, as set out in ISO 15489,2 depends on three essential characteristics:

•reliability: the record must be a full and accurate representation of the cultural or business activity to which it attests, for instance the management of finance, land or other resources. This depends upon trust in the regimes within which the record has been managed throughout its lifecycle and on the continuing ability to place it within its original context or the circumstances in which it was created. For digital archives, reliability is supported by adopting transparent and fully documented preservation strategies and processes, as well as by the existence of metadata (data describing the record’s content, its context and/or the circumstances of its creation) and its provenance (or origin)

•integrity: the record must be protected against unauthorised or accidental alteration. For a digital archive, integrity is maintained through the process of bitstream preservation (see below), and through the existence of an audit trail for every action relating to the record’s management and preservation

•usability: the record must be capable of being accessed in meaningful form through time. It must therefore be discoverable and retrievable by authorised users, accessible, and interpretable within the current technical environment. Usability is ensured within a digital archive through the process of logical preservation (see below), and the existence of metadata sufficient to allow the record to be located, retrieved and interpreted.

Fundamentally, therefore, digital preservation involves acquiring authentic digital records, storing them and making them accessible to users for as long as they are required. Given this definition, we can begin to identify the requirements through which this goal can be achieved. At its most fundamental level, all digital information is encoded as a sequence of 1s and 0s – a bitstream – which may be written to a storage medium or transmitted across a network. These bitstreams have no intrinsic meaning but require layers of technology to transform them into meaningful information. Preserving meaningful digital information requires us to maintain both the underlying bitstream and the means to correctly interpret it. Digital preservation is, therefore, often considered to require two fundamental activities:

•bitstream preservation: the series of activities required to maintain the integrity of a bitstream, ensuring that a demonstrably bit-perfect copy can be retrieved on demand, for as long as required

•logical preservation: the series of activities required to ensure that bitstreams can continue to be rendered as meaningful information through time.
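
To make the idea of a ‘demonstrably bit-perfect copy’ more concrete, the following is a minimal sketch in Python of one widely used technique, checksum-based fixity checking. It is an illustration under stated assumptions rather than a method prescribed in this chapter: the choice of SHA-256 and the function names (sha256_of, record_fixity, verify_fixity) are assumptions made purely for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths: list[Path]) -> dict[str, str]:
    """Build a manifest mapping each file path to its checksum at ingest time."""
    return {str(p): sha256_of(p) for p in paths}


def verify_fixity(manifest: dict[str, str]) -> dict[str, str]:
    """Recompute checksums and report each file as 'ok', 'changed' or 'missing'."""
    report = {}
    for name, expected in manifest.items():
        path = Path(name)
        if not path.exists():
            report[name] = "missing"
        elif sha256_of(path) == expected:
            report[name] = "ok"
        else:
            report[name] = "changed"
    return report
```

In such a scheme the manifest would be created once, when the record is brought into custody, and the verification step repeated at intervals; any result other than ‘ok’ signals that the bitstream can no longer be shown to be intact and that a good copy should be restored.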

However, this is not a purely technical exercise. For any organisation to undertake digital preservation, in practice, it also needs broader capabilities, encompassing:

•organisational viability: the organisation must have the necessary organisational and governance structures in place, and commensurate resources to deliver a digital preservation service. It must also be able to demonstrate that it can expect to maintain that capability over the lifetime of the archives in its custody, or that it has a credible strategy to transfer those responsibilities to another organisation in future

•stakeholder management: it must be able to identify, understand and engage with its stakeholders, both within and beyond the organisation, including funders, service partners, content depositors and end users

•legal basis: the organisation must have an appropriate legal basis within which to operate, with the means to manage contracts, licensing and its other legal rights and responsibilities

•policy framework: the organisation must have suitable policies, strategies and procedures in place to govern its digital preservation operations. These should be subject to regular review

•acquisition and ingest: it must have the means to acquire and ingest (import) authentic digital content in accordance with a defined collections development policy, bringing it fully within its control

•metadata management: it must have the means to create and maintain all metadata required to support the management and use of that digital content, including preservation and reuse

•dissemination: it must provide the means for its designated user community to locate and use the digital content in its custody, in accordance with the applicable conditions of use

•infrastructure: the organisation must have access to the necessary physical and technical infrastructure to deliver the above capabilities, whether it owns and manages this directly or outsources it to a third-party provider.

While any organisation must have these capabilities in order to preserve digital content over the long term, digital preservation is not a one-size-fits-all enterprise – every organisation, large or small, whether well-funded or underfunded, is unique and will have its own requirements, opportunities and constraints. Any organisation wishing to develop the capability to preserve digital records through time must therefore develop an understanding of its current capability, design a blueprint for the future that meets its specific requirements, identify the gap between the latter and the former, and introduce a strategy for building the necessary capacity to bridge that gap.

Maturity models are well-established tools that can help an organisation assess its capabilities in a particular area against a benchmark standard as a means of articulating its current and desired future operating models. They typically describe organisational capacity in two dimensions: first, they define the component parts that together constitute the specific service or function to be addressed, and second, they define a scale for measuring capability against each of these requirements.

The key elements of digital preservation capability, including organisational factors such as governance, resources and policy, and technical or functional aspects, have been discussed above. In the language of maturity models, these are often referred to as ‘process perspectives’. A typical scale for measuring capability will comprise five or six steps, spanning the stages from developing awareness to building increasing capability, such as those set out in Table 9.1.

Table 9.1. Maturity levels

| Stage | Maturity step | Description |
| --- | --- | --- |
| Awareness | 0 – no awareness | The organisation has no awareness of either the need for the process or basic principles for applying it. |
| Awareness | 1 – awareness | The organisation is aware of the need to develop the process and understands basic principles. |
| Awareness | 2 – roadmap | The organisation has a defined roadmap for developing the process. |
| Capability | 3 – basic process | The organisation has implemented a basic process for capturing and preserving digital records. |
| Capability | 4 – managed process | The organisation has implemented a comprehensive, managed process that reacts to changing circumstances. |
| Capability | 5 – optimised process | The organisation undertakes continuous process improvement, with proactive management. |

These steps can be mapped to the records management capacity levels in Julie McLeod and Elizabeth Shepherd’s chapter, which also identifies the roles and responsibilities of the major players involved. The steps also correspond with the maturity levels defined in John McDonald’s chapter.3 McDonald sets out capacity levels as:

•Level 1: poor-quality data, statistics and records make it virtually impossible to measure SDGs reliably

•Level 2: data, statistics and records are adequate to measure the SDGs to basic levels in some sectors

•Level 3: the quality of data, statistics and records makes it possible to measure SDGs effectively and supports government programme activities

•Level 4: well-managed data, statistics and records make it possible to measure SDG implementation effectively and consistently through time; data and statistics are of high enough quality to support government programme activities at the strategic level

•Level 5: processes generating data, statistics and records, and the framework for managing them, are designed to make it possible to exploit data, statistics and records in new and innovative ways.

Achieving Level 4 will require a basic level of digital preservation capability, while Level 5 corresponds to the more advanced, managed and optimised capabilities. Taken together, we can begin to develop detailed definitions of each maturity level against every process perspective. To illustrate the principle, the ‘basic’ level of maturity (that is, Step 3) might be defined as shown in Table 9.2.

Table 9.2. A basic preservation capability

| Process perspective | Step 3 definition |
| --- | --- |
| A – organisational viability | staff have assigned responsibilities and the time to undertake them; a suitable budget has been allocated; staff development requirements have been identified and funded |
| B – stakeholder engagement | key stakeholders have been identified; objectives and methods of communication have been identified |
| C – legal basis | key legal rights and responsibilities, together with their owners, have been identified |
| D – policy framework | a written, approved digital preservation policy exists |
| E – acquisition and ingest | some individual tools are used to support accession and ingest; an acquisition policy exists that defines the types of digital content that may be acquired; a documented accession and ingest procedure exists, including basic guidance for depositors |
| F – bitstream preservation | there is dedicated storage space on a network drive, workstation or removable media; at least three copies are maintained of each object, with backup to removable media; basic integrity checking is performed; virus checking is performed; existing access controls and security processes are applied |
| G – logical preservation | basic characterisation capability exists (or the capability to identify and describe a file and its defining technical characteristics) allowing at least format identification, such as file formats and technical attributes; ad hoc preservation planning takes place; ad hoc preservation actions can be performed if required; the ability to manage multiple manifestations of digital objects exists (that is, all of the different renderings of the same object, in different file formats, can be identified, described and managed) |
| H – metadata management | a documented minimum metadata requirement exists; a consistent approach to organising data and metadata is implemented; metadata is stored in a variety of forms using spreadsheets, text files or simple databases; the capability exists to maintain persistent links between data and metadata; persistent, unique identifiers are assigned and maintained for all digital objects |
| I – dissemination | basic finding aids exist for all digital content; users can view or download data and metadata, either online or on-site |
| J – infrastructure | sufficient storage capacity is available, and plans exist to meet future storage needs; IT systems are documented, supported and fit for purpose |

This should be considered the minimum standard for any organisation to measure SDGs and provide a genuine digital preservation service. For many, this may be an achievable target, sufficient to meet their objectives. It should not be assumed that every organisation should strive to reach Step 5 in every process, which could be excessive and infeasible, particularly in lower-resourced countries. The power of maturity models is that they make it possible to define nuanced and proportionate levels of service capability. In practice, most organisations will wish to define different target levels for different process perspectives.

In many cases, a relatively modest level may be entirely appropriate. The value of maturity models lies primarily in providing a means of thinking about digital preservation as a broad spectrum of capabilities, rather than a single, and almost certainly unattainable, ideal. This should help organisations to think about what ‘good enough’ preservation looks like in their own particular circumstances, and therefore to ensure a proportionate and appropriate level of investment.
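
The comparison of current and target maturity across process perspectives can be sketched in a few lines of Python. This is only an illustration: the perspective names follow Table 9.2 and the 0 to 5 scale follows Table 9.1, but the function name maturity_gaps and every score in the example are hypothetical, invented to show the arithmetic rather than drawn from any real assessment.

```python
# Process perspectives A-J as named in Table 9.2; steps 0-5 as in Table 9.1.
PERSPECTIVES = [
    "organisational viability",
    "stakeholder engagement",
    "legal basis",
    "policy framework",
    "acquisition and ingest",
    "bitstream preservation",
    "logical preservation",
    "metadata management",
    "dissemination",
    "infrastructure",
]


def maturity_gaps(
    current: dict[str, int], target: dict[str, int]
) -> list[tuple[str, int, int, int]]:
    """Return (perspective, current, target, gap) rows, largest gap first."""
    rows = []
    for name in PERSPECTIVES:
        cur, tgt = current[name], target[name]
        if not (0 <= cur <= 5 and 0 <= tgt <= 5):
            raise ValueError(f"maturity steps must be between 0 and 5: {name}")
        # A perspective already at or above its target has no gap to close.
        rows.append((name, cur, tgt, max(tgt - cur, 0)))
    # sorted() is stable, so perspectives with equal gaps keep table order.
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Hypothetical self-assessment: every value below is invented for illustration.
current = {name: 1 for name in PERSPECTIVES}
current["infrastructure"] = 3
target = {name: 3 for name in PERSPECTIVES}
target["bitstream preservation"] = 4  # a higher target for one perspective only

for name, cur, tgt, gap in maturity_gaps(current, target):
    print(f"{name:28} current={cur} target={tgt} gap={gap}")
```

Setting a separate target for each perspective, rather than a single overall score, reflects the point that an organisation may reasonably aim higher in some areas than in others.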

The approach summarised here is intended to provide a flexible and realistic methodology.4 However, other maturity models for digital preservation exist. For example, the National Digital Stewardship Alliance has developed its own framework of levels of digital preservation,5 while the EU-funded E-ARK project has created a model which has been tested across a range of institutions.6 Most recently, the Digital Preservation Coalition has developed a Rapid Assessment Model based on that described in this essay.7 Whichever model is used, it is essential for any organisation that is planning to develop a digital preservation capability to have a clear and pragmatic definition of the level of capability it wishes to build. Having done so, it will then need to consider how to establish that capability in practice.

Implementation options

Many different models are possible for implementing a practical digital preservation capability, which means that options are available to meet the needs of organisations of many sizes and types, in many different contexts and with very different levels of resources at their disposal. The market for digital preservation solutions is still comparatively young, but its vast potential size (one study estimated its potential value in 2011 as in excess of $1 billion,8 and this figure has surely only increased in the intervening period) has encouraged a growing number of increasingly mature products to emerge to meet a wide range of requirements.

It is very important to recognise that digital preservation is achievable not only by large institutions with substantial budgets – practical solutions are possible with much more modest means. The main options are summarised below, with a brief discussion of their advantages and disadvantages. Mention of specific products or tools does not constitute an endorsement or recommendation.

Doing nothing

Any analysis of options should always include the status quo, as this allows a true comparison to be made with other, more positive, options, and it highlights the implications of not taking action. Doing nothing is not a cost-free option; over time it is likely to be very expensive because of the continuing burden of maintaining archival data on inappropriate storage infrastructure and the actual or lost opportunity costs of having to recreate digital records that will inevitably be lost through inaction.

Without active management, data loss is inevitable over the longer term, and the associated costs are likely to be much higher than the costs of establishing appropriate control systems. The reactive, ad hoc rescue of digital information will almost inevitably be much more expensive than proactive management. Doing nothing will also, of course, incur significant risks and will not meet the requirement of maintaining access to vital records. It will have major consequences for the delivery of government programmes and for the ability to measure the SDGs reliably.

Using open source software

It is possible to develop a digital preservation solution entirely from open source tools. A number of complete open source digital repository management systems are available, although they vary in the level of preservation functionality that they offer directly. The most widely used systems include Archivematica,9 DSpace,10 EPrints,11 Fedora,12 LOCKSS13 and RODA.14 In addition to these complete systems, a number of toolkits and individual utilities have been developed that can be used to add preservation functionality to existing repository systems. These include characterisation tools,15 such as DROID16 and JHOVE,17 forensic tools, including BitCurator,18 and web archiving tools, such as Heritrix.19
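
As a rough indication of what characterisation involves at its simplest, the sketch below identifies a file format from the well-known signature bytes at the start of a file. It is a deliberately simplified illustration and does not reproduce how any of the tools named above work internally: production tools rely on far larger signature registries and inspect much more than the first few bytes. The function names and the short signature list are assumptions chosen for the example.

```python
from pathlib import Path

# A few well-known file signatures ("magic numbers") found at byte offset 0.
SIGNATURES = [
    (b"%PDF-", "Portable Document Format (PDF)"),
    (b"\x89PNG\r\n\x1a\n", "Portable Network Graphics (PNG)"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "Graphics Interchange Format (GIF)"),
    (b"GIF89a", "Graphics Interchange Format (GIF)"),
    (b"PK\x03\x04", "ZIP container (also used by DOCX, XLSX and ODF files)"),
]


def identify_bytes(header: bytes) -> str:
    """Return a format label for the leading bytes of a file, if recognised."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unidentified"


def identify_file(path: Path) -> str:
    """Read only the first 16 bytes of a file and try to identify its format."""
    with path.open("rb") as handle:
        return identify_bytes(handle.read(16))
```

Identification by signature is more trustworthy than relying on a file extension, which can be missing or wrong, but as the ZIP entry shows, a single signature may cover several distinct formats, which is one reason dedicated characterisation tools are needed.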

Open source solutions offer a number of attractions. They are often free to use, may have thriving user and developer communities, and can offer very flexible solutions. The fact that users have complete access to, and control over, the source code, can also be very attractive from a sustainability perspective. However, open source solutions are not cost-free: resources are likely to be required to adapt and configure the software, either from in-house staff or procured from a third party. Also, the organisation will have to bear all the risks, rather than sharing them with suppliers.

It is possible to develop a functioning digital archive capacity very affordably, by using simple, cheap and readily available open source tools and existing infrastructure. Compromises may be needed to keep costs down: limited integration will involve a lot of manual or semi-manual processes, and this approach is unlikely to be appropriate for managing larger volumes of records or high numbers of users. It may also be difficult to support or sustain as a service over time. Nonetheless, for many smaller organisations or those with limited resources, this may be a good starting point. Investment in limited configuration of these tools to meet local needs, and with good support arrangements, will be very worthwhile, and may compare favourably with the costs of commercial products.
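
To indicate how modest such a starting point can be, the following Python sketch performs a semi-manual ingest using only the standard library: it assigns a unique identifier, copies the file to several storage locations, checks each copy against the original checksum and appends a line to a spreadsheet-style register. Everything specific here is an assumption made for illustration, including the function name ingest, the folder layout, the register columns and the use of SHA-256; it is not a description of any particular product or of a recommended workflow.

```python
import csv
import hashlib
import shutil
import uuid
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file in one read; adequate for small files in a sketch."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def ingest(source: Path, stores: list[Path], register: Path) -> str:
    """Copy `source` into every store, verify each copy, and log it in a CSV register."""
    identifier = str(uuid.uuid4())  # unique identifier for the object
    checksum = sha256_of(source)
    for store in stores:
        target_dir = store / identifier
        target_dir.mkdir(parents=True, exist_ok=False)
        copy = target_dir / source.name
        shutil.copy2(source, copy)  # copy2 also preserves file timestamps
        if sha256_of(copy) != checksum:
            raise OSError(f"copy in {store} does not match the original")
    is_new = not register.exists()
    with register.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["identifier", "filename", "sha256", "copies"])
        writer.writerow([identifier, source.name, checksum, len(stores)])
    return identifier
```

Called with three storage folders, for example a network drive, a workstation and a removable disk, a routine of this kind would satisfy the ‘three copies’ and ‘basic integrity checking’ elements of a basic capability, while leaving virus checking, access control and richer metadata to other tools or manual steps.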

Developing a bespoke solution

Developing a bespoke, or tailor-made, solution is an option, especially for organisations with significant in-house software development capacity, or with the resources to commission external developers. A bespoke solution can be developed to meet the organisation’s specific requirements, but software development is an expensive, complex, uncertain and time-consuming option. Given the availability of a number of mature third-party solutions, this is unlikely to be an economical approach except for organisations with highly specific requirements and substantial resources.

Procuring a commercial solution

Commercial off-the-shelf (COTS) digital preservation solutions are a common approach, thanks to a growing and comparatively mature market for such products. They typically command a relatively high price, including one-off licence fees, annual support costs and potentially expensive customisation and configuration. They can also create a degree of dependency on an external supplier, and on proprietary software. At the same time, they can offer a high level of flexibility and support, usually have well-established user communities, and can therefore provide a comparatively low-risk approach. Building a strong relationship with the supplier is often vital.

There is a small, but now well-established, community of commercial digital preservation products that caters not only to the library, archive, museum and gallery sector but increasingly to customers in the private sector with long-term data retention requirements. These customers include industries such as banking, pharmaceuticals, aerospace, gas and oil exploration, government and the nuclear energy sector. Examples of commercial products include Preservica20 and Rosetta.21

Outsourcing the service

One of the most recent developments in the digital preservation market has been the growth of preservation-as-a-service, whereby third-party suppliers provide an end-to-end solution hosted on their own infrastructure. This option minimises the direct impact on the organisation, including the need to host and support significant infrastructure, and can be easily scaled up or down to meet demand. It also has a very low barrier to entry, making it particularly attractive to smaller organisations or those with limited resources. These services are typically priced on the basis of actual usage, removing the need for significant up-front capital investment, which can make them financially attractive. However, it is always important to consider the long-term costs when comparing the economics of different models.
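The point about long-term costs can be illustrated with a short worked comparison in Python. This is a sketch only: the figures below are entirely hypothetical and are not drawn from any supplier's pricing, and the function names are assumptions made for this example. It compares cumulative spend under a usage-priced hosted service with an option that requires up-front investment but has lower running costs.

```python
from typing import List, Optional


def cumulative_costs(upfront: float, annual: float, years: int) -> List[float]:
    """Cumulative spend at the end of each year, from year 1 to `years`."""
    return [upfront + annual * year for year in range(1, years + 1)]


def break_even_year(hosted: List[float], in_house: List[float]) -> Optional[int]:
    """First year in which the hosted option has cost more in total."""
    for year, (hosted_total, in_house_total) in enumerate(
        zip(hosted, in_house), start=1
    ):
        if hosted_total > in_house_total:
            return year
    return None  # the hosted option stays cheaper over the whole period


# Hypothetical figures, chosen only to illustrate the calculation.
hosted = cumulative_costs(upfront=0, annual=12_000, years=10)
in_house = cumulative_costs(upfront=40_000, annual=5_000, years=10)

year = break_even_year(hosted, in_house)
```

With these invented numbers the hosted service is cheaper for the first five years and more expensive in total from year six onwards. A real comparison would also need to account for staff time, growth in data volumes, price changes and exit costs.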

A growing number of suppliers offer a full range of digital repository services. In some cases, these have emerged to meet the needs of specific communities. For example, the UK Data Archive (UKDA)22 and the Inter-University Consortium for Political and Social Research (ICPSR)23 both provide data archive services to the international social sciences community. The international library community is served by services such as Portico, which preserves e-journals, e-books and digitised historical collections on behalf of publishers and libraries,24 and by the OCLC CONTENTdm digital collections management service.25 A number of cloud-based commercial services have also emerged, such as ArchivesDirect,26 DuraCloud27 and Preservica Cloud Edition,28 while other services address specific functions, for example Archive-It for web archiving.29

Partnership approaches

A very collaborative community has developed around digital preservation. This can include the actual provision of services, with a group of organisations that share a common set of requirements establishing a partnership to develop and share services. Given the long timeframes involved and high level of confidence required, such arrangements are generally formalised in some way, whether through a contract, consortium agreement, or less legally binding instruments such as memoranda of understanding or letters of agreement. A partnership may even be set up as a separate legal entity representing the shared interests of the partners. This might be a non-profit entity, such as a charity, trust, foundation or private company limited by guarantee.

The precise forms of non-profit organisation allowed vary from country to country, but most enjoy tax-exempt status. Alternatively, an existing legal entity might host a partnership that exists independently from its members but is not itself legally constituted. Partnerships can establish their own shared infrastructure, which may use a distributed model such as LOCKSS (see above); they can jointly procure a service from a third party; or they can designate one partner to operate the service on behalf of the others.

A partnership approach can offer significant economies of scale and allow better deals to be negotiated with suppliers than individual partners could achieve on their own. However, such partnerships depend on a strong and ongoing alignment of objectives between partners, which can be complex to establish and may require compromise. There are some excellent examples of the partnership model in practice, many based on LOCKSS technology. These include CLOCKSS (Controlled LOCKSS), an international, not-for-profit community partnership between libraries and publishers using a private network based on LOCKSS technology to provide a distributed archive for electronic scholarly content;30 the Alabama Digital Preservation Network (ADPNet), which provides a low-cost distributed LOCKSS-based digital preservation service for cultural heritage organisations in Alabama;31 and the MetaArchive Cooperative, a community-based digital repository solution, serving over 20 libraries, archives and other cultural memory institutions in three countries.32
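The value of a distributed model lies in the fact that several independently held copies can be compared to detect damage to any one of them. The sketch below illustrates that general idea in Python using a simple majority vote over checksums. It is a deliberately simplified illustration and is not the LOCKSS protocol, which is considerably more sophisticated; the function names and the node labels are assumptions made for this example.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional, Tuple


def digest(content: bytes) -> str:
    """SHA-256 checksum of a copy's content, as a hex string."""
    return hashlib.sha256(content).hexdigest()


def audit_replicas(replicas: Dict[str, bytes]) -> Tuple[Optional[str], List[str]]:
    """Compare the copies held by each partner node.

    Returns the checksum agreed by a strict majority of nodes and the
    nodes whose copies disagree with it. If no strict majority exists,
    returns None and lists every node as needing investigation.
    """
    digests = {node: digest(content) for node, content in replicas.items()}
    if not digests:
        return None, []
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(digests):
        return None, sorted(digests)
    suspect = sorted(node for node, value in digests.items() if value != winner)
    return winner, suspect


# Hypothetical partner nodes; the third copy has been silently corrupted.
copies = {
    "archive-a": b"census return 1993",
    "archive-b": b"census return 1993",
    "archive-c": b"census return 1998",
}
agreed, suspect_nodes = audit_replicas(copies)
```

Here `suspect_nodes` identifies the node whose copy should be repaired from one of the agreeing partners. With only two partners a disagreement cannot be resolved by voting, which is one reason such networks tend to hold more than two copies.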

In the context of SDGs, there are obvious opportunities for partnerships, both between national archives and national statistical agencies and between countries. Such partnerships could facilitate the establishment of shared digital preservation capabilities.

Hybrid approaches

It is always possible to adopt more than one of these approaches for different elements of the solution. Such a hybrid approach can be very flexible and cost-effective, but it may also increase complexity.

Using consultancy services

Consultancy can provide important support for organisations in understanding their preservation needs. Typically, this may involve specific projects, for example auditing existing holdings, defining requirements, developing policies and procedures, or advising on standards. To benefit from a consultancy – which can be costly – it is essential to have a clear, focused brief and to choose both the project and the consultant with great care. However, at its best such consultancy can bring an impartial and expert perspective to issues that must ultimately be implemented internally.

Implementation and operational implications

Implementing a digital preservation service

The task of establishing facilities for managing and preserving digital records and data can vary enormously, depending on the complexity of the requirements and the approach taken. In some cases, this will involve a major technical and organisational change programme; for others it will be much more straightforward. However, in all cases, success will depend on careful planning and organisation. The steps involved, from galvanising organisational support to delivering an operational service, have been described in detail elsewhere,33 and are summarised briefly here.

No initiative of such importance can succeed without first securing the organisation’s support, not only in terms of allocating the necessary resources but also of bringing about the cultural change that is essential to realising the benefits of digital preservation and ensuring its long-term viability. In turn, such support can only be achieved by first understanding the specific concerns that will trigger action for a particular organisation or in a specific context. This must also be rooted from the start in an understanding of the needs of all relevant stakeholders, from data creators to those who will use the information now and in the future. An early objective should be to develop and agree a digital preservation policy and strategy, including a commitment to regularly review and update them to address evolving issues. These policies will help to build an institutional commitment, laying the foundations for developing a detailed business case to secure the time and resources needed to make that commitment a reality.

There must be a convincing argument for the long-term value of digital records and data, based on evidence such as case studies and data about, for example, the economic or societal value of specific information or data, and the costs that would be incurred if it were to be lost. In the case of the SDGs there is ample evidence that, without reliable information, progress towards the goals cannot be measured reliably. It will also be important to consider the various available options, such as those discussed above, to identify the most appropriate and economically achievable approach, taking account of the benefits and risks, and any dependencies on other projects or activities. It will take time to create and gain approval for such a business case, but the reward will be a practical and achievable path to developing a digital preservation capability that is appropriate and proportionate to the need. The Digital Preservation Coalition has developed a toolkit that is thoroughly recommended for anyone creating a business case.34

Once the case has been approved, the next step will be to define detailed requirements, which again must be firmly rooted in the needs of stakeholders, including content creators, information managers and those with curatorial responsibilities, end users, decision-makers and funders, and technology suppliers. All key business processes will need to be modelled, showing, for instance, how information is used in the organisation. Having a comprehensive requirements catalogue is one of the most important building blocks for developing a successful digital preservation capability. A wide range of organisations have completed similar exercises. The digital preservation community is very open and collaboration-minded, and many excellent examples are available to draw upon.

The requirements catalogue should inform the design and implementation of the solution, whether this be through a commercial procurement exercise, building a solution in-house, or some form of partnership. The solution should cover all the key digital preservation functions, as outlined above, from acquisition and ingest, through metadata management and bitstream and logical preservation, to dissemination. Implementation will always involve designing, building and testing both the technologies and the procedures for using them, often through a number of iterations. In parallel, the other aspects of the service will need to be put in place, as illustrated below.
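As an illustration of what testing one of these functions might involve, the following Python sketch shows a basic fixity check of the kind that underpins bitstream preservation: checksums are recorded when records are ingested and recalculated later to confirm that nothing has changed or gone missing. It is a minimal sketch under stated assumptions, not a description of any particular product; the function names and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in chunks so that large files need little memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Record a checksum for every file under root, keyed by relative path."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every file is intact."""
    problems = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file():
            problems.append(f"missing: {relative}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {relative}")
    return problems
```

In practice the manifest would be created at ingest, stored separately from the records it describes, and verified on a regular schedule, with any reported problem triggering repair from another copy.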

Governance

Organisational and governance structures needed to deliver the service must be established, including staffing, which is discussed separately below. This must include legacy planning in the event of organisational change.

Roles and responsibilities

While the precise types and numbers of staff required to operate a digital repository will vary considerably, a number of generic roles are likely to be needed, including:

•repository manager, to oversee all key digital preservation functions, including ingest, preservation and access. This will require suitably trained curatorial staff, or a specialist digital archivist

•ingester, to manage the accession and ingest into the digital repository of individual records from start to finish, including liaising with depositors. This will usually be performed by librarians or archivists with suitable training. In smaller operations, it might be performed by the repository manager

•cataloguer, to ensure that descriptive metadata is created and captured to appropriate standards, either during or after ingest. This role will usually be undertaken by existing cataloguing staff. In some cases, it may be combined with the ingester role

•system support for users, which is often referred to as first-line support (more complex issues may be referred to second-line support, provided by system administrators or suppliers). Where possible, this should be integrated with existing IT helpdesk support. In a small organisation, it might instead be combined with the repository manager role

•system administrators, to manage IT systems and infrastructure, including second-line support, database administration, managing storage and managing user accounts. This role will normally be performed by IT staff

In all cases, some redundancy is highly desirable, to avoid any single points of failure. If a failure does occur in a digital preservation process, other processes (duplicate or otherwise) can then mitigate its impact.

Training

Although digital preservation draws on a wide range of skills, including many that will already be part of the core expertise of librarians, archivists and other information management professionals, it is a specialism in its own right and therefore needs to be supported by high-quality training in digital preservation theory and practice. Digital preservation awareness and expertise are increasingly recognised as part of information management professionals’ core skill set, and are now explicitly addressed within relevant graduate and postgraduate training courses, particularly in well-resourced countries, where most library and archives courses now cover digital preservation in some depth.

Training is, of course, also essential for existing staff who need to develop new expertise. A number of established face-to-face and online training courses are available and are highly recommended for anyone seeking to develop practical skills and knowledge. Notable examples of online training and related resources that are suitable for and accessible to an international audience (albeit primarily in English) include:

•MIT Digital Preservation Management Tutorial: currently provided by MIT Libraries, but based on the seminal training programme developed by Cornell University and subsequently ICPSR. This is an award-winning online tutorial [English, French and Italian]35

•‘Digital Preservation in a Box’: an online training toolkit developed by the US National Digital Stewardship Alliance’s Outreach Working Group, which provides a portal to a wide range of training resources [English only]36

•Digital Preservation Coalition Handbook: although not a training course, the DPC handbook is a wonderful source of guidance and good practice, as well as links to further resources [English only]37

•ICA Digital Preservation resources: the International Council on Archives, in partnership with the InterPARES project, has developed an online educational initiative called ‘Digital records pathways: topics in digital preservation’ [English only].38 Separately, and in conjunction with the International Records Management Trust, it has published two online training modules [English only]39

Policies and procedures

All the policies and procedures necessary to underpin the service need to be developed and then maintained. These should cover issues such as collections development, collections management, documentation, preservation, business continuity and access.

Conclusion

Preserving digital information is essential both for programme management and for measuring the SDGs. It requires sustained active management, which in turn necessitates specialised skills and technologies, as well as institutional will to establish and maintain new capabilities. The operational demands of digital preservation, for instance setting up governance structures and determining training needs, are just as important to consider as the technical ones. The long-term success of any service will depend on both being met appropriately.

Digital preservation is, and will continue to be, a growing challenge for organisations across the world, but it is a challenge that can and must be addressed. Practical solutions are available to suit a wide range of needs and are achievable by organisations of all sizes and with widely varying resources at their disposal. Many different approaches to implementation are possible, and it is essential to carefully choose the right option to suit the needs of the organisation. For organisations in lower resource environments, the open source, preservation-as-a-service or partnership models may offer the most practical way forward. The need for digital preservation must be addressed if the SDGs are to be met. Careful planning for this fundamental aspect of global development from the outset will have wide benefits for transparency and accountability.

1Digital Preservation Coalition, Digital Preservation Handbook, 2nd edn (2015), http://dpconline.org/handbook.

2ISO 15489-1: 2016 – Information and Documentation – Records Management – Part 1: Concepts and Principles.

3John McDonald, ‘The quality of data, statistics and records used to measure progress towards achieving the SDGs: a fictional situation analysis’, Chapter 13 in this volume.

4Based on the maturity model defined in A. Brown, Practical Digital Preservation (London: Facet, 2013), pp. 86–91.

5See https://ndsa.org//activities/levels-of-digital-preservation/.

6See http://www.eark-project.com/resources/project-deliverables/95-d75-1/file.

7See https://www.dpconline.org/our-work/dpc-ram.

8Y. Au, R. Kandalaft, M. Kuang and S. Nair, Digital Preservation and Long-Term Access Functionality (Cambridge: Judge Business School, 2010), http://www.scribd.com/doc/45412331/Cambridge-Judge-Business-School-Market-Research-Digital-Preservation.

9See http://www.archivematica.org/en/.

10See https://duraspace.org/dspace/.

11See http://www.eprints.org/uk/.

12See https://duraspace.org/fedora/.

13See http://www.lockss.org/.

14See https://demo.roda-community.org/#welcome.

15Characterisation tools (together with format identification tools) aim to automate the process of identifying the format of a digital object based on its extrinsic and intrinsic elements and by extracting metadata about its properties that are significant to its ongoing preservation.

16See http://www.nationalarchives.gov.uk/information-management/manage-information/policy-process/digital-continuity/file-profiling-tool-droid/.

17See http://jhove.openpreservation.org/.

18See http://bitcurator.net/.

19See https://github.com/internetarchive/heritrix3/wiki.

20See https://preservica.com/.

21See http://www.exlibrisgroup.com/products/rosetta-digital-asset-management-and-preservation/.

22See http://www.data-archive.ac.uk/.

23See http://www.icpsr.umich.edu/icpsrweb/.

24See http://www.portico.org/digital-preservation/.

25See http://www.oclc.org/en/contentdm.html.

26See https://duraspace.org/archivesdirect/.

27See http://www.duracloud.org/.

28See https://preservica.com/digital-archive-software/products-editions.

29See http://www.archive-it.org/.

30See https://clockss.org/.

31See http://www.adpn.org/.

32See https://metaarchive.org/.

33See, e.g., Brown, Practical Digital Preservation.

34See http://wiki.dpconline.org/index.php?title=Digital_Preservation_Business_Case_Toolkit.

35See https://dpworkshop.org.

36See https://wiki.diglib.org/index.php/NDSA:Digital_Preservation_in_a_Box.

37See https://dpconline.org/handbook.

38See http://www.ica-sae.org.

39See http://www.ica.org/en/digital-preservation-training-modules.

© authors 2020