
What are the reasons for saving disk images?

+4 votes
1,250 views
Why do digital curators save disk images of legacy
media, as opposed to just copying the files and directory structure from the media?

Originally asked on the Digital Curation Google Group.
asked Mar 7, 2014 by JMandelbaum (290 points)

4 Answers

+2 votes
There are some great reasons why disk imaging is a good default approach. I've heard several folks working with digital materials in archives talk about disk imaging as a kind of initial preservation action. You are in effect stabilizing the artifact. That is, you have legacy media (CD-Rs, floppies, flash drives, hard drives, etc.) and that media is deteriorating. The lower-level the image of the disk you create (that is, a forensic image), the closer you are to having a copy of "the real thing." This is particularly important in contexts where you are acquiring digital media as part of a personal papers collection, where you are likely to get a wide range of different kinds of media from different times with different sorts of files, and you want to leave open as many possible uses of the materials as you can.

With that said, if you are working in an area where the significant properties or objectives of a collection are much more focused, there are good arguments for eschewing disk imaging in favor of extracting the content on media according to its connection to your mission. For example, an oral history project receiving videos and audio recordings of interviews could well make the decision that it is the video and audio files on the media that they are interested in capturing and that they are not in the business of maintaining a collection of imaged DVD-Rs and CD-Rs. Sure, they will lose the ability to give their users a sense of what the DVD menu might have looked like on those DVDs, but they stand to gain a much more normalized collection of discrete A/V media files that are ostensibly easier to work with and ensure long-term access to.
answered Mar 27, 2014 by tjowens (2,360 points)
+1 vote

There are two less-obvious reasons for creating disk images as part of a curation process:

1. It can be useful to create disk images/capture the whole content of the disk when you are not sure what type of disk you have (e.g. how many tracks, what sector size, etc). 

Using devices like the KryoFlux, you can make a set of "stream" files that capture the magnetic flux of a disk. You can then run tools over the stream files to try to identify what type of disk it was, and either process the stream files into a disk image from which you can then extract the files, or process the stream directly and extract the files from it.

2. It can also be useful to create disk images when you don't have time to process the individual files on the disks. As Trevor says in the other current answer to this question, by making a disk image you can (at least) stabilize the bits and ensure that there is no further data loss due to degradation of the media. Archival workflows might be a good example of where this could be a pragmatic approach (i.e. workflows in which volume is a significant factor). 
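
To make the "stabilize the bits" idea in point 2 concrete, here is a minimal Python sketch of a byte-for-byte copy that records a SHA-256 checksum as it goes, so the image can be re-checked later. The function names `image_source` and `verify_image` are made up for illustration, and the sketch assumes the source (a raw device node or an existing file) can be read without errors. Real legacy media often has unreadable sectors, which dedicated imaging tools handle and this sketch does not.

```python
import hashlib


def image_source(source_path, image_path, block_size=512 * 1024):
    """Copy source_path byte-for-byte into image_path.

    Returns (bytes_copied, sha256_hex). Hypothetical helper: it assumes
    every read succeeds and does not retry or skip bad sectors.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            digest.update(block)
            copied += len(block)
    return copied, digest.hexdigest()


def verify_image(image_path, expected_sha256, block_size=512 * 1024):
    """Re-read the stored image and compare its checksum to the recorded one."""
    digest = hashlib.sha256()
    with open(image_path, "rb") as img:
        for block in iter(lambda: img.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest() == expected_sha256
```

Storing the checksum next to the image means a later fixity check can confirm the copy has not changed, even if the individual files are not processed until much later.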

answered May 21, 2014 by euanc (3,930 points)
0 votes
There are some great answers to this question already, but I thought I'd throw in a quick thought as well...

A proper disk image is going to save silly amounts of metadata.  It could tell researchers about lots of things -- if you have a collection that could be of interest for digital humanities research, or that researchers would want to put under intense scrutiny, a disk image is an easy way to provide that context.  It's a double-edged sword, however; if your donor saved private information somewhere, it becomes available in that disk image.
answered Jun 27, 2014 by sarah.barsness (1,250 points)
0 votes
Images that capture the entire data area of a storage medium, sometimes also including unused blocks, boot blocks, file system paraphernalia and other metadata areas, can provide detailed peeks into data that might otherwise be lost.
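
As a rough illustration of the areas a file-level copy would not carry over, the following Python sketch walks a raw image sector by sector. It reports how many sectors are entirely zero-filled and whether the first sector ends with the `0x55 0xAA` signature used by PC-style boot sectors. The function name `summarize_image` and the 512-byte sector size are assumptions for the example. Many legacy formats use other sector sizes and no such signature, and a zero-filled sector is not necessarily an unused one.

```python
def summarize_image(image_path, sector_size=512):
    """Return a small summary of a raw disk image (hypothetical helper).

    Counts sectors, counts sectors that contain only zero bytes, and
    checks the first sector for the PC boot signature at bytes 510-511.
    """
    zero_sector = bytes(sector_size)
    sectors = 0
    zero_filled = 0
    boot_signature = False
    with open(image_path, "rb") as img:
        while True:
            sector = img.read(sector_size)
            if not sector:
                break
            if sectors == 0:
                boot_signature = (
                    len(sector) >= 512 and sector[510:512] == b"\x55\xaa"
                )
            # Slice handles a short final sector in a truncated image.
            if sector == zero_sector[: len(sector)]:
                zero_filled += 1
            sectors += 1
    return {
        "sectors": sectors,
        "zero_filled": zero_filled,
        "boot_signature": boot_signature,
    }
```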

In some cases, such as old operating systems and early personal computer games in particular, copying the entire media may be the only way to ensure full preservation of the data, let alone the environment in which it lived. In other cases, it can later on provide interesting insights that nobody noticed and which would have been lost had the entire media not been imaged.

Imaging the media also, as tjowens points out, allows you to arrest degradation of the physical media itself. It also provides a nice alternative to storing a potentially large number of different physical media formats, including the equipment required to access those media artefacts. It's one thing to do something like access content stored on microfilm; it's quite another to be able to do something useful with that 8" floppy disk someone found in their father's attic; but from a preservation perspective, the data stored on the 8" floppy disk may very well have lasting value. If you can create a complete image of the floppy and store that on media that is more easily handled and migrated from if necessary, in many situations it might remove or at least ease the requirement of storing the physical artefact, and it certainly removes the need to handle the physical artefact in daily use.
answered Jul 15, 2014 by michael (400 points)
...