
Best SIP / AIP creation practices for optical carriers that span multiple volumes

0 votes
427 views

I’m trying to figure out a general SIP/AIP architecture for optical media images. For CD-ROMs the images will typically be ISO 9660 files, and for audio CDs there will be one audio file per track.

It’s pretty common to have carriers that span multiple volumes, and from an access point of view I think it would make sense to combine those cases into a composite SIP/AIP. Following a discussion I had about this with a colleague, I’m curious how other institutions are handling this, and whether there are any best practices. Below is some information I found myself.

Library of Congress Audio Compact Disc METS Profile:

http://www.loc.gov/standards/mets/profiles/00000007.html

This explicitly takes into account the possibility of composite objects:

The primary physical component of a compactDiscObject is one or more compact discs. (…) When there is more than one disc, the div TYPE=“cd:disc” elements must occur in document order that corresponds to the physical order of the discs (Disk 1, Disk 2, etc.).
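As a rough illustration of that ordering rule, here is a minimal Python sketch that builds a METS structMap with one div TYPE="cd:disc" per disc, in physical order. Only the "cd:disc" type comes from the quoted text; the wrapper div TYPE="cd:compactDiscObject", the ORDER and LABEL values and the function name build_struct_map are my own assumptions, and the result is not a complete or validated instance of the profile.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def build_struct_map(disc_labels):
    """Return a mets:structMap with one cd:disc div per disc, in list order."""
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    # Assumption: one wrapper div stands for the composite object as a whole.
    root_div = ET.SubElement(
        struct_map, f"{{{METS_NS}}}div", {"TYPE": "cd:compactDiscObject"}
    )
    # Document order follows the physical order of the discs (disc 1, disc 2, ...).
    for order, label in enumerate(disc_labels, start=1):
        ET.SubElement(
            root_div,
            f"{{{METS_NS}}}div",
            {"TYPE": "cd:disc", "ORDER": str(order), "LABEL": label},
        )
    return struct_map


struct_map = build_struct_map(["Disc 1", "Disc 2"])
print(ET.tostring(struct_map, encoding="unicode"))
```

In a real METS file each disc div would also carry fptr elements pointing at the image or track files; those are left out here to keep the sketch short.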

The virtual CD-ROM and floppy collection of Indiana University seems to take a similar approach, judging by e.g. this METS file:

http://webapp1.dlib.indiana.edu/virtual_disk_library/index.cgi/4252478/mets

An alternative approach would be to represent each carrier as one SIP / AIP, and then aggregate the AIPs that belong together using what the OAIS model refers to as an Archival Information Collection (AIC). However, I’m not sure how widely used this approach is, and I’m slightly worried it might later complicate things on the access side. But I’d be interested to hear what others are doing.
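To make the alternative a bit more concrete, here is a small Python sketch of one AIP per carrier plus an AIC that lists its members in volume order. The class names, fields and identifiers (AIP, AIC, member_ids, "aip-disc1" and so on) are hypothetical and not taken from OAIS or from any repository system. It mainly shows the extra lookup step that access would need.

```python
from dataclasses import dataclass, field


@dataclass
class AIP:
    identifier: str  # hypothetical identifier scheme
    files: list[str]


@dataclass
class AIC:
    identifier: str
    title: str
    # Member AIP identifiers, kept in volume order.
    member_ids: list[str] = field(default_factory=list)


def resolve(aic, store):
    """Return the member AIPs of an AIC in volume order."""
    return [store[aip_id] for aip_id in aic.member_ids]


# A toy AIP store, deliberately not in volume order.
store = {
    "aip-disc2": AIP("aip-disc2", ["disc2.iso"]),
    "aip-disc1": AIP("aip-disc1", ["disc1.iso"]),
}
aic = AIC("aic-0001", "Example two-volume CD-ROM set", ["aip-disc1", "aip-disc2"])

files = [name for aip in resolve(aic, store) for name in aip.files]
print(files)
```

The access-side cost I'm worried about is visible in resolve: presenting the complete set means following the AIC to each member AIP, instead of opening a single package.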

asked Jun 22, 2016 by johanvanderknijff (2,060 points)

1 Answer

+2 votes
In our audio digitization projects, it's common to have a single intellectual object span multiple pieces of media; for example, a symphony recorded across the front and back of one or more vinyl records. Each face of a record is captured to its own preservation master file. In addition, we create an edit master from each preservation master for access purposes.

In JSON terms it looks like this (pm = preservation master, em = edit master).

{
 "symphony1": {
  "record1": {
   "face1": ["record1_face1_pm.wav", "record1_face1_em.wav"],
   "face2": ["record1_face2_pm.wav", "record1_face2_em.wav"]
  },
  "record2": {
   "face1": ["record2_face1_pm.wav", "record2_face1_em.wav"],
   "face2": ["record2_face2_pm.wav", "record2_face2_em.wav"]
  }
 }
}

In our digitization workflows we ask for the files from each piece of media to be stored in their own bag. In the example, we would receive 2 bags (record1 and record2) that we treat as SIPs. Our repository is Fedora-based, so each file set (preservation master, edit master, other derivatives) is stored as one Fedora object, and larger intellectual objects are aggregated through additional Fedora objects. I'm still mapping how this structure fits the OAIS AIP and AIC concepts.
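As a rough sketch of the aggregation step, the Python below groups the files from per-record bags into a record / face / role structure for one intellectual object. It is not our actual ingest code: the bags are plain dictionaries rather than real BagIt bags, the function name aggregate is made up, and the file naming pattern is only inferred from the example file names.

```python
import re
from collections import defaultdict

# Assumed naming convention, inferred from the example: <record>_<face>_<role>.wav
NAME_RE = re.compile(r"^(record\d+)_(face\d+)_(pm|em)\.wav$")


def aggregate(bags):
    """Map {bag name: [file names]} to {record: {face: {role: file name}}}."""
    tree = defaultdict(lambda: defaultdict(dict))
    for bag_name, file_names in bags.items():
        for name in file_names:
            match = NAME_RE.match(name)
            if match is None:
                raise ValueError(f"unexpected file name in {bag_name}: {name}")
            record, face, role = match.groups()
            # Each bag should only hold files from its own piece of media.
            if record != bag_name:
                raise ValueError(f"{name} does not belong in bag {bag_name}")
            tree[record][face][role] = name
    # Convert the nested defaultdicts to plain dicts before returning.
    return {rec: {f: dict(r) for f, r in faces.items()} for rec, faces in tree.items()}


bags = {
    "record1": ["record1_face1_pm.wav", "record1_face1_em.wav",
                "record1_face2_pm.wav", "record1_face2_em.wav"],
    "record2": ["record2_face1_pm.wav", "record2_face1_em.wav",
                "record2_face2_pm.wav", "record2_face2_em.wav"],
}
symphony1 = aggregate(bags)
print(symphony1["record2"]["face1"]["pm"])
```

Each innermost dictionary corresponds to one file set, and the outer levels to the aggregating objects.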
answered Jun 22, 2016 by nkrabben (1,990 points)