
Ingesting large & hybrid digital collections

Dear all,
 
For our web archaeology project The Digital City Revives [1], we are looking for answers to a complex set of questions. Over the years, we have gathered a considerable number of heterogeneous files related to the original websites. Our next question is how to keep this material together and manageable in a logical way.
 
To better understand the necessary approach, we are looking for concrete examples and use cases: descriptions of how archives or museums have ingested heterogeneous digital collections. We hope to find answers to questions such as:
 
- to what extent do the provenance and original order of the materials, in the state we found them in, matter?
- what descriptions are out there of organisations that have ingested these kinds of heterogeneous (dare we say, messy) collections?
- to what extent do these organisations describe the various contents, and at what (unit) level do they do it?
- how do these organisations inventory the dependencies for this variety of materials?
 
If you’d be able to point us to some concrete examples (or, potentially, references) off the top of your head, we’d already be grateful. With warm regards and many thanks in advance for your help!
 
Erwin Verbruggen
Netherlands Institute for Sound and Vision
 
asked May 18, 2017 by erwinverb (150 points)

1 Answer

+1 vote

My institution has some experience with larger, heterogeneous collections, though our local practice is still evolving as we add larger and more complex collections to our holdings.  I only have one link I can send your way as an example -- a collection of digital files relating to a redistricting project: http://www2.mnhs.org/library/findaids/gr00558.xml.  This finding aid dates back to 2012 and is already a bit outdated for what we plan to do with new/incoming collections, but I think it gives you the general idea.

As you can see from that example, we (generally) rely heavily on original order, and use our descriptive tools (generally EAD finding aids) to help researchers find what they're looking for.  When there are dependencies between multiple digital objects, we keep them all together in a single .zip download rather than separating them out.  The level of description varies in our collections; we previously described every file (as in the above example), but with newer and more homogeneous collections (such as large collections of digital photographs) we tend to describe them at the series level.
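
For readers who want to try the "keep dependent objects together in one .zip" idea, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a description of any institution's actual workflow: the helper name `bundle_objects` and the manifest file name `manifest-sha256.txt` are hypothetical. It preserves the relative paths (and so the original order) of the files and adds a checksum list.

```python
import hashlib
import zipfile
from pathlib import Path


def bundle_objects(source_dir, zip_path):
    """Hypothetical helper: zip every file under source_dir, keeping the
    relative paths (original order) and adding a SHA-256 manifest."""
    source = Path(source_dir)
    zip_path = Path(zip_path)
    manifest_lines = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # Skip directories, and the bundle itself if it lives in source_dir.
            if not path.is_file() or path.resolve() == zip_path.resolve():
                continue
            relative = path.relative_to(source).as_posix()
            # Reads each file fully into memory; fine for a sketch,
            # but large files would need chunked hashing.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            archive.write(path, arcname=relative)
            manifest_lines.append(f"{digest}  {relative}")
        # The manifest name is an assumption made for this example.
        archive.writestr("manifest-sha256.txt", "\n".join(manifest_lines) + "\n")
    return manifest_lines
```

Calling `bundle_objects("redistricting_files", "redistricting_files.zip")` (both names are placeholders) would produce a single download in which interdependent files stay side by side, which a finding aid could then describe at the file or series level.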

I hope that provides some help!

answered May 22, 2017 by sarah.barsness (1,230 points)