The Stanford directory output for the Gould collection contains the EAD plus the content and metadata files for both media and file objects:

  • M1437 Gould
    • Computer Media Photo
      • CM001.jpg
      • (etc)
    • Disk Image
      • CM001.001
      • CM001.001.csv
      • CM001.001.txt
      • (etc)
    • Display Derivatives
      • {filename}.htm
    • EAD
    • FTK xml
      • files
        • {filename}
      • Report.fo
      • Report.xml
      • Report_transformed.xml
      • Disk Image

Note that the first two directories map to objects describing the physical media and will be the source for creating the "unprocessed" collection, while the Display Derivatives and FTK xml files map to individual file content and description and will be used to create the "processed" collection.


The import/conversion process will produce this arrangement of objects in DOR:

  • Collection object
    • Series set -- "Series 1 ..."
    •    :
    • Series set -- "Series 6: Born Digital Materials"
      • Media object 1
      • Media object 2
      •    :
      • File object 1.1
      • File object 1.2
      •    :
      • File object 2.1
      • File object 2.2
      •    :

Note that even though the files originate on specific media, the "Media" objects are not sets in the DOR/Hydra sense of simple object aggregation. Instead, the file/media relationship is considered just one of many possible intellectual arrangements that can be expressed in metadata. However, a specific RELS-EXT relationship (hydra:isLocatedOn, onSourceMedia???) and a MODS <location> (for humans) express the file and media relationship, allowing this logical view:

  • Series set -- "Series 6: Born Digital Materials"
    • Media object 1
      • File object 1
      • File object 2
      •    :
    • Media object 2
    •    :
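
As a rough illustration of how the logical view above could be derived from the flat arrangement, the sketch below groups file objects by the media object each one points to. The identifiers and the (file, media) pair structure are hypothetical stand-ins; the actual relationship would come from the RELS-EXT statement on each file object.

```python
from collections import defaultdict


def group_files_by_media(file_records):
    """Group file identifiers under the media object they were found on.

    file_records is an iterable of (file_id, media_id) pairs. Both
    identifiers are hypothetical placeholders for DOR object ids.
    """
    view = defaultdict(list)
    for file_id, media_id in file_records:
        view[media_id].append(file_id)
    return dict(view)


# Flat list, as the objects sit side by side inside the series set.
records = [
    ("file-1", "media-1"),
    ("file-2", "media-1"),
    ("file-3", "media-2"),
]
logical_view = group_files_by_media(records)
```

Because the grouping is computed from metadata rather than stored as aggregation, other arrangements (for example by <type>) could be produced the same way.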

A ... while a simple <type> that maps to descriptive metadata designates the primary intellectual arrangement of the files based on the nature of their content.

...

The entries below are listed in groups of three: the element from FTK xml // Report_transformed.xml, what it maps to (all within item objects), and notes.

<filename>BU3A5</filename>

n/a

This is the original file name as it appeared on the original media.

<Item_Number>1004</Item_Number>

n/a

Internal reference only; used to disambiguate references within the FTK report.

<filepath>CM006.001/NONAME [FAT12]/[root]/BU3A5</filepath>

 

original file in FTK xml // files
display derivatives in Display Derivatives named using <Item_Number>

Take the object filepath for the fully qualified object filename from the portion after [root], up to but not including the final filename token.
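
A minimal sketch of the rule just described, assuming the "/" separator and the literal "[root]" token seen in the sample <filepath>. The function name and the second, nested path in the usage lines are hypothetical.

```python
def directory_after_root(filepath, marker="[root]"):
    """Return the directory portion of an FTK <filepath>.

    The result is everything after the marker token, up to but not
    including the final filename token. It is an empty string when the
    file sits directly under the marker.
    """
    tokens = filepath.split("/")
    if marker not in tokens:
        raise ValueError(f"no {marker} token in filepath: {filepath!r}")
    start = tokens.index(marker) + 1
    return "/".join(tokens[start:-1])


# Sample value from the report: the file sits directly under [root].
top_level = directory_after_root("CM006.001/NONAME [FAT12]/[root]/BU3A5")

# Hypothetical nested path, for illustration only.
nested = directory_after_root("CM006.001/NONAME [FAT12]/[root]/LETTERS/1988/BU3A5")
```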

<disk_image_no>CM006</disk_image_no>

descMetadata
   <mods:location> (1)

This token, taken from the head of the <filepath>, is the only data link between the FTK output for a file object and the corresponding media object. We want a data link in descriptive metadata as well as an RDF link to the corresponding object.
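
The link described here could be checked during conversion roughly as follows. This is a sketch that assumes the head of <filepath> is always the disk image file name (as in CM006.001) and that the media token is the part before the first dot; both function names are hypothetical.

```python
def disk_image_token(filepath):
    """Return the media token at the head of an FTK <filepath>."""
    head = filepath.split("/", 1)[0]   # e.g. "CM006.001"
    return head.split(".", 1)[0]       # e.g. "CM006"


def media_link_is_consistent(filepath, disk_image_no):
    """Check that <disk_image_no> agrees with the head of <filepath>."""
    return disk_image_token(filepath) == disk_image_no.strip()


sample_path = "CM006.001/NONAME [FAT12]/[root]/BU3A5"
token = disk_image_token(sample_path)
```

The resulting token is what would be written to the descriptive metadata and used to look up the media object for the RDF link.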

<filesize>35654</filesize>

 

Could conversion compare this against the locally computed file size, as a quick check prior to checksum validation?

<filesize_unit>B</filesize_unit>

 

Needed to correctly interpret <filesize>, if used
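
If the size check is adopted, it might look like the sketch below. Only the unit "B" appears in the sample record, so it is the only unit handled; any other unit raises an error rather than being guessed. The function names are hypothetical.

```python
import os

# Only "B" is attested in the sample FTK record. Other units would need
# to be added once their exact meaning in the FTK output is known.
UNIT_MULTIPLIERS = {"B": 1}


def reported_size_in_bytes(filesize, filesize_unit):
    """Interpret <filesize> using <filesize_unit> and return bytes."""
    unit = filesize_unit.strip()
    if unit not in UNIT_MULTIPLIERS:
        raise ValueError(f"unsupported filesize_unit: {unit!r}")
    return int(filesize) * UNIT_MULTIPLIERS[unit]


def size_matches(path, filesize, filesize_unit):
    """Compare the size on disk with the size reported by FTK."""
    return os.path.getsize(path) == reported_size_in_bytes(filesize, filesize_unit)
```

A mismatch here would let conversion skip the more expensive checksum computation and flag the file immediately.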

<file_creation_date>n/a</file_creation_date>

note?

 

<file_accessed_date>n/a</file_accessed_date>

note?

 

<file_modified_date>12/8/1988 6:48:48 AM (1988-12-08 14:48:48 UTC)</file_modified_date>

note?
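
Whatever the three date elements end up mapping to, conversion would need to normalize values like the one above. The sketch below assumes every populated value carries a parenthesized UTC timestamp in the format shown in the sample, and treats anything else (including "n/a") as missing. The function name is hypothetical.

```python
import re
from datetime import datetime, timezone

# Matches the parenthesized part, e.g. "(1988-12-08 14:48:48 UTC)".
_UTC_PART = re.compile(r"\((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC\)")


def parse_ftk_date(value):
    """Return a timezone-aware UTC datetime, or None if no date is given."""
    match = _UTC_PART.search(value)
    if match is None:
        return None
    naive = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone.utc)


modified = parse_ftk_date("12/8/1988 6:48:48 AM (1988-12-08 14:48:48 UTC)")
created = parse_ftk_date("n/a")
```

Using the UTC part avoids having to guess the local time zone of the leading "12/8/1988 6:48:48 AM" portion.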

 

<MD5_Hash>976EDB782AE48FE0A84761BB608B1880</MD5_Hash>

 

Used for checksum validation of a file during processing. This value will eventually be part of contentMetadata
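
A minimal sketch of the checksum validation step using Python's standard hashlib module. FTK reports the hash in uppercase while hexdigest() returns lowercase, so the reported value is normalized before comparison. The function names are hypothetical.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Compute the MD5 of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, ftk_md5):
    """Compare a locally computed MD5 with the <MD5_Hash> value."""
    return md5_hex(path) == ftk_md5.strip().lower()
```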

<restricted>False</restricted>

 

True = visible to staff only, not discoverable .... Hypatia only

<medium>5.25 inch Floppy Disks</medium>

 

Part of <location> (1)

<type>Books</type>

descMetadata
   <mods:subject>
      <mods:topic>

<topic> or <genre>? authority?

<title>The Burgess Shale and the Nature of History</title>

descMetadata
   <mods:title>

 

<filetype>WordPerfect 4.2</filetype>

note?

 

<Duplicate_File> </Duplicate_File>

 

* blank, null value or empty string - original file, not a duplicate
* "Primary" - possibly indicates Primary file to keep/store
* "Secondary" - indicates a duplicate file to be ignored
--> ignore for now

<export_path>files\BU3A5.wp</export_path>

 

The file as available for the DOR object. Note it may have a file extension added by FTK.
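
The sketch below shows one way conversion might locate the exported file and detect an extension added by FTK. It assumes <export_path> is relative to the "FTK xml" directory and uses Windows separators, as in the sample; both function names are hypothetical.

```python
from pathlib import Path, PureWindowsPath


def resolve_export_path(ftk_xml_dir, export_path):
    """Resolve a Windows-style <export_path> under the FTK xml directory."""
    parts = PureWindowsPath(export_path).parts   # ("files", "BU3A5.wp")
    return Path(ftk_xml_dir).joinpath(*parts)


def added_extension(filename, export_path):
    """Return the suffix FTK appended to the original <filename>.

    Returns "" if the names are identical, or None if the exported name
    does not start with the original name at all.
    """
    exported_name = PureWindowsPath(export_path).name
    if exported_name == filename:
        return ""
    if exported_name.startswith(filename + "."):
        return exported_name[len(filename):]
    return None


location = resolve_export_path("FTK xml", r"files\BU3A5.wp")
suffix = added_extension("BU3A5", r"files\BU3A5.wp")
```

Keeping the original <filename> and the FTK-added suffix separate lets the DOR object record the name as it appeared on the media while still pointing at the exported file.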

...