...

The Collection and Born-Digital Series objects are created first, ahead of FTK processing. All FTK-processed materials for a collection are processed together and are members of the Born-Digital Series set. Media objects must be linked to the appropriate series via an isMemberOf relationship.

Media (e.g. Disk Image) objects

FTK processing must first create a set of media objects representing the physical media (hard drive, diskette, etc.) on which the files were found. This has been described as a view of the "unprocessed" collection, meaning it has not been processed down to the individual units of content (the separate files).

...

Information from: FTK xml // Report_transformed.xml

maps to (within item objects)                                  

notes

<filename>BU3A5</filename>

n/a
<identifier type="filename">BU3A5</identifier>

this is the original file name as it appeared on the original media.

<Item_Number>1004</Item_Number>

n/a
<identifier type="ftk id">1004</identifier>

internal FTK reference only, to disambiguate references in the FTK report

<filepath>CM006.001/NONAME [FAT12]/[root]/BU3A5</filepath>

 

Location of file on original media. Everything after [root] can be taken as the fully qualified filename.
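
The rule for deriving the fully qualified filename from `<filepath>` could be sketched as below. This is only an illustration: it assumes `[root]` always appears as its own `/`-separated segment, as in the sample value, and `qualified_filename` is a hypothetical helper name, not part of FTK.

```python
def qualified_filename(filepath: str, marker: str = "[root]") -> str:
    """Return the part of an FTK <filepath> that follows the [root] segment."""
    segments = filepath.split("/")
    if marker not in segments:
        raise ValueError(f"no {marker} segment in {filepath!r}")
    # Everything after the first [root] segment is the path on the original media.
    return "/".join(segments[segments.index(marker) + 1:])


print(qualified_filename("CM006.001/NONAME [FAT12]/[root]/BU3A5"))
```

Raising an error when the marker is missing keeps unexpected path shapes visible instead of silently producing a wrong filename.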

<disk_image_no>CM006</disk_image_no>

descMetadata
   <mods:location> (1)

This token, taken from the head of the <filepath>, is the only data link between the FTK output for a file object and the corresponding media object. We want a data link in descriptive metadata as well as an RDF link to the corresponding object.
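
A possible way to take the media token from the head of `<filepath>` is sketched below. It assumes, based only on the sample values (`CM006.001/...` and `CM006`), that the token is the first path segment with everything from the first `.` onward dropped; `disk_image_token` is a hypothetical name.

```python
def disk_image_token(filepath: str) -> str:
    """Derive the media token (e.g. "CM006") from the head of an FTK <filepath>."""
    head = filepath.split("/", 1)[0]  # first path segment, e.g. "CM006.001"
    # Assumption: the suffix after the first "." is not part of the token.
    return head.split(".", 1)[0]


print(disk_image_token("CM006.001/NONAME [FAT12]/[root]/BU3A5"))
```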

<filesize>35654</filesize>

<physicalDescription>
  <extent>35654 B</extent>
</physicalDescription>

Could conversion compare this value against the locally computed file size, as a quick check prior to checksum validation?

<filesize_unit>B</filesize_unit>

 

Needed to correctly interpret <filesize>, if used
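
The quick size check suggested above might look like the following sketch. It handles only the unit `B` seen in the sample; other unit values are not documented here, so the function refuses them rather than guessing. `size_matches` is a hypothetical helper name.

```python
import os


def size_matches(path: str, filesize: str, filesize_unit: str = "B") -> bool:
    """Compare the FTK-reported <filesize> with the size of the local file."""
    if filesize_unit != "B":
        # Only bytes appear in the sample report; other units are not handled here.
        raise ValueError(f"unsupported filesize_unit: {filesize_unit!r}")
    return os.path.getsize(path) == int(filesize)
```

A mismatch here is cheap to detect and would let processing skip the more expensive checksum step for that file.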

<file_creation_date>n/a</file_creation_date>

<originInfo>
  <dateCreated>n/a</dateCreated>
</originInfo>

 

<file_accessed_date>n/a</file_accessed_date>

<originInfo>
  <dateOther>n/a</dateOther>
</originInfo>

 

<file_modified_date>12/8/1988 6:48:48 AM (1988-12-08 14:48:48 UTC)</file_modified_date>

<originInfo>
  <dateModified>12/8/1988 6:48:48 AM (1988-12-08 14:48:48 UTC)</dateModified>
</originInfo>
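
If a normalized date is wanted rather than the raw FTK string, the parenthesized UTC portion could be extracted as in this sketch. It assumes every dated value follows the pattern of the sample, `... (YYYY-MM-DD HH:MM:SS UTC)`, and treats anything else (such as `n/a`) as missing. `ftk_date_to_iso` is a hypothetical name.

```python
import re
from datetime import datetime, timezone

# Matches the "(1988-12-08 14:48:48 UTC)" portion of an FTK date value.
_UTC_PART = re.compile(r"\((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC\)")


def ftk_date_to_iso(value: str):
    """Return the UTC timestamp as an ISO 8601 string, or None if absent."""
    match = _UTC_PART.search(value)
    if match is None:
        return None  # e.g. "n/a"
    parsed = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc).isoformat()


print(ftk_date_to_iso("12/8/1988 6:48:48 AM (1988-12-08 14:48:48 UTC)"))
```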

 

<MD5_Hash>976EDB782AE48FE0A84761BB608B1880</MD5_Hash>

 

Used for checksum validation of a file during processing. This value will eventually be part of contentMetadata, but probably not as a value transferred from here.
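
Checksum validation against `<MD5_Hash>` could be done along these lines. The sketch reads the exported file in chunks and compares hex digests case-insensitively, since the sample hash is upper case while Python's `hexdigest()` returns lower case. `md5_matches` is a hypothetical helper name.

```python
import hashlib


def md5_matches(path: str, expected_hex: str, chunk_size: int = 65536) -> bool:
    """Check a local file against the MD5 value reported by FTK."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().lower() == expected_hex.strip().lower()
```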

<restricted>False</restricted>

 

True = visible to staff only and not discoverable; Hypatia only.

<type>Books</type>

descMetadata
 <typeOfResource>sound recording-nonmusical</typeOfResource>

Open question: map to <typeOfResource>, <topic>, or <genre>? Which authority?

<title>The Burgess Shale and the Nature of History</title>

descMetadata
   <relatedItem displayLabel="Appears in" type="host">
      <titleInfo>
         <title>The Burgess Shale and the Nature of History</title>
      </titleInfo>
   </relatedItem>

This is not the title of the file or the file content directly, but the author's title to which the file relates.

<filetype>WordPerfect 4.2</filetype>

descMetadata
   <mods:note displayLabel="File type">

 

<Duplicate_File> </Duplicate_File>

 

* blank, null value or empty string - file is unique in collection, no duplicates
* "M" - The main file in a duplicate relationship. Neither better nor worse than the duplicate file, but simply the file examined first.
* "D" - indicates a duplicate file.

Note that this is content duplication, determined by matching checksums; name conflicts are a separate issue and are handled another way. The two files may or may not have the same name. Each record should carry a note and/or relationship indicating that a duplicate file exists in the collection. Details TBD.
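
The three `<Duplicate_File>` values and the checksum-based pairing could be handled as in the sketch below. The record layout (dicts keyed by `filename` and `MD5_Hash`) and both function names are assumptions made for illustration; how the relationship is finally recorded is still to be decided.

```python
from collections import defaultdict


def duplicate_status(value):
    """Translate a <Duplicate_File> value into a readable status."""
    code = (value or "").strip()
    if code == "":
        return "unique"      # blank, null or empty: no duplicates in collection
    if code == "M":
        return "main"        # the file examined first in a duplicate relationship
    if code == "D":
        return "duplicate"
    raise ValueError(f"unexpected Duplicate_File value: {value!r}")


def group_by_checksum(records):
    """Group filenames that share an MD5 value (hypothetical record layout)."""
    groups = defaultdict(list)
    for record in records:
        groups[record["MD5_Hash"].upper()].append(record["filename"])
    # Keep only checksums shared by more than one file.
    return {md5: names for md5, names in groups.items() if len(names) > 1}
```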

<export_path>files\BU3A5.wp</export_path>

 

The file as saved by FTK for further processing.

(implied)

RELS-EXT
   isMemberOf

A link to the Media object
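
As a rough illustration of the implied link, the sketch below builds a RELS-EXT RDF fragment with an isMemberOf relationship. It assumes a Fedora-style repository, using the Fedora relations-external namespace and `info:fedora/` URIs; the PIDs are placeholders, and `rels_ext` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: Fedora's relations-external ontology supplies isMemberOf.
REL_NS = "info:fedora/fedora-system:def/relations-external#"


def rels_ext(item_pid: str, media_pid: str) -> str:
    """Serialize RDF stating that the item object isMemberOf the media object."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("rel", REL_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": f"info:fedora/{item_pid}"},
    )
    ET.SubElement(
        description,
        f"{{{REL_NS}}}isMemberOf",
        {f"{{{RDF_NS}}}resource": f"info:fedora/{media_pid}"},
    )
    return ET.tostring(root, encoding="unicode")


# Placeholder PIDs, for illustration only.
print(rels_ext("example:item1", "example:media1"))
```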

...