PLEDGE AIP Prototype
Warning |
---|
This page and all code described on it are now OBSOLETE. They have been replaced by the AipBackupRestore feature, which will first be released in DSpace 1.7.0 |
This page describes a prototype AIP implementation planned as part of the PLEDGE project. Since the PLEDGE project only needs AIPs to replicate them under the direction of a policy engine, it was not necessary to create an AIP-based asset store.
...
mets element
- @PROFILE: fixed value="http://www.dspace.org/schema/aip/1.0/mets.xsd" (this is how we identify an AIP manifest)
- @OBJID: URN-format persistent identifier (Handle) if available, or else a unique identifier.
- @LABEL: title if available
- @TYPE: DSpace object type, one of "DSpace ITEM", "DSpace COLLECTION", "DSpace COMMUNITY".
- @ID: a globally unique identifier, such as dspace67075091976862014717971209717749394363.
  - @IDs should be used wherever available. I'll put a note about forming IDs in the profile spec.
  - [OK, but how unique does the ID have to be, just within the document or amongst all other AIP documents? --lcs]
  - I can't imagine a scenario where we would reference an ID within a METS document from outside that document, except perhaps for the mets@ID. I would say the mets@ID should be unique amongst all AIP documents, but the other IDs should just be unique within the document.
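As a rough illustration of the root attributes listed above, here is a minimal Python sketch that builds a bare mets element. The function name, the "hdl:" prefix used for the URN form of the Handle, and the use of a UUID to produce the "dspace" + digits identifier are all assumptions made for the example; the prototype may form these values differently.

```python
import uuid
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
AIP_PROFILE = "http://www.dspace.org/schema/aip/1.0/mets.xsd"


def build_mets_root(object_type, handle=None, title=None):
    """Return a <mets> element carrying the root attributes of an AIP manifest."""
    if object_type not in ("ITEM", "COLLECTION", "COMMUNITY"):
        raise ValueError("unsupported object type: " + object_type)
    # Assumption: any globally unique token is acceptable for @ID.
    unique = "dspace" + str(uuid.uuid4().int)
    root = ET.Element("{%s}mets" % METS_NS)
    root.set("PROFILE", AIP_PROFILE)
    # Assumption: "hdl:" is the URN-style prefix; fall back to the unique ID.
    root.set("OBJID", "hdl:" + handle if handle else unique)
    if title:
        root.set("LABEL", title)
    root.set("TYPE", "DSpace " + object_type)
    root.set("ID", unique)
    return root
```

Calling `build_mets_root("ITEM", handle="1721.1/4321", title="Example")` yields a root whose @TYPE is "DSpace ITEM" and whose @OBJID is derived from the Handle.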
mets/metsHdr element
- @CREATEDATE: timestamp that AIP was created.
- @LASTMODDATE: last-modified date on Item, or nothing for other objects.
  - METS defines these attributes as describing the METS document itself; we use them to describe the AIP, which we sometimes think of as the METS document but more often think of as the 'package', i.e. the METS document and all the files. I don't have a problem with the use Larry put forth, but we need to mention it in a profile. I wonder if these dates shouldn't rather be in a techMD section, or maybe both.
- agent element: @ROLE = "CUSTODIAN", @TYPE = "OTHER", @OTHERTYPE = "DSpace Archive", name = Site handle.
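The header described above could be assembled roughly as follows. This is only a sketch: the function name and the ISO 8601 timestamp format are assumptions, and the site handle passed in the usage note is a made-up value.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

METS_NS = "http://www.loc.gov/METS/"


def build_mets_hdr(site_handle, created, last_modified=None):
    """Return a <metsHdr> with the custodian agent named by the site handle."""
    hdr = ET.Element("{%s}metsHdr" % METS_NS)
    hdr.set("CREATEDATE", created.isoformat())
    # Only Items have a last-modified date; other objects omit the attribute.
    if last_modified is not None:
        hdr.set("LASTMODDATE", last_modified.isoformat())
    agent = ET.SubElement(
        hdr,
        "{%s}agent" % METS_NS,
        {"ROLE": "CUSTODIAN", "TYPE": "OTHER", "OTHERTYPE": "DSpace Archive"},
    )
    name = ET.SubElement(agent, "{%s}name" % METS_NS)
    name.text = site_handle
    return hdr
```

For example, `build_mets_hdr("1721.1/0", datetime.now(timezone.utc))` produces a header for a Collection or Community, with no @LASTMODDATE.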
mets/dmdSec element
- object's descriptive metadata crosswalked to MODS (or whatever the METS default is)
  - See link to RW's Comments Page below for notes on use of MODS
- object's descriptive metadata in DSpace native DIM intermediate format, to serve as a complete and precise record for restoration or ingestion into another DSpace.
- We should require mets/dmdSec@OTHERMDTYPE if @MDTYPE = "OTHER"
- When the mdWrap @TYPE value is OTHER, the element MUST include a value for the @OTHERTYPE attribute which names the crosswalk that produced (or interprets) that metadata, e.g. AIP-TECHMD.
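The requirement that "OTHER" metadata must name its crosswalk lends itself to a simple check. The sketch below is one possible way to do it, using the @MDTYPE / @OTHERMDTYPE attribute spelling from the first form of the rule above; the function name is invented for the example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def missing_other_mdtype(mets_root):
    """Return mdWrap elements that declare MDTYPE="OTHER" without OTHERMDTYPE."""
    bad = []
    for md_wrap in mets_root.iter("{%s}mdWrap" % METS_NS):
        if md_wrap.get("MDTYPE") == "OTHER" and not md_wrap.get("OTHERMDTYPE"):
            bad.append(md_wrap)
    return bad
```

An empty result means every "OTHER" section names the crosswalk that produced it.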
mets/amdSec element - admin (technical, source, rights, and provenance) metadata for the entire archival object.
- rightsMD elements of the following TYPEs:
  - DSpaceDepositLicense: if the object has a deposit license, it is contained here.
  - CreativeCommonsRDF: if the object is an Item with a Creative Commons license expressed in RDF, it is included here.
  - CreativeCommonsText: if the object is an Item with a Creative Commons license in plain text, it is included here.
- sourceMD elements - recorded twice, once in DSpace native format, once in PREMIS:
  - NOTE: PREMIS is only implemented for Bitstreams at the moment, and for the foreseeable future.
  - DSpace native format: MDTYPE="OTHER" OTHERMDTYPE="AIP-TECHMD" (see Crosswalks section below for details)
  - PREMIS expression of this technical metadata for archival object. (To be done later.)
  - RW:Comment [AIP Object (Item-Collection-Community)-specific Metadata in PREMIS] To see an example of the PREMIS version of this metadata, see link to RW Comments section page below
- digiprovMD
  - When History data is available, includes a section of TYPE="DSpaceHistory" containing an RDF/XML rendition of the history data for the object. For internal AIPs, the history is stored in an external bitstream in the asset store; for self-contained packages it is a file in the package.
mets/amdSec elements - technical metadata for each of an Item's Bitstreams, both in PREMIS and DIM formats
- techMD element - PREMIS technical metadata, expanded from SIP, for each of an Item's Bitstreams.
- sourceMD element, type is AIP-TECHMD.
  - Bitstream-specific metadata not all of which is explicitly encoded in PREMIS, i.e.
    - name (dc.title)
    - description (dc.description)
    - userFormatDescription (dc.format)
    - BitstreamFormat, including short name, MIME type, extension. (dc.format.medium)
  - RW:Comment: Bitstream Technical Metadata
    - Why are we recording the file format support status? That's a DSpace property, rather than an Item property. Do DSpace instances rely on objects to tell them their support status?
    - Format support and other properties of the BitstreamFormat are recorded here in case the Item is restored in an empty DSpace that doesn't have that format yet, and the relevant bits of the format entry have to be reconstructed from the AIP. --lcs
    - To see an example of the changes to the PREMIS version of this metadata, see link to RW Comments section page below
mets/fileSec element
- For archival objects of type ITEM:
  - Each distinct Bundle in an Item goes into a fileGrp.
    - Did the "ORIGINAL" bundle get renamed "CONTENT"?
    - [Not in DSpace 1.4. @USE is set to the exact Bundle name in an AIP. --lcs]
  - Bitstreams in bundles become file elements under fileGrp.
  - file/@SEQ contains the Bitstream sequence ID
  - file@CREATED and file@SIZE
    - The DSpace SIP calls for the use of @CREATED for the file element. AIP examples do not use @CREATED, but do use @SIZE, which is not recommended by SIP.
    - [Since Bitstreams don't have any dates (neither created nor last-modified) the @CREATED cannot be set on dissemination. --lcs]
  - mets/fileSec/fileGrp/file element
    - Set @SIZE to length of the bitstream. There is a redundant value in the techMD but it is more accessible here.
    - Set @MIMETYPE, @CHECKSUM, @CHECKSUMTYPE to corresponding bitstream values. There is redundant info in the techMD.
    - Set @SEQ to bitstream's SequenceID if it has one.
- For archival objects of types COLLECTION and COMMUNITY:
  - Only if the object has a logo bitstream, there is a fileSec with one fileGrp child of @TYPE="LOGO".
  - The fileGrp contains one file element, representing the logo Bitstream. It has the same file format, checksum, etc. fields as the Item content bitstreams, but does not include metadata section references or a SequenceID.
  - See the main structMap for the reference to this file.
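To make the file-level attributes above concrete, here is a small Python sketch of a file element and its enclosing fileGrp. The helper names are invented, and MD5 is assumed as the checksum algorithm purely for illustration; the real values would come from the stored Bitstream record rather than being recomputed.

```python
import hashlib
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def build_file_element(content, mime_type, sequence_id=None):
    """Return a <file> element describing one bitstream's bytes."""
    file_el = ET.Element("{%s}file" % METS_NS)
    file_el.set("SIZE", str(len(content)))
    file_el.set("MIMETYPE", mime_type)
    # Assumption: MD5 stands in for whatever checksum the bitstream carries.
    file_el.set("CHECKSUM", hashlib.md5(content).hexdigest())
    file_el.set("CHECKSUMTYPE", "MD5")
    # Logo bitstreams have no SequenceID, so @SEQ is optional.
    if sequence_id is not None:
        file_el.set("SEQ", str(sequence_id))
    return file_el


def build_file_grp(bundle_name, file_elements):
    """Return a <fileGrp> whose @USE is the exact Bundle name."""
    grp = ET.Element("{%s}fileGrp" % METS_NS, {"USE": bundle_name})
    grp.extend(file_elements)
    return grp
```

An Item bundle would be built with something like `build_file_grp("ORIGINAL", [build_file_element(data, "application/pdf", 1)])`.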
mets/structMap - Primary structure map, @LABEL="DSpace Object", @TYPE="LOGICAL"
- For COLLECTION objects: Top-level div has one child: div with @TYPE="MEMBERS". For every Item in the Collection, it contains a div with an mptr linking to the Handle of that Item. Its @LOCTYPE="HANDLE", and @xlink:href value is the raw Handle.
...
- If Community has a Logo bitstream, there is an fptr reference to it in the very first div.
- ITEM objects have the same kind of simple structure map as SIP/DIP: top level div with a div under it for each visible Bitstream.
  - If Item has primary bitstream, put it in first structMap/div/fptr.
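The COLLECTION case of the primary structure map can be sketched as follows. This is an illustration only; the function name is made up, and the example assumes the standard METS and XLink namespace URIs.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def build_collection_struct_map(item_handles):
    """Return the primary structMap listing a Collection's member Items."""
    struct_map = ET.Element(
        "{%s}structMap" % METS_NS,
        {"LABEL": "DSpace Object", "TYPE": "LOGICAL"},
    )
    top = ET.SubElement(struct_map, "{%s}div" % METS_NS)
    members = ET.SubElement(top, "{%s}div" % METS_NS, {"TYPE": "MEMBERS"})
    for handle in item_handles:
        # One div per member Item, each pointing at the Item's raw Handle.
        item_div = ET.SubElement(members, "{%s}div" % METS_NS)
        ET.SubElement(
            item_div,
            "{%s}mptr" % METS_NS,
            {"LOCTYPE": "HANDLE", "{%s}href" % XLINK_NS: handle},
        )
    return struct_map
```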
mets/structMap - Structure Map to indicate object's Parent
- Contains one div element which has the unique attribute value TYPE="AIP Parent Link" to identify it as the holder of the parent pointer.
  - It contains an mptr element whose xlink:href attribute value is the raw Handle of the parent object, e.g. 1721.1/4321.

In order to restore a DSpace archive from internal AIPs in the asset store, the parent of each object must be available at the surface level of the METS document so the object can be instantiated under its correct parent before the metadata (which may also name the parent) is crosswalked.
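A restore process would need to read the parent Handle back out of the manifest before touching any other metadata. A minimal sketch of that lookup, assuming the standard METS and XLink namespaces and an invented function name:

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def find_parent_handle(mets_root):
    """Return the raw Handle of the parent object, or None if absent."""
    for div in mets_root.iter("{%s}div" % METS_NS):
        if div.get("TYPE") == "AIP Parent Link":
            mptr = div.find("{%s}mptr" % METS_NS)
            if mptr is not None:
                return mptr.get("{%s}href" % XLINK_NS)
    return None
```

Because the div is identified by its TYPE value alone, the lookup does not depend on where the parent structMap appears in the document.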
Rob Wolfe's Comments on METS Usage
Crosswalks
DIM Descriptive Elements for Collection objects
Metadata Field | getMetadata() key |
---|---|
dc.description | introductory_text |
dc.description.abstract | short_description |
dc.description.tableofcontents | side_bar_text |
dc.identifier.uri | getHandle(); |
dc.provenance | provenance_description |
dc.rights | copyright_text |
dc.rights_license | copyright_text |
dc.title | name |
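The Collection table above is essentially a lookup from DIM field to getMetadata() key, so the crosswalk can be pictured as a dictionary walk. The sketch below is hypothetical: it takes the getMetadata() values as a plain dict, passes the Handle through unchanged, and skips empty values, none of which is confirmed by the prototype.

```python
# DIM field -> getMetadata() key, copied from the Collection table.
COLLECTION_DIM_KEYS = {
    "dc.description": "introductory_text",
    "dc.description.abstract": "short_description",
    "dc.description.tableofcontents": "side_bar_text",
    "dc.provenance": "provenance_description",
    "dc.rights": "copyright_text",
    "dc.rights_license": "copyright_text",
    "dc.title": "name",
}


def crosswalk_collection(metadata, handle):
    """Map a Collection's getMetadata() values onto DIM field names."""
    # dc.identifier.uri comes from getHandle() rather than getMetadata().
    dim = {"dc.identifier.uri": handle}
    for field, key in COLLECTION_DIM_KEYS.items():
        value = metadata.get(key)
        if value:  # assumption: empty values are simply left out
            dim[field] = value
    return dim
```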
...
Metadata Field | getMetadata() key |
---|---|
dc.description | introductory_text |
dc.description.abstract | short_description |
dc.description.tableofcontents | side_bar_text |
dc.identifier.uri | getHandle(); |
dc.rights | copyright_text |
dc.title | name |
...
Metadata Field | method and comments |
---|---|
dc.contributor | getSubmitter().getEmail() |
dc.identifier.uri | getHandle() |
dc.relation.isPartOf | getOwningCollection().getHandle() as URN |
dc.relation.isReferencedBy | getCollections() Handle URN of each non-owner |
dc.rights.accessRights | isWithdrawn() "WITHDRAWN" if true |
...
Metadata Field | method and comments |
---|---|
dc.title | getName() |
dc.title.alternative | getSource() |
dc.description | getDescription() |
dc.format | getUserFormatDescription() |
dc.format.medium | getFormat().getShortDescription() |
dc.format.mimetype | getFormat().getMIMEType() |
dc.format.supportlevel | getFormat().getSupportLevel() |
dc.format.internal | getFormat().isInternal() |
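The Bitstream table above can be read as a flat record built from the Bitstream and its BitstreamFormat. The following sketch uses two small dataclasses as stand-ins for the DSpace objects; the class and field names are invented, and rendering the support level and internal flag as strings is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BitstreamFormat:
    """Hypothetical stand-in for the DSpace BitstreamFormat object."""
    short_description: str
    mime_type: str
    support_level: int
    internal: bool


@dataclass
class Bitstream:
    """Hypothetical stand-in for the DSpace Bitstream object."""
    name: str
    source: str
    description: str
    user_format_description: str
    format: BitstreamFormat


def bitstream_techmd(bitstream):
    """Return the AIP-TECHMD fields for one Bitstream as a dict."""
    fmt = bitstream.format
    return {
        "dc.title": bitstream.name,
        "dc.title.alternative": bitstream.source,
        "dc.description": bitstream.description,
        "dc.format": bitstream.user_format_description,
        "dc.format.medium": fmt.short_description,
        "dc.format.mimetype": fmt.mime_type,
        # Assumption: non-string values are serialized as plain text.
        "dc.format.supportlevel": str(fmt.support_level),
        "dc.format.internal": str(fmt.internal).lower(),
    }
```

Recording the format properties alongside the Bitstream is what lets a restore into an empty DSpace rebuild a missing format entry.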
...
Metadata Field | method and comments |
---|---|
dc.identifier.uri | getHandle() |
dc.relation.isPartOf | getCommunities()[0] |
dc.relation.isReferencedBy | getCommunities()[1] |
Metadata Field | method and comments |
---|---|
dc.identifier.uri | getHandle() |
dc.relation.isPartOf | getParentCommunity() |
...
These are examples of internal AIPs for some representative DSpace objects:
...