...
Figure 1: Fedora 4 data model for ResData
...
Classes
The ResData Fedora 4 data model is an adaptation of the PCDM model, integrated with a customised version of the ANDS VITRO ontology. The resulting ontology consists mainly of the following classes:
Activities, Datasets, Parties (pcdm:Collection)
Activities, Datasets, and Parties are Fedora 4 container nodes of pcdm:Collection type, mainly intended to enable grouping of the three main ResData resource types, i.e. Activity, Dataset and Party. Fedora 4 URI structures for these pcdm:Collection containers are listed below:
Container name | URL |
Activities | /rest/activities |
Datasets | /rest/datasets |
Parties | /rest/parties |
Dataset (VITRO-ANDS:ResearchData, pcdm:Object)
The ResearchData class from the ANDS VITRO ontology is used to define the Dataset resource type in ResData. In the Fedora 4 model for ResData, all instances of the ResearchData class are also defined as nodes of pcdm:Object type, with a number of data properties containing descriptive metadata and object properties containing references to other related ResData resources, such as Activity (vivo:ResearchActivity), Party (foaf:Person) and other Dataset resources. Figure 2 below illustrates the combined use of the pcdm:Object and VITRO-ANDS:ResearchData classes to represent various ResData resource types.
Figure 2: ResData Dataset resource defined as pcdm:Object
Fedora 4 URI structures for ResData Dataset-related nodes are as below:
Description | URL |
Dataset | /rest/datasets/[dataset pairtree id] |
Access | /rest/datasets/[dataset pairtree id]/access |
Licence | /rest/datasets/[dataset pairtree id]/licence |
Methodology | /rest/datasets/[dataset pairtree id]/methodology |
Time Period | /rest/datasets/[dataset pairtree id]/timePeriod |
Retention Period | /rest/datasets/[dataset pairtree id]/retentionPeriod |
Subject | /rest/datasets/[dataset pairtree id]/subject |
Publication | /rest/datasets/[dataset pairtree id]/publication |
GEO | /rest/datasets/[dataset pairtree id]/geo |
Rights | /rest/datasets/[dataset pairtree id]/rights |
Storage | /rest/datasets/[dataset pairtree id]/storage |
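The URI patterns in the table above can be composed programmatically. The following is a minimal sketch only, not part of the migration tooling: the host `http://localhost:8080` and the function name `dataset_uri` are assumptions made for illustration, while the child node names are copied from the table.

```python
from typing import Optional

# Child node names copied from the Dataset URI table.
DATASET_CHILDREN = (
    "access", "licence", "methodology", "timePeriod", "retentionPeriod",
    "subject", "publication", "geo", "rights", "storage",
)


def dataset_uri(base: str, pairtree_id: str, child: Optional[str] = None) -> str:
    """Build the URI of a Dataset node or of one of its child nodes.

    `base` is the repository host (hypothetical in the example below) and
    `pairtree_id` is the pairtree identifier path of the dataset.
    """
    uri = f"{base.rstrip('/')}/rest/datasets/{pairtree_id.strip('/')}"
    if child is None:
        return uri
    if child not in DATASET_CHILDREN:
        raise ValueError(f"unknown Dataset child node: {child!r}")
    return f"{uri}/{child}"


if __name__ == "__main__":
    print(dataset_uri("http://localhost:8080", "ab/cd/ef/gh/abcdefgh", "licence"))
```

Rejecting unknown child names keeps a typing mistake from silently creating a node outside the documented structure.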
ms21:PartyRelation
PartyRelation is a custom class for describing a user-specified relation between a Party and a Dataset. Instances of PartyRelation in the ResData Fedora 4 model are also defined as pcdm:Object type nodes.
Fedora 4 URI structures for the PartyRelation nodes are:
Description | URL |
Dataset | /rest/datasets/[dataset pairtree id] |
PartyRelation | /rest/datasets/[dataset pairtree id]/partyRelation1 |
ms21:ResourceRelation
ResourceRelation is a custom class for describing user-defined relationships between Dataset resources. Instances of ResourceRelation in the ResData Fedora 4 model are also defined as pcdm:Object type nodes.
Fedora 4 URI structures for the ResourceRelation nodes are:
Description | URL |
Dataset | /rest/datasets/[dataset pairtree id] |
ResourceRelation | /rest/datasets/[dataset pairtree id]/resourceRelation1 |
Activity (vivo:ResearchActivity, pcdm:Object)
The ResearchActivity class from the VIVO ontology is used to define Activity-type resources in ResData. In the Fedora 4 model for ResData, all instances of the ResearchActivity class are also defined as nodes of pcdm:Object type, with a number of data properties containing descriptive metadata and object properties containing references to additional information about a research project, including funding body and affiliation. Figure 3 below illustrates how the pcdm:Object and vivo:ResearchActivity classes are combined to represent Activity-type resources in the ResData Fedora 4 model.
Figure 3: Activity-type resources in Fedora 4 model for ResData
Fedora 4 URI patterns for ResData Activity-type resources are:
Description | URL |
Activity | /rest/activities/[activity pairtree id] |
Funding | /rest/activities/[activity pairtree id]/funding |
Organisation | /rest/activities/[activity pairtree id]/organisation |
Party (foaf:Person, pcdm:Object)
Similar to Dataset and Activity, all Party-type resources are defined as instances of both the Person class from the FOAF ontology and the pcdm:Object class (Figure 4).
Figure 4: ResData Party defined as pcdm:Object
Fedora 4 URI patterns for ResData Party-type resources:
Description | URL |
Party | /rest/parties/[party pairtree id] |
Organisation | /rest/parties/[party pairtree id]/organisation |
ResData
Namespaces
Namespace | URL |
bibo | |
owl | |
ms21 | http://www.unsworks.unsw.edu.au/ontology/preservation-metadata/ |
VITRO-ANDS | |
core | |
foaf | |
pcdm |
...
Note: All classes are derived from existing classes used in RELS-INT and RELS-EXT on Fedora 3.
Classes
unsworksp:collection
Collection is a class describing a group of records. In addition to descriptive metadata, it holds administrative metadata with access information for the records that belong to the collection.
Property | Sub-property of | Range | Note |
unsworksp:hasCollection | | unsworksp:collection | |
unsworksp:record
An individual of the record class represents an intellectual entity such as a thesis, a book or a moving image. It has descriptive metadata in Dublin Core and administrative metadata. It can link to other individuals such as metadata, rights and resource.
Property | Sub-property of | Range | Note |
unsworksp:hasMetadata | | unsworksp:metadata | |
unsworksp:hasRights | | unsworksp:rights | |
unsworksp:hasResource | | unsworks:resource | |
unsworksp:resource
An individual of the resource class represents the electronic resource of the record, such as the PDF file of a thesis. It is stored as binary data and can link to another resource that holds the same content in a different format for preservation purposes. For example, a thesis record may have a binary file in Word format and a second binary file in PDF format that was converted from the Word document.
Property | Sub-property of | Range | Note |
unsworksp:migratedFrom | | | |
unsworksp:metadata
The metadata class describes the metadata of a record. It represents record metadata that is not in Dublin Core format and is stored as binary data. Similar to resource, it can link to another metadata individual of the same type for preservation purposes.
Property | Sub-property of | Range | Note |
unsworksp:migratedFrom |
...
unsworksp:rights
An individual of the rights class represents a licence or agreement that the author of the electronic resource has signed. Similar to resource, it can link to another individual of the same type for preservation purposes.
Property | Sub-property of | Range | Note |
unsworksp:migratedFrom | | | |
Descriptive and Administrative Metadata
...
Below is the RELS-INT and RELS-EXT information that will be ported to Fedora 4 as properties of the resource:
Property | Sub-property of | Range | Note |
unsworksp:resourceType | | | |
unsworksp:dunsworkspid | | | |
unsworks:embargodate | | | |
unsworks:embargoRemoved | | | |
owl:SameAs | Alternate URL |
...
For descriptive metadata, each Fedora 4 resource uses the Dublin Core metadata format.
...
Storage: Legacy storage (or Akubra)
The Fedora 4 REST API will be used to migrate from Fedora 3 to Fedora 4. The storage type raises no issues for the migration. The only difference is that in Fedora 4 container nodes are stored in a database, whereas in Fedora 3 objects and datastreams are stored in a file structure.
XML metadata : datastreams
Where possible, metadata will be stored as properties of the relevant node. Metadata in other formats, such as XML (e.g. MODS), will be stored as a binary file (pcdm:File).
XML metadata : inline
The inline XML metadata is descriptive metadata of the resource. It is mapped to properties of the Fedora 4 container node (pcdm:Object).
See Data Model above for more information.
Content models
The default Fedora content models have not been modified.
Datastream types (inline, managed, redirect, and external)
In Fedora 3, the UNSWorks and ResData repositories use only inline and managed datastreams. Inline datastreams are used for descriptive metadata such as DC, MODS and MARCXML. DC metadata can be mapped to properties of the Fedora 4 container node; the other formats will be stored as binary files in Fedora 4 binary nodes. Similarly, all managed datastreams will be stored as Fedora 4 binary nodes (pcdm:File). See the UNSWorks and ResData Data Models for more information.
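The mapping rule in the paragraph above can be expressed as a small decision function. This is a sketch under stated assumptions rather than the migration script itself: the `Datastream` type and the `fedora4_target` function are hypothetical names, and the rule relies on the Fedora 3 control group codes (`X` for inline XML, `M` for managed) and on the datastream ID `DC` for Dublin Core.

```python
from dataclasses import dataclass


@dataclass
class Datastream:
    """Minimal stand-in for a Fedora 3 datastream (hypothetical type)."""
    dsid: str           # datastream ID, e.g. "DC" or "MODS"
    control_group: str  # "X" = inline XML, "M" = managed


def fedora4_target(ds: Datastream) -> str:
    """Return where a Fedora 3 datastream ends up in Fedora 4."""
    if ds.control_group == "X" and ds.dsid == "DC":
        # Inline Dublin Core becomes properties of the container node.
        return "container properties"
    if ds.control_group in ("X", "M"):
        # Other inline XML and all managed datastreams become binaries.
        return "pcdm:File"
    # Redirect ("R") and external ("E") datastreams are not used by
    # UNSWorks or ResData, so they are treated as an error here.
    raise ValueError(f"unexpected control group: {ds.control_group!r}")


if __name__ == "__main__":
    for ds in (Datastream("DC", "X"), Datastream("MODS", "X"), Datastream("PDF", "M")):
        print(ds.dsid, "->", fedora4_target(ds))
```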
Identifiers
The PairTree algorithm is the default method for generating identifiers in Fedora 4. It will be used for both migrated and new objects, which addresses the performance concern of limiting the number of children under a single resource (Performance). The legacy PID will be stored as a property of the node, as mentioned above. For examples, refer to the URL structures in the Data Model section.
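To illustrate how a pairtree layout limits the number of children under one node, the sketch below splits the leading characters of an identifier into short path segments. The segment width of two characters and the depth of four levels are assumptions made for illustration and should be checked against the deployed Fedora 4 configuration; `pairtree_path` is a hypothetical helper, not a Fedora API.

```python
import uuid


def pairtree_path(identifier: str, levels: int = 4, width: int = 2) -> str:
    """Prefix an identifier with `levels` segments of `width` characters.

    Each intermediate node then has a bounded number of possible children
    (at most 256 for two hexadecimal characters).
    """
    if len(identifier) < levels * width:
        raise ValueError("identifier is too short for the requested depth")
    segments = [identifier[i * width:(i + 1) * width] for i in range(levels)]
    return "/".join(segments + [identifier])


if __name__ == "__main__":
    # A random UUID stands in for a minted identifier.
    print("/rest/datasets/" + pairtree_path(str(uuid.uuid4())))
```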
Indexing strategies (GSearch, RI-Search vs. F4 approaches)
Fedora 4 will be integrated with an external triple store, using a JMS message consumer, to support searching with SPARQL.
For installation, refer to:
https://wiki.duraspace.org/display/FEDORA41/External+Triplestore
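Once resources are indexed in the triple store, a client could list Dataset nodes with a SPARQL query along the following lines. This is a sketch under assumptions: the repository host, the SPARQL endpoint URL and both function names are hypothetical, and the query relies only on the pcdm:Object typing and the /rest/datasets/ URI prefix described in the data model.

```python
from urllib.parse import urlencode

PCDM = "http://pcdm.org/models#"


def datasets_query(repo_base: str, limit: int = 10) -> str:
    """Build a SPARQL query selecting pcdm:Object nodes under /rest/datasets/."""
    prefix = repo_base.rstrip("/") + "/rest/datasets/"
    return (
        f"PREFIX pcdm: <{PCDM}>\n"
        "SELECT ?dataset WHERE {\n"
        "  ?dataset a pcdm:Object .\n"
        f'  FILTER(STRSTARTS(STR(?dataset), "{prefix}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def sparql_get_url(endpoint: str, query: str) -> str:
    """Encode the query as a SPARQL protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    q = datasets_query("http://localhost:8080")
    # Hypothetical endpoint of the external triple store.
    print(sparql_get_url("http://localhost:3030/fedora/query", q))
```

Because PartyRelation and ResourceRelation nodes are also pcdm:Object nodes under a dataset path, a real query would likely need an additional type or depth filter to return only top-level Dataset resources.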
Replication/Journaling
N/A
Security policies: XACML
Security policies will be initially handled by the client applications. WebACL and the Fedora 4 Access Roles module will be explored further in future.
OAI-PMH
Fedora 4 OAI-PMH Provider will be used. Refer to the information on this link for installation:
https://wiki.duraspace.org/display/FEDORA41/Setup+OAI-PMH+Provider
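For the planned testing, harvesting requests can be composed from the standard OAI-PMH verbs and arguments. The sketch below only builds request URLs; the endpoint path `/rest/oai` and the helper name `oai_request_url` are assumptions and should be replaced with the values of the actual installation.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def oai_request_url(endpoint: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH GET request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb!r}")
    params = {"verb": verb, **arguments}
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint; oai_dc is the Dublin Core metadata prefix.
    print(oai_request_url("http://localhost:8080/rest/oai",
                          "ListRecords", metadataPrefix="oai_dc"))
```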
Further testing will be done to confirm the status of OAI-PMH support.
Versions
Fedora 4 versioning will be used to store Fedora 3 versions. This will be included in the migration script later.
Disseminators
N/A
Audit history
For migration purposes, the legacy Fedora 3 FOXML will be stored as fedora:Binary (pcdm:File) in Fedora 4. The Fedora 4 Audit module will be used to manage the audit history after further testing.
API
The Fedora 4 REST API will be used to replace the Fedora 3 SOAP and REST APIs.
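As an illustration of the REST approach, a client can create a container by sending an HTTP PUT with a Turtle body to the target URI. The sketch below only constructs the request and does not send it; the target URI, the title value and both function names are hypothetical, and the use of dc:title is an assumption based on the Dublin Core descriptive metadata described earlier.

```python
import urllib.request


def turtle_for_object(title: str) -> str:
    """Return a Turtle body typing the new node as pcdm:Object with a title."""
    # Simplified escaping: only backslashes and double quotes are handled.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "@prefix pcdm: <http://pcdm.org/models#> .\n"
        "@prefix dc: <http://purl.org/dc/elements/1.1/> .\n"
        "<> a pcdm:Object ;\n"
        f'   dc:title "{escaped}" .\n'
    )


def build_create_request(uri: str, title: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that would create the node."""
    body = turtle_for_object(title).encode("utf-8")
    return urllib.request.Request(
        uri, data=body, method="PUT",
        headers={"Content-Type": "text/turtle"},
    )


if __name__ == "__main__":
    req = build_create_request(
        "http://localhost:8080/rest/datasets/ab/cd/ef/gh/abcdefgh",  # hypothetical
        "Example dataset",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch can be run without a repository.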