...

  1. It supports the creation and management of digital content objects (from this point on called digital objects) that can aggregate data from multiple sources. For example, a digital object might be a set of TIFF images that are the individual page images of a scanned document. The data sources may be either locally managed within the Fedora software or sourced from another URL-accessible network server. The data sources may be content or metadata. You may think of these digital objects as advanced digital documents, especially in light of the feature described next.

  2. It supports the association of web services with the digital objects. These services typically consume the data packaged within the digital object to produce dynamic disseminations from the digital object. For example, the digital object described above with multiple TIFF page images may be associated with a service that OCRs the images that are components of the digital object and disseminates an HTML version of the pages. The services may be either local to the machine hosting the Fedora server or sourced from another network-accessible server that is addressable via a URL. In this manner, Fedora acts as a mediation layer that coordinates local and distributed data and web services within a uniform framework. This is illustrated in Figure 1.

  3. It provides uniform web-based access interfaces to these digital objects, through REST requests and more powerful SOAP-based methods. These interfaces consist of a set of built-in methods to access characteristics common to all digital objects, such as key metadata and internal structure. These include a method to introspect on an object to reveal the set of methods that constitute the extended behavior of that object. For example, a client could use these built-in methods to "learn" about the capability of the digital object described above to dynamically disseminate an HTML page from a set of TIFF images.
    The benefits of these are two-fold:

    1. Clients accessing Fedora digital objects can rely on uniform access regardless of the nature of the object.

    2. The disseminations available from an object are independent of the internal structure of the object. For example, the client interface of the example above in which HTML is disseminated from a set of source TIFF pages could remain constant regardless of whether the underlying object contained TIFF images, JPEG, PDF, or even simple static HTML. This gives the content developer great freedom to modify a repository's internals without disrupting the client and user views of the content.

  4. It presents a uniform and powerful REST and SOAP-based management interface. All internal operations of the repository, such as object creation and management, are available through these APIs, providing the hooks for integrating Fedora into a variety of environments. This makes Fedora useful as the foundation for advanced content management applications.

  5. It includes a comprehensive versioning framework that tracks the evolution of objects and provides access to earlier versions.

  6. It includes a basic relationship framework for representing the links among digital objects.

  7. It supports ingest and export of digital objects in a variety of XML formats. This enables interchange between Fedora and other XML-based applications and facilitates archiving tasks.

...

  1. Digital Object – This is the basic unit for information aggregation in Fedora. At a minimum a digital object has:
    1. An identifier or PID (Persistent Identifier). The PID provides the key by which the digital object is accessed from the repository.
    2. Dublin Core metadata that provides a basic description of the digital object.
  2. Datastream – A component of a digital object that represents a data source. A digital object may have just the basic Dublin Core Datastream, or any number of additional Datastreams. Each Datastream can be any MIME-typed data or metadata, and its content can either be managed locally in the Fedora repository or held by some external data source and referenced by a URL. When you create a new Datastream in a digital object, you assign it to one of four types, or control groups, depending on the nature of the data that it represents.
    1. Managed Content (M): Datastream content is stored and managed within the Fedora repository's persistent storage. The content can be any MIME type including XML.
    2. Inline XML (X): A special case of M, restricted to well-formed XML. In this case, the Datastream content is stored as part of the XML structure of the digital object itself and is thus included when the digital object is exported (e.g., for archival purposes).
    3. Externally Referenced (E): Datastream content is external to the Fedora repository and is referenced by a URL that is recorded within the digital object. The content can be any MIME type including XML.
    4. Redirected Content (R): Like E, but Datastream content is delivered to the client without any mediation by Fedora; i.e., via an HTTP redirect. You should use this Datastream type when the external content is a web page with relative links or it is streaming audio or video. The content can be any MIME type including XML.

Decisions about what to include in a digital object and how to configure its Datastreams are basic modeling choices as you develop your repository. The examples in this tutorial demonstrate some common models that you may find useful as you develop your application.
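
As a rough illustration of the modeling concepts above, the following Python sketch represents a digital object as a PID plus a set of Datastreams, each tagged with one of the four control groups. It is not Fedora's internal data model or API: the class names, field names, the PID demo:example, and the example URL are all assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ControlGroup(Enum):
    """The four Datastream control groups, keyed by their one-letter codes."""
    MANAGED = "M"      # content stored in the repository's persistent storage
    INLINE_XML = "X"   # well-formed XML stored inside the object's own XML
    EXTERNAL = "E"     # content referenced by URL and mediated by Fedora
    REDIRECT = "R"     # content referenced by URL and delivered via HTTP redirect


@dataclass
class Datastream:
    ds_id: str
    control_group: ControlGroup
    mime_type: str
    location: str = ""  # hypothetical field: import path or URL


@dataclass
class DigitalObject:
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)

    def add(self, ds: Datastream) -> None:
        self.datastreams[ds.ds_id] = ds

    def stored_outside_repository(self) -> List[str]:
        """IDs of Datastreams whose content lives outside the repository (E or R)."""
        outside = (ControlGroup.EXTERNAL, ControlGroup.REDIRECT)
        return sorted(d.ds_id for d in self.datastreams.values()
                      if d.control_group in outside)


# Hypothetical object: DC metadata, one managed page image, one redirected video.
obj = DigitalObject(pid="demo:example")
obj.add(Datastream("DC", ControlGroup.INLINE_XML, "text/xml"))
obj.add(Datastream("PAGE1", ControlGroup.MANAGED, "image/tiff"))
obj.add(Datastream("VIDEO", ControlGroup.REDIRECT, "video/mpeg",
                   "http://example.org/clip.mpg"))
print(obj.stored_outside_repository())
```

Writing the model out this way makes the main modeling choice explicit: for each Datastream you decide whether Fedora stores the bytes (M, X) or only a pointer to them (E, R).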

...

  1. You will notice that the Control Group of the DC Datastream is Internal XML Metadata. As explained earlier, Fedora has a number of control group types, of which this is one. This type is appropriate for metadata that is represented in XML, Dublin Core metadata being one example. A digital object can have multiple metadata Datastreams, for example MARC, LOM, Dublin Core, and others.
  2. You can directly edit the Dublin Core metadata – e.g., add new Dublin Core fields – by selecting the Edit button and modifying the contents of the text pane. When you press Save Changes..., Fedora will check that the Datastream is well-formed XML.

You may also create Dublin Core metadata (or any other XML-based metadata) in an external XML editor and use the Import... button to replace the Datastream with this data. When you press Save Changes..., Fedora will check that the Datastream is well-formed XML.
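
If you prepare metadata in an external editor, it can help to check well-formedness yourself before importing. The sketch below uses Python's standard xml.etree.ElementTree parser for this. It is only a local pre-check, not the check Fedora itself performs, and the function name and the sample record are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# A small Dublin Core record (sample values are placeholders).
GOOD_DC = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample title</dc:title>
  <dc:creator>Sample creator</dc:creator>
</oai_dc:dc>"""

# The same record with the closing title tag removed, so it is no longer well-formed.
BAD_DC = GOOD_DC.replace("</dc:title>", "")

print(is_well_formed(GOOD_DC))
print(is_well_formed(BAD_DC))
```

Note that well-formedness only means the XML parses; it does not mean the record is valid against any Dublin Core schema.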

You will notice that there are optional fields on the Datastreams pane for Format URI (to refine the media type meaning with a URI that more precisely identifies the media type) and Alternate Ids to capture any other existing identifiers you would like to associate with a Datastream. We will not be using these in this tutorial.

...

  • Rather than packing multiple XML-based metadata formats in a digital object, it is possible to package a single base metadata format in a digital object (for example, fully qualified Dublin Core) and use that base format as the basis of metadata crosswalks. To do this, one could associate an XSLT engine service (e.g., Saxon) with the digital object. The service processes the base format with an XSL transform document (packaged as a Datastream in another digital object) to derive one or more additional formats.
    In both cases, static and dynamic, disseminations are available via REST or SOAP requests from clients to the Fedora Access service (API-A and API-A-LITE). The nature of the disseminated content – the format of the underlying data, where it is located, and whether it is static or dynamically generated – is invisible from the client perspective. As a result, a repository manager can significantly alter the nature of a digital object and the web services that it uses while maintaining the same interface vis-à-vis the client. Correspondingly, two digital objects with entirely different structure can appear the "same" from the perspective of consuming clients.

...

The web service used in the example performs an XSL transform using the well-known Saxon XSLT processor. This service requires two inputs: an XML source document and an XSL transform document. In this example, both of these XML documents are stored as managed content in a Fedora digital object. The XML source is data for a poem with tags for the structural elements of the poem (stanzas and lines). The XSL transform produces HTML output of the poem that can be viewed in a browser. This example is borrowed from the web-available source for Michael Kay's excellent XSLT book.
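
To make the shape of this transformation concrete, the sketch below turns a small poem document into HTML using plain Python. It is a stand-in for the idea only: it does not use XSLT or Saxon, and the element names (poem, title, stanza, line) and the sample text are assumptions rather than the contents of the tutorial's poem.xml.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical poem source with stanza and line elements.
POEM_XML = """<poem>
  <title>Sample Poem</title>
  <stanza>
    <line>First line of the first stanza</line>
    <line>Second line of the first stanza</line>
  </stanza>
  <stanza>
    <line>Only line of the second stanza</line>
  </stanza>
</poem>"""


def poem_to_html(xml_text: str) -> str:
    """Render each stanza as a paragraph and each line as a row within it."""
    root = ET.fromstring(xml_text)
    title = escape(root.findtext("title", default="Untitled"))
    parts = [f"<html><head><title>{title}</title></head><body>",
             f"<h1>{title}</h1>"]
    for stanza in root.findall("stanza"):
        lines = [escape(line.text or "") for line in stanza.findall("line")]
        parts.append("<p>" + "<br/>".join(lines) + "</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


print(poem_to_html(POEM_XML))
```

In the tutorial itself this mapping is expressed declaratively in the XSL transform document, which is what allows the same service to be pointed at different source and transform Datastreams.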

Ingesting Pre-defined SDef, SDep and CModel Objects

...

You now need to add the two Datastreams: the XML source document and the XSL transform document. Using the same method described in Example 1, select the Datastreams tab and:

...

Example 4 – Modifying Example 3 Using a Redirect Datastream

Example 3 packages the XSL transform Datastream in the same digital object as the source XML Datastream. However, in many cases you will have XSL transform code that you want to share across several XML sources. This section modifies Example 3 to enable this sharing.

This is done by packaging the XSL transform code in a digital object of its own. Then every digital object that needs to make use of the XSL transform code can use the Fedora REST URL to access that Datastream. This is done by defining a redirect Datastream using the REST URL as the redirect target. Then, the same disseminator design used in Example 3 can be reused. This is known as dissemination chaining, whereby the dissemination of one digital object is used by another.

...

  • Create a new digital object (the "XSL" digital object) assigning the PID demo:400. Create one Datastream in addition to the DC with ID XSL. As before, this Datastream should be configured as:
    • ID – XSL
    • Control Group – Managed Content
    • MIME type – text/xml
    • Label – Poem XSL Transform
    • Import location: FEDORA_HOME/userdocs/tutorials/2/example3/poem.xsl
  • Create another digital object (the "disseminator" digital object) assigning the PID demo:500.
  • Create two new Datastreams
    • One configured as follows (the same as the Source Datastream in Example 3):
      1. ID – source
      2. Control Group – Managed Content
      3. MIME type – text/xml
      4. Label – Poem XML Source
      5. Import location: FEDORA_HOME/userdocs/tutorials/2/example3/poem.xml
    • Now create the Datastream that will redirect to the XSL in demo:400 as follows:
      1. ID – xsl
      2. Control Group – Redirect
      3. MIME type – text/xml
      4. Label – Poem XSL Transform
      5. Location: http://localhost:8080/fedora/get/demo:400/XSL
  • On the New RELS-EXT... tab, add the same hasContentModel relationship to demo:ex3CModel as you did in Example 3.
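
The redirect location used in this example has the form base URL, then get, then the PID, then the Datastream ID. The following Python sketch assembles such a location from its parts. It only builds the string and does not contact a repository; the function name and the dictionary layout are assumptions for illustration, and the base URL is the localhost address used in this tutorial.

```python
from urllib.parse import quote


def datastream_url(base_url: str, pid: str, ds_id: str) -> str:
    """Build a URL of the form <base>/get/<PID>/<Datastream ID>.

    The colon in the PID is left unescaped, matching the URL shown in the tutorial.
    """
    return f"{base_url.rstrip('/')}/get/{quote(pid, safe=':')}/{quote(ds_id, safe='')}"


# Location of the shared XSL Datastream held in the "XSL" digital object.
xsl_location = datastream_url("http://localhost:8080/fedora", "demo:400", "XSL")

# Hypothetical summary of the redirect Datastream defined in demo:500.
redirect_datastream = {
    "id": "xsl",
    "control_group": "R",
    "mime_type": "text/xml",
    "label": "Poem XSL Transform",
    "location": xsl_location,
}

print(redirect_datastream["location"])
```

Because every object that shares the transform points at the same location, updating the XSL Datastream in demo:400 changes the output of all of them at once.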

...