Relationships Between Bitstreams

...

There are already relationships between Bitstreams (in DSpace 1.3 and 1.4) but they are not recorded in any formal or obvious way. For example, some of the MediaFilter classes add a derivative Bitstream for each of the "content" Bitstreams, such as a thumbnail image. To associate the thumbnail image with its full-size counterpart, the thumbnail Bitstream is given the same name as its master image with an extra image file extension tacked onto the end. So when the Web UI has the opportunity to display a thumbnail image of a Bitstream, it must know to look for a Bitstream with an extra ".jpg" appended to its name.
This ad hoc solution is not scalable and it violates modularity. The media filter plugin and the Web UI have to share a secret: the formula for naming thumbnail Bitstreams. For another application, e.g. a METS package disseminator, to associate thumbnail images with the original versions, it has to be let in on the secret.

...

There is also the limitation that Bitstream names are actually metadata, and they need not be unique. It is perfectly legal for an Item to have two image Bitstreams named "illustration.gif", but in that case the thumbnail naming convention breaks down.
However, my objection is not so much to the method (magic Bitstream
names), as to the lack of a formal API to enforce the rules and make the
Bitstream-association mechanism available to all other modules.
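
To make the ambiguity concrete, here is a minimal sketch of a name-based thumbnail lookup. The Bs class and both helper methods are hypothetical stand-ins written for this illustration, not DSpace code; only the "append .jpg" formula and the bundle names come from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of name-based thumbnail lookup; not DSpace code. */
class NamingConventionDemo {
    /** Minimal stand-in for a Bitstream: a bundle name and a (non-unique) name. */
    static final class Bs {
        final String bundle;
        final String name;

        Bs(String bundle, String name) {
            this.bundle = bundle;
            this.name = name;
        }
    }

    /** The shared "secret": a thumbnail is named after its master plus ".jpg". */
    static String thumbnailNameFor(String masterName) {
        return masterName + ".jpg";
    }

    /** Find thumbnail candidates for a master using nothing but names. */
    static List<Bs> findThumbnails(List<Bs> item, Bs master) {
        List<Bs> hits = new ArrayList<>();
        String wanted = thumbnailNameFor(master.name);
        for (Bs b : item) {
            if (b.bundle.equals("THUMBNAIL") && b.name.equals(wanted)) {
                hits.add(b);
            }
        }
        return hits;
    }

    /** An Item holding two images with the same name, each with a thumbnail. */
    static List<Bs> sampleItem() {
        List<Bs> item = new ArrayList<>();
        item.add(new Bs("ORIGINAL", "illustration.gif"));
        item.add(new Bs("ORIGINAL", "illustration.gif"));
        item.add(new Bs("THUMBNAIL", "illustration.gif.jpg"));
        item.add(new Bs("THUMBNAIL", "illustration.gif.jpg"));
        return item;
    }

    public static void main(String[] args) {
        List<Bs> item = sampleItem();
        // Two candidates come back, with no way to tell which belongs to which master.
        System.out.println(findThumbnails(item, item.get(0)).size()); // prints 2
    }
}
```

Both masters match both thumbnails, so a caller relying on names alone cannot pair them up.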

The generalization of this mechanism also helps
several projects which need to associate metadata with an individual
Bitstream, as opposed to the entire Item. The
JHOVE integration will need a place to put the
preservation metadata it generates for each content Bitstream. The
CWSpace project has descriptive metadata at the
Bitstream level for "learning objects" within Items. The list goes on.

...

What are the reasons for associating one Bitstream with another?
Here are the ones I've thought of; perhaps you can name others:

...

The first two kinds of relationships are not symmetric: the "derived" and "is metadata about" relationships have clear master/slave roles. The
alternate relationship can be considered symmetric. Note: If one of the
"alternate" formats was created by a preservation operation, that fact
should be recorded in provenance metadata; it does not need to be in the
object model as master/slave roles.

One Bitstream may be the target of multiple relationships – e.g. an
image might have a thumbnail, "full text", and metadata associated with it.
I do not foresee a need for one "subsidiary" Bitstream to express a relationship
to more than one master, however.

Bundles take no part in expressing Bitstream relationships. Although a
Bundle indicates the purpose of its Bitstreams (thumbnails,
derived text, metadata), that fact is orthogonal to the mapping of
a subsidiary Bitstream to its master. For example, an index builder looking
for derived text files would enumerate the Bitstreams in the CONTENT (or
ORIGINAL) Bundle, checking each one for a derived Bitstream in the TEXT
Bundle.

Proposed Solution

Add the following API calls to the Bitstream object:

    /** Constants describing the type of relationship: */

    // This Bitstream was derived from the contents of the related one.
    public static final int REL_TYPE_DERIVED     = 1;

    // This Bitstream is an alternate version of the related one.
    public static final int REL_TYPE_ALTERNATE   = 2;

    // This Bitstream contains descriptive metadata about the related one.
    public static final int REL_TYPE_MD_DESC     = 3;

    // This Bitstream contains administrative metadata about the related one.
    public static final int REL_TYPE_MD_ADM      = 4;

    // This Bitstream contains technical metadata about the related one.
    public static final int REL_TYPE_MD_TECH     = 5;

    // This Bitstream contains provenance metadata about the related one.
    public static final int REL_TYPE_MD_PROV     = 6;

    // This Bitstream contains rights metadata about the related one.
    public static final int REL_TYPE_MD_RIGHTS   = 7;

    /**
     * Establish a relationship with the target Bitstream.  The nature
     * of that relationship is described by setRelationshipType().
     * When relevant, the object executing this method is the "slave"
     * (e.g. the derived one) in the relationship.
     * @param related Bitstream object which is target (master) of relationship.
     */
    public void setRelatedBitstream(Bitstream related);

    /**
     * Get related Bitstream.
     * @return related (master) Bitstream, or null if none has been set.
     */
    public Bitstream getRelatedBitstream();

    /**
     * Set the type of relationship this Bitstream has with the
     * one set in setRelatedBitstream().  See the REL_TYPE_*
     * constants for details.
     * @param rel indicates type of relationship.
     */
    public void setRelationshipType(int rel);

    /**
     * @return type of relationship, or 0 if none has been set.
     */
    public int getRelationshipType();

    /**
     * Get all the Bitstreams with relationships to this one.
     * @return array of related Bitstreams, or an empty array if there are none.
     */
    public Bitstream[] getBitstreamsRelatedToSelf();

    /**
     * Get all the Bitstreams with relationships to this one, and
     * a Bundle name matching the indicated one.
     * @param bundleName bundle name to match.
     * @return array of related Bitstreams, or an empty array if there are none.
     */
    public Bitstream[] getBitstreamsRelatedToSelf(String bundleName);

For example, if you have a "content" Bitstream and want to find out if
it has a thumbnail image, you would call getBitstreamsRelatedToSelf("THUMBNAIL")
and check if the array returned has any elements.
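
A rough in-memory sketch of that thumbnail check follows. The Bitstream class here is a toy stand-in, not the real DSpace class: its bundleName field, its constructor, and the list each master keeps of its subsidiaries are assumptions made only so the example can run. The method names and the REL_TYPE_DERIVED constant are the ones proposed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the proposed API; the storage strategy is an assumption. */
class RelationshipSketch {
    static final int REL_TYPE_DERIVED = 1;

    static final class Bitstream {
        final String name;
        final String bundleName; // assumption: each Bitstream knows its Bundle name
        private Bitstream related;
        private int relationshipType; // 0 means "none set"
        private final List<Bitstream> relatedToSelf = new ArrayList<>();

        Bitstream(String name, String bundleName) {
            this.name = name;
            this.bundleName = bundleName;
        }

        public void setRelatedBitstream(Bitstream newRelated) {
            // Keep the master's reverse list in step with the subsidiary's pointer.
            if (related != null) {
                related.relatedToSelf.remove(this);
            }
            related = newRelated;
            if (newRelated != null) {
                newRelated.relatedToSelf.add(this);
            }
        }

        public Bitstream getRelatedBitstream() { return related; }

        public void setRelationshipType(int rel) { relationshipType = rel; }

        public int getRelationshipType() { return relationshipType; }

        public Bitstream[] getBitstreamsRelatedToSelf() {
            return relatedToSelf.toArray(new Bitstream[0]);
        }

        public Bitstream[] getBitstreamsRelatedToSelf(String wantedBundle) {
            List<Bitstream> hits = new ArrayList<>();
            for (Bitstream b : relatedToSelf) {
                if (b.bundleName.equals(wantedBundle)) {
                    hits.add(b);
                }
            }
            return hits.toArray(new Bitstream[0]);
        }
    }

    /** The check described above: does this content Bitstream have a thumbnail? */
    static boolean hasThumbnail(Bitstream content) {
        return content.getBitstreamsRelatedToSelf("THUMBNAIL").length > 0;
    }

    public static void main(String[] args) {
        // Two masters share a name; only the first gets a thumbnail.
        Bitstream first = new Bitstream("illustration.gif", "ORIGINAL");
        Bitstream second = new Bitstream("illustration.gif", "ORIGINAL");
        Bitstream thumb = new Bitstream("illustration.gif.jpg", "THUMBNAIL");
        thumb.setRelatedBitstream(first);
        thumb.setRelationshipType(REL_TYPE_DERIVED);

        System.out.println(hasThumbnail(first));  // prints true
        System.out.println(hasThumbnail(second)); // prints false
    }
}
```

Because the link is explicit, the duplicate names no longer matter: each thumbnail points at exactly one master.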

Looking for LOM metadata is a more involved application: First you'd
have to get any related Bitstreams in the METADATA bundle, and then
search through those for one of the appropriate type (perhaps indicated by
its BitstreamFormat).
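
The second step of that lookup might look something like the sketch below. It assumes the array passed in is what getBitstreamsRelatedToSelf("METADATA") returned, and it reduces the BitstreamFormat to a plain string; the class, the format names "LOM" and "JHOVE", and the file names are all hypothetical.

```java
/** Hypothetical sketch: pick the LOM record out of a master's related metadata. */
class LomLookupSketch {
    /** Toy Bitstream; formatName stands in for the real BitstreamFormat. */
    static final class Bitstream {
        final String name;
        final String formatName;

        Bitstream(String name, String formatName) {
            this.name = name;
            this.formatName = formatName;
        }
    }

    /**
     * Return the first related Bitstream with the wanted format, or null.
     * The array is assumed to come from getBitstreamsRelatedToSelf("METADATA").
     */
    static Bitstream findByFormat(Bitstream[] related, String wantedFormat) {
        for (Bitstream b : related) {
            if (b.formatName.equals(wantedFormat)) {
                return b;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Bitstream[] metadata = {
            new Bitstream("tech.xml", "JHOVE"), // hypothetical technical metadata
            new Bitstream("lom.xml", "LOM")     // hypothetical LOM record
        };
        Bitstream lom = findByFormat(metadata, "LOM");
        System.out.println(lom == null ? "no LOM metadata" : lom.name);
    }
}
```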

Establishing a relationship is easy. In code that creates a Bitstream
by deriving it from another Bitstream, like a media filter plugin,
it's obvious: call setRelatedBitstream() on the new Bitstream, passing the one from
which it was derived, and then call setRelationshipType(REL_TYPE_DERIVED).

When ingesting a package (SIP) that includes an encoding of relationships
between Bitstreams (e.g. METS), the ingester code should also
make the appropriate calls to setRelatedBitstream() and setRelationshipType()
to mark metadata, derived, and alternate Bitstreams.

The implementation is straightforward. It can be done by adding two
columns to the Bitstream table: one containing a bitstream_id key
(or NULL) to the related Bitstream, and another with an integer for
the relationship type.
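
As a rough illustration of how the two columns would back the API, the sketch below models table rows in memory and performs the reverse lookup that getBitstreamsRelatedToSelf() would need. The field names relatedBitstreamId and relationshipType are placeholders for the two new columns, which the proposal does not name, and a real implementation would query the database rather than scan a list.

```java
import java.util.ArrayList;
import java.util.List;

/** In-memory model of the two proposed columns; column names are hypothetical. */
class BitstreamTableSketch {
    /** One row of the Bitstream table, reduced to the columns relevant here. */
    static final class Row {
        final int bitstreamId;
        final Integer relatedBitstreamId; // null models SQL NULL (no relationship)
        final int relationshipType;       // 0 means no type has been set

        Row(int bitstreamId, Integer relatedBitstreamId, int relationshipType) {
            this.bitstreamId = bitstreamId;
            this.relatedBitstreamId = relatedBitstreamId;
            this.relationshipType = relationshipType;
        }
    }

    /** Reverse lookup: ids of all rows whose related column points at masterId. */
    static List<Integer> relatedToSelf(List<Row> table, int masterId) {
        List<Integer> ids = new ArrayList<>();
        for (Row r : table) {
            if (r.relatedBitstreamId != null && r.relatedBitstreamId == masterId) {
                ids.add(r.bitstreamId);
            }
        }
        return ids;
    }

    static List<Row> sampleTable() {
        List<Row> table = new ArrayList<>();
        table.add(new Row(1, null, 0)); // content Bitstream, no master
        table.add(new Row(2, 1, 1));    // REL_TYPE_DERIVED, e.g. a thumbnail of 1
        table.add(new Row(3, 1, 5));    // REL_TYPE_MD_TECH, technical metadata for 1
        table.add(new Row(4, null, 0)); // another content Bitstream
        return table;
    }

    public static void main(String[] args) {
        System.out.println(relatedToSelf(sampleTable(), 1)); // prints [2, 3]
        System.out.println(relatedToSelf(sampleTable(), 4)); // prints []
    }
}
```

A single nullable key per row is enough because each subsidiary has at most one master, while a master can still be the target of many rows.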

...

I think we need set relationships analogous to the METS fileGrp or div elements. Right now DSpace has the concept of bundles, but those bundles are used to separate derived bitstreams from original/content bitstreams. This means the user has no way of selecting what bundle to place a bitstream in, nor can they create bundles. Everything the user adds to an item is placed in the original/content bundle.

If container relationships are added at the bitstream level there are several possible uses. The first use that comes to mind is relating scanned pages together into chapters, sections, or other groupings. Another possibility is relating together different types of pictures for one physical artifact. There are various use cases, and most depend on the type of artifact being stored.

...

  • Follow Jim's suggestion about N-N relationships
  • Make these relationships interact with bundles; this would allow administrative and technical metadata to be applied to a bundle of bitstreams.
  • Make bundles recursive, so that bundles can contain other bundles.
  • Enable the user to create and modify bundles.

...