The REST API, Fedora resource identifiers, The Resource index, Named Graphs and the semantic web

Note: The scope of this document goes some way beyond the London Committers' meeting agenda item - Interfaces. Comments welcomed on which are the most relevant issues to cover in the meeting.

Fedora in the context of the Semantic web and Linked Data

The basic premise of the proposals below is to support exposing Fedora resources and their relationships in a Semantic Web and Linked Data friendly way. It attempts to "unify" the identifiers and relationships used for Fedora resources with the new REST API and the resource index.

Being Semantic Web and Linked Data friendly involves

  • publishing dereferenceable http URIs for resources
  • publishing relationships between resources using these identifiers

The new REST API is a step towards supporting these requirements, as we now have dereferenceable http URIs for Fedora resources.

What this proposal is not about

  • Implementation of a semweb publishing mechanism
    • "What" a Fedora object/datastream URI identifies depends on how an individual repository is built
      • Compound vs Complex objects
      • Aggregations, collections
  • Not an implementation of OAI-ORE
  • But it provides mechanisms to support these

Current situation

  • Identifiers
    • Fedora resources have identifiers such as namespace:pid and namespace:pid/datastream, and their info:fedora/ URI forms (and similarly for disseminations)
      • These identifiers are effectively scoped to a repository installation
    • The new REST API provides globally dereferenceable http URIs for resources, but these are not "defined" as (canonical) identifiers for resources.
    • The existing "LITE" APIs also provide resolvable URI resource identifiers
  • Relationships
    • The resource index is a single "graph" containing relationships for all objects
    • Relationships must have either a Fedora object or datastream as the subject
      • Limits metadata expression to "flat" schemes such as DC
    • No support for "arbitrary" RDF datastreams in the resource index (eg for implementing additional RDF metadata schemes)
    • Resource identifiers used in relationships are of the info:fedora/ form
      • Difficult to "interpret" relationships outside of the scope of the repository
    • The "specification" of what relationships exist for an object is defined in imperative code

Some Principles

Some basic principles that should be followed by the recommendations below:

  • The definitive source of all information should be in Fedora objects
    • No direct manipulation of the triple store - the triple store is a cache/index
  • The info:fedora URI scheme should be retained
    • supports re-use of objects across different repositories
  • As far as possible Fedora should "work" without using a resource index.

Proposals

Deprecate the "LITE" APIs.

  • Implement by returning HTTP status code 301 (Moved Permanently) in the next release
  • Remove in subsequent release

Define canonical dereferenceable URIs for Fedora resources

  • Using the new REST API URIs

Restructure the resource index as named graphs

A named graph is a set of triples named by a URI.

For instance, the relationships contained in the object myns:somepid could be identified as the graph <#myns:somepid>. Similarly the relationships expressed by the datastream myns:somepid/RELS-EXT could be identified as <#myns:somepid/RELS-EXT>.

Triple query languages such as SPARQL and iTQL support queries across multiple graphs. Using this to query relationships over the repository as a whole would be complex - it would be painful to have to assemble a list of named graphs to query against.

Mulgara is a quad store: relationships are effectively stored as <graph> <subject> <predicate> <object>. Currently all triples are stored in a single <#ri> graph.

Mulgara supports creating models (graphs) that are views of other models (graphs), eg

  • <#myns:somepid/RELS-EXT> containing RELS-EXT relationships
  • <#myns:somepid/DC> containing Dublin Core relationships
  • <#myns:somepid/properties> containing relationships created from object properties
  • <#myns:somepid> as a view defined as the union of the above - this view "contains" all of the relationships for the object
  • A view could be created for all relationships in the repository, as a union of all individual object views. This would be equivalent to the current <#ri> graph.

Thus, a hierarchy of named graphs could be created, for example:

  • <#ri> - a view containing:
    • <#some:pid> - object graph for some:pid, a view containing:
      • <#some:pid/properties> - graph containing object properties
      • <#some:pid/datastreams> - a view containing:
        • <#some:pid/datastreams/rels-ext> - graph containing triples from rels-ext
        • <#some:pid/datastreams/rels-int> - graph containing triples from rels-int
        • <#some:pid/datastreams/dc> - graph containing triples from DC
        • <#some:pid/datastreams/{rdf datastream}> - graph containing triples from some other rdf datastream
        • <#some:pid/datastreams/{dsid}/properties> - graph containing properties of datastream {dsid} (state, last modified, etc)
    • <#some:otherpid> - object graph for some:otherpid, a view containing:
      • <#some:otherpid/properties> - etc
      • <#some:otherpid/datastreams> - etc
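The hierarchy above can be illustrated with a toy in-memory model. This is only a sketch of the idea of views as unions of named graphs, not Mulgara's API: the `QuadStore` class, its method names and the example predicates are all assumptions made for illustration.

```python
from collections import defaultdict


class QuadStore:
    """Toy store: concrete graphs hold triples, views name other graphs."""

    def __init__(self):
        self.graphs = defaultdict(set)  # graph name -> set of (s, p, o)
        self.views = {}                 # view name -> list of member names

    def add(self, graph, s, p, o):
        self.graphs[graph].add((s, p, o))

    def define_view(self, name, members):
        self.views[name] = list(members)

    def triples(self, name):
        """Resolve a graph or view name to the union of its triples."""
        if name in self.views:
            result = set()
            for member in self.views[name]:
                result |= self.triples(member)
            return result
        return set(self.graphs.get(name, set()))


store = QuadStore()
obj = "info:fedora/some:pid"

# Concrete (leaf) graphs, one per source of triples. Predicates are made up.
store.add("#some:pid/properties", obj, "ex:state", "Active")
store.add("#some:pid/datastreams/rels-ext", obj, "ex:isMemberOf",
          "info:fedora/some:collection")
store.add("#some:pid/datastreams/dc", obj, "dc:title", "An example title")

# Views, built bottom-up to mirror the hierarchy.
store.define_view("#some:pid/datastreams",
                  ["#some:pid/datastreams/rels-ext", "#some:pid/datastreams/dc"])
store.define_view("#some:pid", ["#some:pid/properties", "#some:pid/datastreams"])
store.define_view("#ri", ["#some:pid"])

for triple in sorted(store.triples("#ri")):
    print(triple)
```

Querying `#ri` here returns every triple for every object, while querying `#some:pid/datastreams/dc` returns only the Dublin Core triples for one object, which is the granularity the hierarchy is meant to offer.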

All graphs should be "rooted" in the above structure, there should be no means of creating graphs other than by creating objects and datastreams.

Why do this?

When an object is created (updated, deleted), the object's relationships are propagated to the triple store. If two objects are created expressing an identical relationship, a single triple will be created in the resource index. If one of those objects is then deleted, the triple will be deleted from the triple store even though it is still being asserted by another object. The resource index will not be an accurate reflection of the triples in the repository.

Hence the current restrictions on RELS-EXT and RELS-INT that subjects must be the Fedora object or datastreams from the containing object, to prevent two objects asserting the same relationship.

With named graphs, relationships created by different objects would be in different graphs. Deleting one object would remove the graph for that object - but the graph for a different object asserting the same relationship would remain - the resource index would be an accurate reflection of the triples in the repository.

Therefore this would support indexing of arbitrary RDF metadata datastreams in the resource index - for instance supporting metadata schemes that are not "flat"
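The deletion problem described above can be shown in a few lines. This is a minimal sketch using plain Python sets and dicts rather than a real triple store; the object names `demo:a` and `demo:b`, the graph names and the `ex:isMemberOf` predicate are hypothetical.

```python
# One relationship that two different objects both assert.
REL = ("info:fedora/demo:item", "ex:isMemberOf", "info:fedora/demo:collection")

# Single graph: the index is a set of triples, so a repeat is stored once.
single_graph = set()
single_graph.add(REL)      # asserted by object demo:a
single_graph.add(REL)      # asserted again by object demo:b (no effect)
single_graph.discard(REL)  # demo:a is deleted, so its triples are removed
still_in_single = REL in single_graph  # lost, although demo:b asserts it

# Named graphs: one graph per asserting object.
named_graphs = {
    "#demo:a": {REL},
    "#demo:b": {REL},
}
del named_graphs["#demo:a"]  # deleting demo:a drops only its own graph
union_view = set().union(*named_graphs.values())
still_in_named = REL in union_view  # kept, because demo:b's graph remains

print("single graph keeps the triple:", still_in_single)
print("named graphs keep the triple:", still_in_named)
```

The single-graph case is exactly what the current subject restriction on RELS-EXT and RELS-INT is designed to avoid; with one graph per object the restriction would no longer be needed for index correctness.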

Questions and issues

  • The graph hierarchy to use - how granular? Start with something simple?
  • Mapping between resource identifiers and graph names
  • Separation of "core" relationships from "user-defined" relationships into different overall views? If the intention of the resource index is to store relationships between objects, we may not want to pollute that with other relationships, eg from arbitrary RDF datastreams
    • Relationships about the object and its datastreams - in <#ri>
    • Relationships from RELS-EXT, RELS-INT, DC - in <#ri>
    • Relationships from arbitrary RDF datastreams/disseminators - in <#riUser>
    • <#riFull> as a union of <#ri> and <#riUser>
  • Performance. Need to evaluate query performance over a network of named graphs vs storing all relationships in one single graph
  • Triple store support: Mulgara supports named graphs and views, what about other triple stores? MPTStore?
  • Impact on Mulgara's free-text index, do we create a parallel structure of free text graphs? Does Mulgara even support this?

Declarative specification of triples to create in the resource index

Triples are currently created for

  • object properties
  • datastream properties
  • reserved datastreams that contain RDF (RELS-*)
  • reserved datastreams are translated to RDF (DC)
  • relationships between objects and their datastreams and disseminators
  • relationships between objects and their content models

The "specification" of what triples get generated is largely in imperative Java code, both in terms of the individual triples and which datastreams generate triples.

In the future we may wish to allow creation of triples from

  • arbitrary RDF datastreams
  • arbitrary XML datastreams to be "lifted" to triples
  • disseminators serving RDF

To support a flexible and extensible approach, we could define the generation of triples using content models (system and user) and a declarative approach for specifying triples (XSLT, GRDDL [1]).

  • System content model disseminators for generating RDF for
    • Object and datastream properties triples (from the object's serialisation/FOXML)
    • Relationships between objects, datastreams and disseminators (from the object's serialisation/FOXML)
    • XML datastreams (DC)
  • User content models specifying
    • Additional arbitrary RDF datastreams to index
    • RDF disseminators to index
    • Conversion patterns for other XML datastreams and disseminators

Updating of the resource index could then take place by querying the disseminations and datastreams specified by the system and user content models when an object is created, updated or deleted.
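As a rough illustration of what "declarative" could mean here, the sketch below lifts a Dublin Core XML datastream to triples using a rule table instead of hard-coded logic. It deliberately swaps in a plain Python dictionary for the XSLT or GRDDL transformation proposed above, purely to keep the example short; the `LIFTING_RULES` table, the `lift` function and the sample record are assumptions, not part of Fedora.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Declarative part: XML element name -> RDF predicate URI (hypothetical rules).
# A content model could carry a table like this instead of Java code.
LIFTING_RULES = {
    f"{{{DC_NS}}}title": DC_NS + "title",
    f"{{{DC_NS}}}creator": DC_NS + "creator",
}


def lift(subject_uri, xml_text, rules):
    """Return (subject, predicate, literal) triples for elements named in rules."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for element in root.iter():
        predicate = rules.get(element.tag)
        if predicate and element.text and element.text.strip():
            triples.append((subject_uri, predicate, element.text.strip()))
    return triples


# Made-up sample datastream content.
DC_XML = """
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example title</dc:title>
  <dc:creator>A. N. Author</dc:creator>
  <dc:identifier>demo:1</dc:identifier>
</oai_dc:dc>
"""

for triple in lift("info:fedora/demo:1", DC_XML, LIFTING_RULES):
    print(triple)
```

Elements without a rule (here `dc:identifier`) produce no triples, so changing what gets indexed means editing the rule table rather than the code.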

[1] GRDDL is a mechanism for Gleaning Resource Descriptions from Dialects of Languages. It is a technique for obtaining RDF data from XML documents and in particular XHTML pages: GRDDL Primer http://www.w3.org/TR/grddl-primer/

Questions and Issues

  • How to define the above using system and user content models
  • How to specify the mapping between XML and RDF

Extend the REST API to incorporate relationships

The REST API does not currently implement methods for disseminating and managing relationships.

API methods should be implemented for querying and managing relationships. For example

  • GET /objects/{pid}/relationships - return RDF for all relationships for the object
  • GET /objects/{pid}/datastreams/DC/relationships - return RDF for the DC datastream

Alternatives to explicit "relationships" URIs could be

  • Use content negotiation, eg Accept application/rdf+xml and use the existing REST URIs
  • Use a "format" URL query string, eg format=rdf
  • Or both...
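To make the three options concrete, the following sketch builds the request a client would send under each of them. None of these endpoints exist yet; the base URL, the `relationships_request` helper and the style names are assumptions for illustration, and only the URL shapes and the Accept header come from the proposal above.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:8080/fedora"  # assumed repository base URL


def relationships_request(pid, dsid=None, style="explicit"):
    """Return (url, headers) for fetching relationships as RDF.

    style is one of:
      "explicit" - a dedicated /relationships URI
      "conneg"   - the existing REST URI plus an Accept header
      "format"   - the existing REST URI plus a format=rdf query string
    """
    url = f"{BASE}/objects/{quote(pid, safe=':')}"
    if dsid:
        url += f"/datastreams/{quote(dsid)}"
    headers = {}
    if style == "explicit":
        url += "/relationships"
    elif style == "conneg":
        headers["Accept"] = "application/rdf+xml"
    elif style == "format":
        url += "?" + urlencode({"format": "rdf"})
    else:
        raise ValueError(f"unknown style: {style}")
    return url, headers


for style in ("explicit", "conneg", "format"):
    print(style, relationships_request("demo:1", dsid="DC", style=style))
```

Content negotiation keeps a single URI per resource, which fits the Linked Data goal of one canonical identifier, while the explicit and query-string forms are easier to try from a browser.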

Modifications could be specified by

  • POST a set of triples to create new ones
  • DELETE a set of triples to be deleted
  • PUT a set of modifications to perform, eg using (a subset of) SPARQL Update [1]

Additionally, or alternatively, "writeable disseminators" could be provided as a generic mechanism to implement this, eg PUT a SPARQL Update to /objects/{pid}/methods/{sDefPid}/relationships?datastream=RELS-EXT
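A request body for such a PUT might be assembled as below. This is a sketch under assumptions: it uses the `INSERT DATA` and `DELETE DATA` forms and the `;` separator as found in SPARQL 1.1 Update, which may differ in detail from the subset Fedora would eventually accept, and the `ntriple` and `sparql_update` helpers are hypothetical names.

```python
def ntriple(s, p, o):
    """Serialise one triple whose subject, predicate and object are all URIs."""
    return f"<{s}> <{p}> <{o}> ."


def sparql_update(insert=(), delete=()):
    """Build an update body: deletions first, then insertions."""
    parts = []
    if delete:
        lines = "\n".join("  " + ntriple(*t) for t in delete)
        parts.append("DELETE DATA {\n" + lines + "\n}")
    if insert:
        lines = "\n".join("  " + ntriple(*t) for t in insert)
        parts.append("INSERT DATA {\n" + lines + "\n}")
    return " ;\n".join(parts)


MEMBER_OF = "info:fedora/fedora-system:def/relations-external#isMemberOf"

# Move demo:1 from one (made-up) collection to another.
body = sparql_update(
    delete=[("info:fedora/demo:1", MEMBER_OF, "info:fedora/demo:oldCollection")],
    insert=[("info:fedora/demo:1", MEMBER_OF, "info:fedora/demo:newCollection")],
)
print(body)
```

Because the body names full triples, the server can apply it to the RELS-EXT datastream of the object itself and let the resource index follow, in line with the principle that the triple store is only a cache.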

All of the relationship API methods should operate directly on Fedora objects to remove dependency on the resource index - relationship GET methods should query the object directly rather than issuing RI queries.

[1] SPARQL Update - A language for updating RDF graphs: http://www.w3.org/Submission/SPARQL-Update/

Questions and issues

  • REST endpoints to use - explicit relationships URIs vs content negotiation vs URL query string
  • Relationships update specification (SPARQL Update, or ...)
  • Supporting "generic" updates, eg repository-wide relationships methods and methods operating on an object as a whole
    • Subject and predicate can be used to determine what to update for object properties, datastream properties, Dublin Core
    • RELS-EXT, RELS-INT and arbitrary datastreams present a challenge. A triple with a Fedora object as a subject could be stored in RELS-EXT or in an arbitrary RDF datastream. Do we restrict fedora-model and fedora-system predicates to RELS-EXT and RELS-INT?
  • Supporting updates to XML datastreams that get converted to RDF
    • eg updating DC through relationship API methods

Support for dereferenceable http URI resource identifiers in relationships

Fedora resources are currently identified using the info:fedora namespace. If resource identifiers are exposed as dereferenceable http URIs using the REST API URIs, it would be useful to support these identifiers in relationships, ie the ability to query and manipulate relationships using both the info:fedora namespace for Fedora resources and the http REST URIs.

REST API

  • Provide the ability to query and manipulate relationships using the REST API http URIs
    • Maybe a URL query string parameter? scope=local for info:fedora, scope=global for http URIs?

RISearch

  • Query using either info:fedora URIs or the REST API http URIs
  • Return results using either info:fedora URIs or the REST API http URIs
  • Some form of query re-writing and result set rewriting?
  • RISearch query string parameter to determine the form of identifiers to use?
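One possible shape for the query rewriting mentioned above is a pair of mapping functions between the two identifier forms, applied to URIs in a query before it reaches the triple store. This is only a sketch: the base URL, the function names and the assumed REST URI layout (`/objects/{pid}` and `/objects/{pid}/datastreams/{dsid}`) are illustrative, and disseminations are not handled.

```python
import re

BASE = "http://example.org/fedora"  # assumed REST base URL of one repository


def info_to_http(uri, base=BASE):
    """Map info:fedora/{pid}[/{dsid}] to the assumed REST URI form."""
    prefix = "info:fedora/"
    if not uri.startswith(prefix):
        return uri  # not a Fedora identifier, leave untouched
    pid, _, dsid = uri[len(prefix):].partition("/")
    http = f"{base}/objects/{pid}"
    return f"{http}/datastreams/{dsid}" if dsid else http


def http_to_info(uri, base=BASE):
    """Map an assumed REST URI back to its info:fedora form."""
    prefix = f"{base}/objects/"
    if not uri.startswith(prefix):
        return uri  # a URI from elsewhere, leave untouched
    pid, _, rest = uri[len(prefix):].partition("/")
    if rest.startswith("datastreams/"):
        return f"info:fedora/{pid}/{rest[len('datastreams/'):]}"
    return f"info:fedora/{pid}"  # anything else collapses to the object


def rewrite_query(query, base=BASE):
    """Rewrite every <REST URI> in a query string to <info:fedora/...>."""
    pattern = re.compile(r"<(" + re.escape(base) + r"/objects/[^>]+)>")
    return pattern.sub(lambda m: f"<{http_to_info(m.group(1), base)}>", query)


query = ("select $member from <#ri> "
         "where $member <ex:isMemberOf> <http://example.org/fedora/objects/demo:coll>")
print(rewrite_query(query))
print(info_to_http("info:fedora/demo:1/DC"))
```

Rewriting at the API boundary keeps info:fedora as the stored form, which preserves the principle that objects remain reusable across repositories, while clients see http URIs scoped to the repository they are talking to.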

A Spanner (Wrench) in the works...

Fedora repositories generally sit behind some form of user interface application.

These applications will (in some cases) expose their own URLs for accessing Fedora resources

Should we instead be providing mechanisms to support exposing these URLs as the canonical http URIs for Fedora resources?