Services on linked data
LD4L Workshop Breakout Session, Tuesday, February 24
Risk of not knowing what to search for may be addressed by:
- Providing discovery endpoints and describing what they hold
- ‘hardened’ SPARQL endpoints may be less prone to down time – e.g., Fuseki documentation states that "authentication and control of the number of concurrent requests can be added using an Apache server"
- Publishing starting points, example queries and/or canned responses, and standard extracts may help
- emulate Social Explorer (http://socialexplorer.com) as a way to query the contents of a larger data source (in that case, census data)
- the linked data fragments technology (http://linkeddatafragments.org) may facilitate hosting linked data without the server-side overhead and risk of a public SPARQL endpoint (see the client sketch after this list)
- VIVO/Vitro 'rich export' – augmenting standard linked data responses with standard queries
- e.g., get all a person's publications from a single request rather than client having to issue multiple requests
- Semantic Web crawling, leveraging HTML web crawler experience
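Roughly what a linked data fragments client request could look like; a minimal sketch assuming a server that follows the common triple-pattern-fragments URL template from linkeddatafragments.org (the endpoint URL and the example VIAF URI are illustrative):

```python
# Sketch: ask a Triple Pattern Fragments server for triples matching one pattern.
# The endpoint URL is hypothetical; the subject/predicate/object parameters follow
# the common template described at linkeddatafragments.org.
import urllib.parse
import urllib.request

FRAGMENTS_ENDPOINT = "http://example.org/fragments/catalog"  # hypothetical endpoint

def fetch_fragment(subject=None, predicate=None, obj=None):
    """Fetch one page of the fragment matching the given triple pattern."""
    pattern = {"subject": subject, "predicate": predicate, "object": obj}
    query = urllib.parse.urlencode({k: v for k, v in pattern.items() if v})
    request = urllib.request.Request(FRAGMENTS_ENDPOINT + "?" + query,
                                     headers={"Accept": "text/turtle"})
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")  # triples plus paging controls

# One cheap request per pattern keeps the load on the server predictable.
print(fetch_fragment(subject="http://viaf.org/viaf/12345"))  # illustrative URI
```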
Synchronizing harvested information
- Risk of harvested or aggregated information going out of sync
- The ResourceSync standard addresses the need to repeatedly synchronize and update
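A minimal sketch of the synchronization side, assuming a ResourceSync resource list published as a sitemap at a hypothetical URL; only the standard sitemap loc/lastmod elements are used here:

```python
# Sketch: read a ResourceSync resource list (a sitemap) and report which resources
# changed since the last harvest. The list URL is hypothetical.
import urllib.request
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
RESOURCE_LIST = "http://example.org/dataset/resourcelist.xml"  # hypothetical
last_harvest = datetime(2015, 2, 1, tzinfo=timezone.utc)

with urllib.request.urlopen(RESOURCE_LIST) as response:
    root = ET.fromstring(response.read())

for entry in root.iter(SITEMAP_NS + "url"):
    loc = entry.findtext(SITEMAP_NS + "loc")
    lastmod = entry.findtext(SITEMAP_NS + "lastmod")
    if loc and lastmod:
        changed = datetime.fromisoformat(lastmod.replace("Z", "+00:00"))
        if changed > last_harvest:
            print("re-harvest:", loc)  # fetch and update the local copy here
```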
Desire to be able to query on different axes
- e.g., query OCLC Works by VIAF identifier to get a list of works by that author
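One way that kind of query could be expressed, sketched with SPARQLWrapper; the endpoint URL, the schema.org property path, and the VIAF URI are assumptions for illustration, not OCLC's actual interface:

```python
# Sketch: list works by a VIAF-identified author. The endpoint, the schema.org
# modelling, and the VIAF URI are illustrative assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://example.org/sparql")  # hypothetical aggregator endpoint
sparql.setQuery("""
    PREFIX schema: <http://schema.org/>
    SELECT ?work ?title WHERE {
        ?work schema:creator ?author ;
              schema:name    ?title .
        ?author schema:sameAs <http://viaf.org/viaf/12345> .
    }
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["work"]["value"], row["title"]["value"])
```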
Reconciliation services
- not necessarily centralized or monopolies
- would work best in an iterative mode, with curation and provenance to manage differences of opinion (or evidence)
- incorporate feedback from users
- need protocols – could leverage a common API for reconciliation building on the OpenRefine API: specify as much metadata as you have, get ranked results back (see the sketch after this list)
- surface (publish) the results
- sameAs
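A sketch of what such a call could look like in the OpenRefine reconciliation-API style; the service URL is hypothetical, and the request and response shapes follow that API as commonly implemented (a batch of named queries in, ranked candidates out):

```python
# Sketch: one reconciliation query in the OpenRefine reconciliation-API style.
# The service URL is hypothetical; the "queries" request parameter and the ranked
# "result" list follow that API's usual shape.
import json
import urllib.parse
import urllib.request

SERVICE = "http://example.org/reconcile"  # hypothetical reconciliation service

queries = {
    "q0": {
        "query": "Twain, Mark",
        "type": "Person",                              # as much metadata as you have...
        "properties": [{"pid": "birthDate", "v": "1835"}],
        "limit": 3,
    }
}
payload = urllib.parse.urlencode({"queries": json.dumps(queries)}).encode("utf-8")
with urllib.request.urlopen(SERVICE, data=payload) as response:
    results = json.loads(response.read())

for candidate in results["q0"]["result"]:              # ...ranked results back
    print(candidate["score"], candidate["id"], candidate["name"])
```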
Validation
- W3C RDF Data Shapes Working Group (see the validation sketch after this list)
- DCMI tutorial on RDF validation
- Linked data needs mashup tools that test connections and illustrate bringing data together
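As one concrete (if later) illustration of the data-shapes idea, a minimal sketch using the SHACL vocabulary and the pyshacl library; the shape and the data are made up, and both the vocabulary and the library post-date these notes:

```python
# Sketch: validate a small data graph against a SHACL shape with pyshacl.
# The shape requires every foaf:Person to have at least one foaf:name.
from rdflib import Graph
from pyshacl import validate

shapes = Graph().parse(data="""
    @prefix sh:   <http://www.w3.org/ns/shacl#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix ex:   <http://example.org/> .
    ex:PersonShape a sh:NodeShape ;
        sh:targetClass foaf:Person ;
        sh:property [ sh:path foaf:name ; sh:minCount 1 ] .
""", format="turtle")

data = Graph().parse(data="""
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix ex:   <http://example.org/> .
    ex:author1 a foaf:Person .            # missing foaf:name, so it should fail
""", format="turtle")

conforms, _report_graph, report_text = validate(data, shacl_graph=shapes)
print(conforms)
print(report_text)
```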
Ontology extension mechanisms
- extensions being proposed and managed on GitHub
Ability to push bookmarks
- Small graphs of data, consumable by others
- Push bookmarks to a platform similar to Mendeley but not limited to bibliographic material
- A service where I can push the results of my search, organized by topic; a sort of Mendeley, but for everything
- Add things to a collection I have
- Similar to an annotation service (a sketch follows this list)
- You search, you refine it, you step back — now you can only save bookmarks at one level
- Nobody can use your web bookmarks now
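A sketch of what "push a bookmark" could look like against an annotation-style service, using the W3C Web Annotation model (which was only later finalized); the container URL and the bookmarked URI are illustrative:

```python
# Sketch: push one "bookmark" as a Web Annotation to an annotation container.
# The container URL is hypothetical; the JSON-LD follows the Web Annotation model
# (an Annotation whose target is the bookmarked resource, tagged with a topic).
import json
import urllib.request

CONTAINER = "http://example.org/annotations/"   # hypothetical annotation container

bookmark = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "motivation": "bookmarking",
    "target": "http://viaf.org/viaf/12345",      # illustrative URI being bookmarked
    "body": {"type": "TextualBody", "value": "linked data authors", "purpose": "tagging"},
}
request = urllib.request.Request(
    CONTAINER,
    data=json.dumps(bookmark).encode("utf-8"),
    headers={"Content-Type": "application/ld+json"},
    method="POST",
)
with urllib.request.urlopen(request) as response:
    print(response.headers.get("Location"))      # URI of the newly created bookmark
```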
2
a tool that would facilitate entity reconciliation
to put together UN and LC
a first pass, then improve that manually, then 2nd iteration
then publish — surface
manage difference of opinion
provenance
exclude some
centralized entity mapping
feedback by users on the mapping
need protocols
want to discover annotation — known servers with protocols
collections have been done by many different places
if we do linked data, my list is a list of URIs from many sources
in the UI you won't see that
assuming accessible SPARQL endpoints
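A minimal sketch of the "publish and surface, with provenance" step using rdflib: the sameAs link itself plus a reified copy recording who asserted it and when, so disagreements can be kept and filtered rather than silently merged; all URIs (including the LC and UN-side identifiers) are illustrative:

```python
# Sketch: publish a reconciled owl:sameAs link plus a reified copy recording who
# asserted it and when. All URIs are illustrative.
from rdflib import BNode, Graph, Literal, Namespace, URIRef
from rdflib.namespace import OWL, RDF, XSD

PROV = Namespace("http://www.w3.org/ns/prov#")

g = Graph()
lc_name = URIRef("http://id.loc.gov/authorities/names/n79021164")   # illustrative LC URI
un_term = URIRef("http://example.org/unbis/concept/12345")          # illustrative UN URI

g.add((lc_name, OWL.sameAs, un_term))                               # the first-pass mapping

assertion = BNode()                                                  # reified copy, for provenance
g.add((assertion, RDF.type, RDF.Statement))
g.add((assertion, RDF.subject, lc_name))
g.add((assertion, RDF.predicate, OWL.sameAs))
g.add((assertion, RDF.object, un_term))
g.add((assertion, PROV.wasAttributedTo, URIRef("http://example.org/agents/reconciler")))
g.add((assertion, PROV.generatedAtTime,
       Literal("2015-02-24T00:00:00Z", datatype=XSD.dateTime)))

print(g.serialize(format="turtle"))
```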
3
other cleanup tasks — validation? consistency of ontology use
entity recognition — text mining or analytics tools — autotaggers
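One illustrative shape for an autotagger pass, sketched with spaCy (not a tool named in the notes); entities found in free text become candidate tags to reconcile against an authority file:

```python
# Sketch: run named-entity recognition over free text and keep person/organization
# mentions as candidate tags. spaCy is used only as an illustrative NER tool.
import spacy

nlp = spacy.load("en_core_web_sm")   # small general-purpose English model
text = ("Mark Twain's papers were acquired by the university library "
        "from the Mark Twain Foundation in 1949.")

candidates = [(ent.text, ent.label_) for ent in nlp(text).ents
              if ent.label_ in {"PERSON", "ORG"}]
print(candidates)   # candidate tags to reconcile against an authority file
```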
4
constantly crawling graphs of linked data
semantically aware web crawling — is it worth going down this path, what’s attached, what has changed
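A toy sketch of a semantically aware crawl step with rdflib: dereference a URI as RDF, then queue whatever it points to via owl:sameAs or rdfs:seeAlso; the starting URI is illustrative:

```python
# Sketch: a tiny "semantically aware" crawl: dereference a URI as RDF, then queue
# whatever it links to via owl:sameAs or rdfs:seeAlso.
from rdflib import Graph
from rdflib.namespace import OWL, RDFS

def crawl(start_uri, max_resources=10):
    seen, queue = set(), [start_uri]
    while queue and len(seen) < max_resources:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.add(uri)
        g = Graph()
        try:
            g.parse(uri)                               # content-negotiate for RDF
        except Exception:
            continue                                   # not RDF or unreachable: skip
        print(uri, len(g), "triples")
        for pred in (OWL.sameAs, RDFS.seeAlso):
            for _, _, obj in g.triples((None, pred, None)):
                queue.append(str(obj))
    return seen

crawl("http://viaf.org/viaf/12345")                    # illustrative starting point
```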
5
provenance space — who’s made a particular assertion for that
in the library domain, could imagine a layer about who's responsible for an assertion; today that is unspecified
crowdsourcing — as you move up toward the general public, you typically track less about who did it
variable credibility
acknowledge that
nanopublications
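For reference, the nanopublication idea packages an assertion with its provenance and publication info as separate named graphs; a minimal sketch with rdflib, with all URIs illustrative:

```python
# Sketch: a minimal nanopublication-style bundle: the assertion, its provenance, and
# publication info in separate named graphs, tied together by a head graph.
from rdflib import Dataset, Literal, Namespace, URIRef
from rdflib.namespace import OWL, RDF, XSD

NP = Namespace("http://www.nanopub.org/nschema#")
PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/np1/")

ds = Dataset()

head = ds.graph(EX.head)                       # head graph points at the other three
head.add((EX.np, RDF.type, NP.Nanopublication))
head.add((EX.np, NP.hasAssertion, EX.assertion))
head.add((EX.np, NP.hasProvenance, EX.provenance))
head.add((EX.np, NP.hasPublicationInfo, EX.pubinfo))

ds.graph(EX.assertion).add((URIRef("http://example.org/workA"),
                            OWL.sameAs,
                            URIRef("http://example.org/workB")))
ds.graph(EX.provenance).add((EX.assertion, PROV.wasAttributedTo,
                             URIRef("http://example.org/agents/librarian-1")))
ds.graph(EX.pubinfo).add((EX.np, PROV.generatedAtTime,
                          Literal("2015-02-24T00:00:00Z", datatype=XSD.dateTime)))

print(ds.serialize(format="trig"))
```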
===== group 4 =====
reconciliation services — contain no data, query a distributed set of resources
individual libraries will become the authorities for special collections — items, people, events
queries to a central area would find a match
cache the sameAs links so you don't have to re-query
everybody who consumes has the cross-links
the sort of thing that OCLC might end up doing —
could be any type of object — logical to start with works
brings up the question of degrees of sameAs-ness
when a new match is known, publish that — a notification mechanism
you would add provenance to those links to indicate where they came from
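A small sketch of the caching idea: keep sameAs lookups locally so consumers don't keep re-querying the central matching service; the lookup callable here is a hypothetical stand-in for a real service call:

```python
# Sketch: cache sameAs lookups locally so repeated consumers don't re-query the
# central matching service. The lookup callable is a hypothetical stand-in.
import json
import os

CACHE_FILE = "sameas_cache.json"

def cached_same_as(uri, lookup):
    """Return cross-links for a URI, querying the central service only on a miss."""
    cache = {}
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE) as fh:
            cache = json.load(fh)
    if uri not in cache:
        cache[uri] = lookup(uri)                 # one round-trip to the central service
        with open(CACHE_FILE, "w") as fh:
            json.dump(cache, fh)
    return cache[uri]

# Illustrative use, with a stubbed lookup in place of a real service call:
print(cached_same_as("http://example.org/person/1",
                     lambda uri: ["http://viaf.org/viaf/12345"]))
```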
there used to be a plug-in for Netscape with a side-wiki where you could annotate — anybody could see what everyone else had done
now in the world of unique identifiers — a 'linkerator' for people to rank what they see
build up ant trails over time, around an object
how to make it in any way central — get it to the browser
how about the annotation example?
regular expressions against EAD for an object to suggest what it links to
feed into a system to validate
then give pointers to the link
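A crude sketch of that first regular-expression pass over EAD, pulling persname values out as candidates to validate and then link; the EAD snippet is made up:

```python
# Sketch: a crude first pass with regular expressions over an EAD finding aid,
# pulling out <persname> values as link candidates for a later validation step.
import re

ead = """
<controlaccess>
  <persname source="lcnaf">Twain, Mark, 1835-1910</persname>
  <persname source="lcnaf">Clemens, Samuel Langhorne, 1835-1910</persname>
</controlaccess>
"""

candidates = re.findall(r"<persname[^>]*>(.*?)</persname>", ead)
print(candidates)   # feed these to a validation step, then publish the links
```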
other levels of relationship than sameAs
over time it would aggregate, and a clustering algorithm would kick in — the more a link is traversed, the more the space reduces
emergence sorting
software crawling the graph - how do you figure out what to trust? the world according to professor X or Y
trust is very tricky
a PageRank algorithm for linked data — more for asserters (see the sketch below)
strengthen the nodes as confidence repeats
repeating assertions in multiple repositories — I agree with them, the +1 or thumbs up
Reddit gets a lot of traction
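At its simplest, a "PageRank for linked data" over asserters could look like the sketch below; networkx is just an illustrative library choice, and the toy edges are made up:

```python
# Sketch: rank nodes in a small linked-data graph by a PageRank-style score.
# Nodes here stand for asserters (or resources); edges mean "links to / repeats
# assertions from". The toy graph is made up.
import networkx as nx

g = nx.DiGraph()
g.add_edges_from([
    ("library-A", "library-B"),
    ("library-A", "oclc"),
    ("library-B", "oclc"),
    ("researcher-X", "library-A"),
])

for node, score in sorted(nx.pagerank(g).items(), key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {node}")
```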
nanopublications
if you reify assertions — you can add confidence where you have more knowledge or curation
confidence levels
wikipedia has a way to accept
no confidence in semantic search engines
too siloed
visualizations have to be crafted