
User research and usability overview

High-level results from earlier evaluations, as well as mockup evaluations we did at Cornell, indicate that knowledge panels can be useful when they present contextual information along with related relationships that lead to further searches in the catalog. People’s perspectives on which pieces of information are relevant for context and for additional search links appeared to differ. For example, there were differing perspectives on the usefulness of including links to books by authors who had won one of the same awards as the author whose knowledge panel was being viewed. The integration of digital collections information was regarded positively, although not all KPAOW mockup evaluation participants knew what content digital collections comprise. Future work could focus on differences between knowledge panel layouts (e.g., within the page, visible by default, or available on clicking) as well as continued exploration of which properties and linkages within the knowledge panels would be considered useful. There may also be a connection between knowledge panels and browsing, where browsing could occur across a collection of knowledge panels.

More in-depth information about the review of articles, user interviews, and usability results can be found here: https://wiki.duraspace.org/pages/viewpage.action?pageId=101783940 .

KPAOW mockup evaluation results are available here: https://docs.google.com/document/d/1BFWtki1LfmHsTtz6C25M9ZNv1GmJTN_VVg-fa34E6Ys/edit

Data sources, APIs, and analyses

The data sources selected and used for retrieving data or making connections for KPAOW were LCNAF, LCSH/FAST, DBpedia, Wikidata, and Discogs. Additionally, we set up a proxy controller within the Blacklight app to directly access the Digital Collections Blacklight catalog search results as JSON. We also looked at using OCLC work IDs from the catalog to find the appropriate Wikidata IDs. Details can be found in the development-related results below. A quick summary of data sources and where they were used is found in the table below.


Source: id.loc.gov (LCNAF/LCSH)
    Used for:
        Knowledge Panel: Retrieving URI/local name based on authority string
        Search Suggestions: Retrieving terms to populate suggested searches list on catalog search page based on query
    Overview of properties and queries employed:
        http://id.loc.gov/authorities/[auth]/suggest/?q= where [auth] is either “names” or “subjects”

Source: Wikidata
    Used for:
        Knowledge Panel: Retrieving Wikidata entity based on LC local name; retrieving image URL, “notable works”, “influenced”, “influenced by”, and “awards received” properties for a Wikidata entity
        Related works: Retrieving derived works and editions for the work entity identified by OCLC Work ID in the catalog
        Search Suggestions: Retrieving terms to populate suggested searches list on catalog search page based on query
        Item view: For Discogs examples, the Wikidata entity related to the work was also retrieved and used to retrieve the previous and following works for a particular item
    Overview of properties and queries employed:
        Use wdt:P244 to retrieve Wikidata URI based on LC local name
        Properties wdt:P800 (notable work), wdt:P737 (influence relationships), wdt:P18 (image)
        Properties wdt:P629 (editions) and wdt:P4969 (derivatives)
        Previous/following works: wdt:P155 and wdt:P156
        Discogs master ID: wdt:P1954

Source: DBpedia
    Used for:
        Search Suggestions: Retrieving terms to populate suggested searches list on catalog search page based on query

Source: Discogs
    Used for:
        Item view: Retrieving Discogs properties to include within the item view for a particular search result. Discogs was searched based on item label/title.

Source: Digital Collections
    Used for:
        Knowledge Panel: Retrieving digital collection results using a keyword search to the Digital Collections Blacklight instance, using the authorized heading for the name or subject heading for the knowledge panel
        Subject heading facet: Retrieving Digital Collections subject results using the subject heading selected in the faceted search
    Overview of properties and queries employed:
        Subject heading facet search uses “subject_tesim” as the Digital Collections facet
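
As an illustration of how the Wikidata knowledge panel properties listed above could be requested together, the following is a minimal sketch rather than the project's actual query. The function name, the property dictionary, and the use of wdt:P166 for “awards received” are assumptions made for this example; the query only covers the forward direction of each property, and it relies on the prefixes that the Wikidata Query Service predefines.

```python
import re

# Properties named in the summary above, plus P166 ("award received"),
# which is an assumption here because the summary does not list its ID.
PANEL_PROPERTIES = {
    "P18": "image",
    "P800": "notable work",
    "P737": "influenced by",
    "P166": "award received",
}


def build_panel_query(qid, properties=PANEL_PROPERTIES):
    """Build a SPARQL query returning the panel properties of one entity.

    The wd:, wdt:, wikibase:, and bd: prefixes are left undeclared because
    the Wikidata Query Service predefines them.
    """
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata entity ID: {qid!r}")
    values = " ".join(f"wdt:{prop}" for prop in properties)
    return (
        "SELECT ?prop ?value ?valueLabel WHERE {\n"
        f"  VALUES ?prop {{ {values} }}\n"
        f"  wd:{qid} ?prop ?value .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )
```

Using a single VALUES clause keeps the result as one row per property value, which avoids the row multiplication that several independent OPTIONAL patterns would produce.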


OCLC work IDs appear to have changed format, so the IDs captured in Wikidata differ from the most current versions. Having updated IDs in Wikidata and/or the catalog, as well as more OCLC work ID to Wikidata connections, would enable greater linkage and the possibility of bringing back more information.
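
To make the OCLC work ID linkage concrete, here is a hedged sketch of a query builder for derivative works and editions. It is not the project's code: the function name is hypothetical, the Wikidata property for OCLC work IDs is assumed to be P5331 and should be verified, and the direction of wdt:P629 (edition pointing to work) reflects our reading of that property.

```python
import re


def build_related_works_query(oclc_work_id, oclc_work_prop="P5331"):
    """Build a SPARQL query for derivatives and editions of a work.

    oclc_work_prop defaults to P5331, assumed (not confirmed by the source)
    to be Wikidata's OCLC work ID property.
    """
    # Accept only alphanumerics so the ID cannot break out of the literal.
    if not re.fullmatch(r"[A-Za-z0-9]+", oclc_work_id):
        raise ValueError(f"unexpected OCLC work ID: {oclc_work_id!r}")
    return (
        "SELECT ?related ?relatedLabel ?relation WHERE {\n"
        f'  ?work wdt:{oclc_work_prop} "{oclc_work_id}" .\n'
        "  {\n"
        "    ?work wdt:P4969 ?related .\n"   # work -> derivative work
        '    BIND("derivative" AS ?relation)\n'
        "  } UNION {\n"
        "    ?related wdt:P629 ?work .\n"    # edition -> work it is an edition of
        '    BIND("edition" AS ?relation)\n'
        "  }\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )
```

Because the match is an exact string comparison on the identifier, any format drift between the catalog and Wikidata of the kind described above would return no rows rather than a wrong work.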

Having URI connections already saved within the Solr search index would make it easier to obtain identifiers for retrieving entity-level information dynamically within the page. The code had to execute one query that retrieved the LOC entity local name (i.e., the last part of the URI) by searching against id.loc.gov, and then a second query against Wikidata to find relevant connections. The first search against id.loc.gov used a text string search, which took care to remove trailing spaces, hyphens, and periods to ensure better results. The result of this search was then used to match against possible URIs in Wikidata. Having the appropriate URIs present in the search index would enable us to bypass these two queries.
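
The two preparation steps described above (cleaning the heading string, then matching the LC local name in Wikidata through wdt:P244) might look roughly like the sketch below. The function names and the local-name pattern are assumptions for illustration, and the id.loc.gov request itself is omitted because the source does not specify which endpoint the string search used.

```python
import re


def normalize_heading(heading):
    """Remove trailing spaces, hyphens, and periods from an authority string."""
    return re.sub(r"[\s.\-]+$", "", heading)


def build_lc_to_wikidata_query(local_name):
    """Build a SPARQL query that finds the Wikidata entity for an LC local name.

    The pattern below (letters followed by digits, e.g. "n79021164") is an
    assumed shape for LC local names, used only to keep the literal safe.
    """
    if not re.fullmatch(r"[a-z]+[0-9]+", local_name):
        raise ValueError(f"unexpected LC local name: {local_name!r}")
    return f'SELECT ?entity WHERE {{ ?entity wdt:P244 "{local_name}" . }}'
```

For example, `normalize_heading("Twain, Mark, 1835-1910.")` yields the heading without its final period, and a local name returned by id.loc.gov would then be passed to `build_lc_to_wikidata_query`.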

Development-related results

    Successfully supplemented the catalog metadata for some musical recordings by bringing in information from the Discogs database. All AJAX calls to the Discogs API went through the Authorities Lookup. The initial search queries, however, had to rely on keywords rather than IDs because the catalog index does not contain any Discogs IDs or URIs. As a result, the criteria established to prevent mismatches between a work in the catalog and a Discogs record sometimes rejected what were actually valid matches. Similarly, slight differences between the catalog and Discogs metadata also prevented pairing a work in the catalog with the corresponding work in Discogs.

    Using the OCLC work IDs in the catalog index, for a given work we linked to Wikidata to display lists of both derivative works (for example, film adaptations of a work) and significant editions of the work. In a couple of cases, the OCLC work ID that the catalog contains for a work (such as The Adventures of Tom Sawyer) did not correspond to the work that Wikidata has for the same OCLC work ID (The Adventures of Huckleberry Finn). More significantly, only a small percentage of the items in the catalog (roughly 0.2%) have matches in Wikidata.

    Modified the catalog's main search feature so that we take the search term and query it against Wikidata, DBpedia and the Library of Congress subject authority. The results of those queries are combined into a single list of terms, with duplicates removed, and then we search each term against the catalog. If there's a match to the catalog, the list of terms is then displayed to the user as possible related or "alternative" searches. There are two primary drawbacks here: first, the time it takes to do these queries is excessive; and second, the results can contain terms or subjects that are not actually related to the initial search term.

    Added a “popover” box that can be activated from an icon appearing next to author and subject lines in the item record view. The box provides Wikidata information when available, including birth and death dates and locations, occupation, and an image. If the entity is an author, notable works and the people who influenced or were influenced by that person also appear in the knowledge panel. Finally, the knowledge panel provides links into Cornell’s Digital Collections when related results or contributors can be found.
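
The merge-and-filter step from the search suggestions result above (combine terms from several sources, drop duplicates, keep terms found in the catalog) can be sketched as follows. This is an illustrative outline, not the project's implementation: the function names are hypothetical, duplicates are assumed to be compared case-insensitively, only matching terms are assumed to be kept, and the catalog check is passed in as a callable so the sketch stays independent of any particular search backend.

```python
from typing import Callable, Iterable, List


def merge_suggestions(sources: Iterable[Iterable[str]]) -> List[str]:
    """Combine term lists into one list, dropping blanks and duplicates.

    Terms are compared case-insensitively after collapsing whitespace;
    the first spelling seen is the one that is kept.
    """
    seen = set()
    merged = []
    for terms in sources:
        for term in terms:
            cleaned = " ".join(term.split())
            key = cleaned.casefold()
            if cleaned and key not in seen:
                seen.add(key)
                merged.append(cleaned)
    return merged


def suggestions_in_catalog(
    terms: Iterable[str], has_catalog_match: Callable[[str], bool]
) -> List[str]:
    """Keep only the terms for which the catalog lookup reports a match."""
    return [term for term in terms if has_catalog_match(term)]
```

Deduplicating before the catalog lookups keeps the number of catalog queries down, which matters given that query time was one of the drawbacks noted above.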


Possible development enhancements

The objective for this set of code was to serve as a proof of concept or demo for the integration of linked data in the library catalog. Our focus was on functionality. Much can still be done to refactor both the code and the SPARQL queries it employs, as well as to add tests that would help make this code production-ready. That said, the code can serve as a starting point for future enhancements and/or further experiments.

Possible future work could include:

Work to extend the browse by subject implementation will be explored in the next work phase (BAM!).