This page provides additional details about the integration of information from Wikidata, DbPedia, and the Library of Congress (LOC) linked data service into the Cornell production catalog.  We began this work as part of the LD4P2 and LD4P3 discovery explorations around knowledge panels and author and subject pages.  After multiple discussions with our library user representatives and modifications based on their feedback, we coordinated with the Discovery and Access team, the group responsible for our catalog implementation, to move forward with this integration.

Examples and screenshots

You can see an author knowledge panel example at: [insert link].  Clicking on “author” info will open up the knowledge panel.  Clicking on “view full info” takes the user to this page [insert link].  A subject page example can be seen at: [insert link]

Data integration

  • Knowledge panels
    • We now include an image and short description from Wikidata.  
  • Author and subject browse pages:
    • We now include
      • An image from Wikidata
      • A description from DbPedia (or a description from Wikidata if the former is not available)
      • Additional Wikidata information for people such as citizenship, education, and pseudonyms
      • The LOC classification number which is used to generate a link to our call number browse.

Implementation

All of the data on the page is retrieved through dynamic lookups against external data sources.  We also check our configuration to see which data or authorities should show information from external sources (this process is described in greater detail below).  Currently, the author knowledge panel and the author and subject pages rely on a string lookup of the authorized heading string against the id.loc.gov suggest service.  This lookup returns a URI, which we then use both to retrieve LOC information such as the classification number and to query Wikidata.  DbPedia queries use a combination of Wikidata QID and label searches to retrieve descriptions.  Additionally, we have requested the inclusion of LOC identifiers or URIs within the catalog Solr indices and, in the future, we hope to use that information directly instead of relying on string lookups to get LOC URIs.
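
The lookup chain described above can be sketched roughly as follows. This is a minimal illustration in Python, not the catalog's actual code. The function names are hypothetical, the suggest endpoint path and its OpenSearch-style response shape are assumptions about the public id.loc.gov service, the identifier "n00000000" is a placeholder, and the SPARQL query assumes that Wikidata property P244 (Library of Congress authority ID) links an entity to its LOC identifier. The functions only build requests and parse a response that is passed in; no network calls are made.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint for the id.loc.gov name authority suggest service.
SUGGEST_BASE = "https://id.loc.gov/authorities/names/suggest/"


def build_suggest_url(heading: str) -> str:
    """Build the suggest-service URL for an authorized heading string."""
    return SUGGEST_BASE + "?" + urlencode({"q": heading})


def pick_loc_uri(suggest_json: str, heading: str):
    """Return the URI whose label matches the heading exactly, else None.

    Assumes an OpenSearch-suggestions style payload:
    [query, [labels...], [descriptions...], [uris...]].
    """
    payload = json.loads(suggest_json)
    labels, uris = payload[1], payload[3]
    for label, uri in zip(labels, uris):
        if label == heading:
            return uri
    return None


def loc_id_from_uri(uri: str) -> str:
    """Extract the local identifier (the last path segment) from a LOC URI."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


def build_wikidata_query(loc_id: str) -> str:
    """Build SPARQL asking Wikidata for the entity with this LOC identifier
    (P244), plus its image (P18) if one exists."""
    return (
        "SELECT ?entity ?image WHERE { "
        f'?entity wdt:P244 "{loc_id}" . '
        "OPTIONAL { ?entity wdt:P18 ?image } "
        "} LIMIT 1"
    )
```

Matching the returned label exactly against the heading is one possible way to guard against the suggest service returning a near match for a different person or subject; it is a design choice in this sketch rather than a description of the production behavior.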

When we don't want to show specific data

We implemented a system whereby we can exclude information from being displayed for specific authors or subjects.  Using a YAML file, we can designate whether we wish to hide all external data or whether we wish to hide certain properties for a specific author or subject heading. 

For example, if we want to hide the image and description for the heading "Eliot, George, 1819-1880" but still display the remaining properties we retrieve from Wikidata, such as citizenship and place of education, we would include the following in the YAML file:

? "Eliot, George, 1819-1880"
:
   - "image"
   - "description"

If we did not want to show any of the information we retrieve from Wikidata or DbPedia for this specific heading, we would instead write the following in the YAML file:

? "Eliot, George, 1819-1880"
: ~
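
As a rough illustration of how a configuration like this could be applied once it has been parsed, consider the Python sketch below. It is not the catalog's actual code. The function name and the "citizenship" property key are hypothetical, and the sketch assumes a YAML loader has already turned the file into a mapping in which `~` becomes a null value (`None`) and a list of property names becomes a list.

```python
# Parsed forms of the two YAML examples above (assumed loader output).
HIDE_SOME = {"Eliot, George, 1819-1880": ["image", "description"]}
HIDE_ALL = {"Eliot, George, 1819-1880": None}


def filter_external_data(heading, external_data, exclusions):
    """Return only the external properties allowed for this heading."""
    if heading not in exclusions:
        # No rule for this heading: show everything.
        return dict(external_data)
    hidden = exclusions[heading]
    if hidden is None:
        # Null value (~ in YAML): hide all external data.
        return {}
    # Otherwise drop only the listed properties.
    return {k: v for k, v in external_data.items() if k not in hidden}
```

Under these assumptions, a heading that is absent from the file is displayed in full, so only the exceptions need to be listed.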



What happens when external data services are not available

If the external lookups or APIs are not functioning, we try to ensure the knowledge panel and author and subject pages still load with the information from our own browse indices.
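
One way to get this behavior is to treat every external lookup as optional and start from the local data. The Python sketch below illustrates the idea under stated assumptions and is not the catalog's actual code: `build_page_data` and the `fetch_external` callable are hypothetical names, with `fetch_external` standing in for whatever performs the real external lookups.

```python
def build_page_data(heading, local_data, fetch_external):
    """Merge external data into local browse data, tolerating failures.

    fetch_external is any callable that takes a heading and returns a dict.
    """
    page = dict(local_data)
    try:
        external = fetch_external(heading) or {}
    except Exception:
        # Lookup failed or timed out: fall back to local data only.
        external = {}
    # Local values take precedence; external values only fill gaps.
    for key, value in external.items():
        page.setdefault(key, value)
    return page
```

Because the page dictionary is built from the browse index data first, a failed lookup changes only what is added to the page, not whether the page renders.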

Additional design changes

We also updated aspects of the design of both the author knowledge panel and the subject and author pages.  We incorporated library holdings search information into the author and subject pages to allow users to more easily find related information and resources.  

Code

Thanks and acknowledgements

Special thanks to Tim Worrall, our lead developer on this work.  Many thanks as well to our library catalog user representatives and to the entire Discovery and Access team.  Thanks to Frances Webb who provided insight into our Solr indices and to Melissa Wallace who provided feedback around design.

On the LD4P3 front, thanks also to our discovery team (Discovery On the Ground), including but not limited to: Astrid Usong for UX work and contributions, Greg Delisle for server infrastructure, Steven Folsom and Jason Kovari for metadata feedback, Tim Worrall for development and design work, Michelle Futornick for contributing to discussions, Hilary Thorsen for her query pointers, and Dave Eichmann for allowing us to query his data servers. As always, thanks to our LD4P3 project directors.
