Link to original doc

Usability/User evaluation High Level Takeaways

  • Auto-suggest is a useful feature that could benefit from additional support for misspellings.  Furthermore, the label “author” could be revisited to clarify what the list indicates (i.e. works by versus works about). We could also explore how to distinguish between results (especially those with the same string value) so that users know which result to select.  
  • Related person and subject suggestions have the potential to be useful, but additional steps could be taken to clarify how these entities are related, as well as to lay out the knowledge panel design more clearly when displaying supplemental information such as contemporaries for authors.  Furthermore, labeling could be improved for related concepts. For example, we could use the term “similar” instead of “close” for closely related concepts. Additionally, a pointer above the item could indicate that the suggestions are clickable. 
  • Most participants indicated zero-results suggestions were a useful feature.  Some participants indicated that the highlighted snippets provided useful context around how these suggestions related to the person’s query.  Additional labeling changes could be made to clarify that the subjects on the left are suggestions and that the results displayed are related searches.  
  • For details, please see Tasks and SMASH! Usability Testing notes, which have more information on the high level takeaways and screenshots of the system.
  • High level takeaways
  • References: Tasks, Raw notes


Data, APIs, and Indices

  • Autosuggestions were based on results from the author and subject headings used in the catalog, with facet counts being used for scoring.  Earlier experiments included connections to the VIAF AutoSuggest API, to the VIAF data in DAVE, and to the LCNAF API.
  • Zero results pages use data from the Google Books API to get results from a full text search on the entered query.  Additionally, we retrieved HathiTrust XML for a specific query to display a proof of concept that uses subject headings and search results from a HathiTrust full text search on the entered query.  Since then, HathiTrust has provided IP address-based access and access for specific user accounts to an XML version of results for any given query. We have integrated this URL into the code.   
  • Subject and author suggestions displayed on the left-side of the page relied on the following:
    • Subject search:
      • LCSH search through QA (cached) with partial or complete string matching.  This search takes into account both preferred and variant labels.
      • Subject facet values from the catalog for search results for that query.  Queries to id.loc.gov retrieved the URIs for the subjects based on the authorized subject headings in the facet values. 
      • Annif recommendations based on the query which return both the label for the LCSH heading as well as the URI.  (Details on our use of Annif are recorded here). 
      • For LCSH and FAST headings, requests for the URL for that subject to retrieve information about that entity including broader, narrower, and close match/related headings.  If any of the subjects have a close match with a Wikidata URI, the Wikidata link is also included.  
    • Author search:
      • LCNAF search through QA (cached) with partial or complete string matching.  This search takes into account both preferred and variant forms of names. 
      • Author facet values from the catalog’s search results for that query.  Queries to id.loc.gov retrieved the URIs for the authors based on the authorized heading strings in the facet values.
      • Contemporaries within the person details view were retrieved from a query to the author index used in BAM!, which captured birth dates, death dates, start activity dates, and end activity dates. When the user clicks on the author result, the first call to this index retrieves the appropriate Solr document containing birth/death info.  The subsequent call to the index retrieves any individuals whose birth or death dates, whether from the Library of Congress or Wikidata, fall within the range of the birth to death dates for the author. For the dates for the main author (or the author whose details are being viewed), Wikidata dates are preferred over Library of Congress dates where available. 
      • Wikidata queries to retrieve image and schema:description for authors using the Library of Congress URI to query for the corresponding Wikidata entity. 
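
The contemporaries lookup described above can be illustrated with a minimal sketch. This is not the project's code: it works on in-memory records instead of the Solr author index, compares years only, and ignores activity dates. The AuthorRecord class, its field names, and both function names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AuthorRecord:
    # Hypothetical shape of one author document; field names are assumptions.
    name: str
    loc_birth: Optional[int] = None  # Library of Congress birth year
    loc_death: Optional[int] = None  # Library of Congress death year
    wd_birth: Optional[int] = None   # Wikidata birth year
    wd_death: Optional[int] = None   # Wikidata death year


def preferred_span(author: AuthorRecord) -> Tuple[Optional[int], Optional[int]]:
    """Return (birth, death) for the main author, preferring Wikidata dates."""
    birth = author.wd_birth if author.wd_birth is not None else author.loc_birth
    death = author.wd_death if author.wd_death is not None else author.loc_death
    return birth, death


def find_contemporaries(author: AuthorRecord,
                        candidates: List[AuthorRecord]) -> List[AuthorRecord]:
    """Return candidates with a birth or death year inside the author's lifespan."""
    birth, death = preferred_span(author)
    if birth is None or death is None:
        return []  # without a complete range there is nothing to compare against
    matches = []
    for other in candidates:
        if other is author:
            continue  # skip the author whose details are being viewed
        # Any date from either source may qualify a candidate.
        years = [other.loc_birth, other.loc_death, other.wd_birth, other.wd_death]
        if any(y is not None and birth <= y <= death for y in years):
            matches.append(other)
    return matches
```

Note that, as in the description, a candidate matches only when one of their own dates falls inside the main author's range, so a person who was born before and died after the author would not be returned by this sketch.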

...

  • Since it’s currently implemented with a JSON file as the data source, the autosuggest field could be re-engineered to drive the query off of a Solr index. The source for the index could again be the author and subject facets from the catalog, only it would include all of the facet headings from the catalog, not just a subset. Also, we could possibly enhance this by including name variants from LCNAF or VIAF.
  • The Zero results pages’ initial example integration of HathiTrust relied on a static XML file, and a subsequent implementation used an IP address-based and specific user account-based access approach.  If HathiTrust develops a more general-purpose API that enables retrieving search results as well as subject headings in a structured data format (such as XML), then the code could utilize that API instead to retrieve and display information from HathiTrust.  
  • Additional work with Annif could generate a more comprehensive training data set using more (or all) records from the catalog.  A possible path is to utilize an internal tool for querying the MARC to get all fields that contain subject headings and also retrieve any associated URIs. 
    • Currently, the Annif implementation retrieves URIs for the subject headings by querying against id.loc.gov’s LCSH data with the subject heading string.  Since not all subject headings within the MARC catalog will have URIs, a combination of retrieving any URIs from MARC where they do exist and the approach of querying strings to retrieve URIs could be used to get a more comprehensive set of LCSH URIs associated with a catalog resource. 
    • We started with LCSH headings but we could also use the FAST vocabulary and training data that uses the FAST subject headings associated with titles in the catalog.  
    • Our set of LCSH data relies on what was available within LD4P2’s QA cache for LCSH.  Our catalog appears to use some subject headings that are more recent and did not map to anything within the cache (since there have been additional LCSH headings added after the data used in the QA cache).  Future work could include 
  • For the suggested person and subject searches, we tried to incorporate some suggestions from our colleagues at Cornell as well as from Astrid Usong at Stanford.  Suggested person and subject searches would benefit from (a) further exploration into which dimensions of relevance would be most useful for users and how to ascertain and evaluate those dimensions and (b) UI design and implementation that supports users in accessing these suggestions.  Of note, here are mockups that could be reviewed when designing updates to the UI.
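
As an illustration of the combined URI approach suggested for the Annif training data above (use URIs from MARC where they exist, and fall back to querying by heading string otherwise), here is a minimal sketch. The function name, its parameters, and the example.org URIs are hypothetical; the lookup is passed in as a callable so that a real query against id.loc.gov's LCSH data could be substituted, and no actual API call is shown or assumed here.

```python
from typing import Callable, Dict, Iterable, Optional


def resolve_subject_uris(
    headings: Iterable[str],
    marc_uris: Dict[str, str],
    lookup_by_label: Callable[[str], Optional[str]],
) -> Dict[str, Optional[str]]:
    """Map each subject heading to a URI, preferring URIs already present in MARC."""
    resolved: Dict[str, Optional[str]] = {}
    for heading in headings:
        uri = marc_uris.get(heading)
        if uri is None:
            # Fall back to a string lookup; the callable is a stand-in
            # (an assumption) for a label query against LCSH data.
            uri = lookup_by_label(heading)
        resolved[heading] = uri  # None marks headings that still lack a URI
    return resolved


# Usage with placeholder data: one heading has a URI in MARC, one is found
# by the fallback lookup, and one has no match in either source.
marc = {"Cats": "http://example.org/lcsh/cats"}
stub_lookup = {"Dogs": "http://example.org/lcsh/dogs"}.get
uris = resolve_subject_uris(["Cats", "Dogs", "Unmatched heading"], marc, stub_lookup)
```

Headings that resolve to None in a sketch like this would correspond to the case noted above, where more recent subject headings in the catalog did not map to anything in the cached LCSH data.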