What makes a good use case?
- which of them is really linked data enabled, vs. what you could do with MARC if put in a big database
- what can be done easily with linked data, even if it could be done without
- examples tying library data together with other information
- and working across institutions
What intersections of our data sources will be strong enough to support compelling use cases?
We need to choose a key set of use cases that address the challenges articulated in the grant proposal for the project. Here are some specific points that we want to make sure that the use cases address:
- Pragmatic value. They have real value to our core constituencies: librarians, researchers, teachers, students, etc.
- Community added-value. They leverage the unique value that librarians and scholars add to materials when they select, annotate, or reference the resources.
- Cross-institutional data. They clearly demonstrate the value in combining data from our three different institutions - ideally in a way that shows how that value will grow as more institutions join in.
- Leverage existing data and services. They leverage existing efforts in this space.
- Integration into the Web. They show how research libraries can integrate with existing popular and useful Web sites and services, e.g., Wikipedia.
- Cross-discipline. They show examples from a variety of disciplines.
- Help core missions. They demonstrate value for teaching and learning, and research.
- Multi-data. They cover a broad range of scholarly information resource types.
- Unusual data. They show how non-traditional data can be useful.
- Media "photogenic". They clarify to the mainstream media the value of LOD and this project, and excite that media about the prospects.
- Analysis and visualization. They show interesting ways to use the aggregated data for analysis or visualization.
- Usage data. They take advantage of data on how the materials are being used.
We need to be careful not to duplicate effort in areas where other projects are already working.
Clustering use cases
The purpose of this table is to group use cases into 5 clusters (columns) to help identify a small set of exemplar, implementable use cases that can be the focus of engineering work. Good candidates to be an exemplar, or to be merged to form one, are marked "top".
|Goal||bib+curation||bib+person data||leverage external authorities||leverage deeper graph via queries or patterns||leverage usage data||no cluster|
|use cases by cluster in priority groupings||top: 3, 11, 24, 35; challenges: 18||top: 6, 9||top: 22, 23; med: 19, 25, 28, 42; challenges: 5, 21, 32; low: 15||top: 12, 38; med: 29; challenges: 14, 17, 41; low: 39||top: 1; med: 16; challenges: 10, 26||challenges: 30, 31, 33, 34, 37, 27; low: 2, 4, 7, 8, 13, 40|
|3||5, 22, 32||38||16||31|
|leverage existing external authorities||23, 19, 25, 42, 21, 15||38||8|
|leverage researcher networking data||6||23||7|
|leverage existing sources of LOD||22, 23, 42||38, 41|
|integration out into the Web||22, 19|
|help core missions||12, 14, 38|
|highlight unusual data||5||12, 14||1, 16|
|media "photogenic"||22, 42||12, 17, 41|
|interesting analysis or visualization||11||38, 17|
|take advantage of usage data||1, 16, 10, 26|
Bib + Curation Cluster
Build a virtual collection
Use Cases (suggestions, in draft form)
|(title)||As a...||I want to...||In order to ... (Benefit)||Comment|
|1||Research guided by community usage||As a researcher exploring a new field, or as a reference librarian||I want to find what is being used (read, annotated, bought by libraries, etc.) by the scholarly communities not only at my institution but at others, and especially to find sources used elsewhere but not by my community||I'll be satisfied when the result of a search for a subject or for a particular work makes suggestions for further exploration that are both relevant and surprising|
Demonstrates scholarly communities learning from one another across institutions
This divides up by subject; others are possible
H5 C5 S0
+ value of non-traditional data
|3||Build a virtual collection||As a faculty member or librarian||I want to create a virtual collection using online materials in multiple collections across multiple universities. I need a tool that will let me browse fluidly, discovering items that otherwise would have escaped my attention, and easily creating an online exhibition. Information about user interaction with this exhibition – unique visitors, but also which items they click on, in what order, how long they spend with each – should be fed back into the universities' systems to help inform future people browsing the collections.|
Heavily focused on the UI. How central should that be to the use cases?
How much should people be able to say about the collection itself and not just its items? Or info about the items that is specific to the collection?
H5 S5 C5
+ cross institution
+ pragmatic value
|5||Info-rich maps||As a student||I want to browse a geographic map and see annotations automatically added that show me relevant information (library items, archival items). I want to know that new information has been added in close to real time.||I will be satisfied when that works|
Maps are a gnarly type. And how do we get the geolocated items? And this isn't very fresh. Is there a way to show relations among the pins? Or something? - dw May be possible to do autotagging by place names but may not be reliable, and are we tagging author origin, publisher, aboutness? jcr
Perhaps restrict their viewing to a geo location.
We'd need a gazetteer.
Not linked data based
|3.3 leveraging authorities|
H4 C4 S0
If we have the data. (Harvard does.) Also, scale it back.
+ unusual data
|6||Highlight my faculty's work||As a university dean||I want works created by my faculty to be highlighted in the OPAC. This includes works by any author who has ever worked at my college.||I will be satisfied when that information is accessible by the OPAC, although it'd be nice if the OPAC actually used that information.|
requires a commitment to maintain information on departed faculty in other systems of record, or taking on that task in the library. jcr
we'd be ok with just current data
integrates person data, so good
S5 H4 C4
Not very dramatic and may already have been done, but worth supporting
|9||Identify who an author influenced||As a researcher working on a particular academic figure||I would like to be able to see the works written by the students of a particular figure. E.g., what have the people who were advised by Buckminster Fuller written? Then I'd like to see the citations for those students' works clustered by subject area.||There are great genealogies in math and CS, but we would have to link their identities to the ones that we have.||2.2 bib+person|
|10||Assess influence by community usage||As a researcher working on a particular academic figure||When I search on Buckminster Fuller, I would like results ranked not by relevance but by "community relevance," i.e., by how often those works are used by my university's communities.||I will be satisfied with Stacklife|
Doesn't demonstrate linked data, and similar to a couple of others - dw
Population across the three institutions is big enough.
S0 H3 This is an application of #1.
|11||Build lists and make that metadata reusable||As a teacher and as a librarian||I would like to easily create a "shelf" (a list, really) of supplemental readings for my students, and have the relationship among those items – namely, that they all have some reference to the main sources – be fed back into the Library's data set so that others can benefit from my intellectual work of clustering them. I would like the provenance of that clustering maintained. This information could then be presented through a browsable discovery system. WHAT RELATIONSHIP?|
You'd have to log in. (Shelf.io does that)
A top priority for Stanford and something they want to get out of this grant
Privileged or not privileged? My personal list or public?
Great area to explore
A third party app might do this, with the three systems incorporating
|See #3 & #24||1.1 bib+curation|
Lists are a subset of virtual collections (#3)
|12||Find associated works more relevant than Amazon does||As a researcher||I would like an alternative to Amazon's "people who read this also read that." The alternative should be focused on finding results relevant to me as a researcher, rather than works a vendor is trying to sell me. In fact, I would like to see side by side lists of Amazon's clusters and my Library's clusters so I can see just how much the Library's kicks Amazon's skinny butt.||Could use data from Fac. finder, items put on reserve, lib guides, lists and collections||4.1 deeper graph|
+ Great for the media
+ Discovery tool
+ Shows use of new data
|14||Be notified when new archival components are uncovered||As a researcher and teacher||I would like to be notified when there are archival items that might bear on a topic I'm researching or teaching. I would like this to work at the component level, not only at the "box" level.||This would require devolving metadata from the box to the component level; Harvard has done work on this and has a set of EAD components - dw||4.3 deeper graph|
+ Unusual data
-- no data
-- Data model??
|15||Authority control||As a librarian||I would like to enter any form of an author's name and have authority control|
As it stands this use case doesn't seem to leverage our work. If the method by which one decided which "John Smith" you wanted used other information about the person, their works, their geography, etc. then it would seem to fit. Still seems a somewhat standalone tool though – sw
3.4 leveraging authorities
-Does not need our data
|16||Be guided in collection building by usage||As a librarian||I would like help building my collection by seeing what is being used by students and faculty, and what's being used at other universities||5.2 usage|
+ uses unusual data
(It's another use of #1)
|17||Use NLP to explore the Library Graph||As a researcher or student|
I would like to be able to ask questions about the relationships among library data using something like natural language (although I'd be happy to accept some common sense restrictions) and get back interesting results. For example,
"What books were written by Cornell biologists on topics that Harvard and Stanford biologists rarely write books about?"
"What books are available about communication technology that don't use 'Communication' in their title or subtitle?"
"Create a timeline of science books that charts them by how many of those books have illustrations."
"Find all the anthologies that have chapters by both Marshall McLuhan and Neil Postman."
"Create a table that shows how many genetics books were published in England per year versus how many were published in the United States."
"Find me books about Christian Fundamentalism read (or assigned) in Divinity schools in the 1990s but less so after 2000."
"Show me chronologically the usage in medical schools of books under the headings of both vaccination and autism."
"Map by publisher location the clustering of books about American slavery since 1640."
|I will be satisfied when there is at least a limited vocabulary and syntax that I can master and get back interesting results.|
(We could limit the NLP issue by creating a pick list of operations, relations, and outcomes. Or maybe there's an Open Source NLP library we could use. I dunno. - dw)
I think the NLP part of this is too much to bite off. I think it would be a reasonable goal for this project to use a structured query language to express and then answer these queries. The NLP part would be extremely cool but would need NLP expert collaborators – sw
Keep an eye out for an NLP person who might want to hook up an NLP front end to this, for a demo.
|4.3 deeper graph|
+Great for the media
(Remember that these are use cases, not things we're saying we'll build. And we have the data – not the NLP – to do this. I.e., we should be aiming at enabling someone else to build this.)
|19||Acquire related works||As a librarian working on acquisitions||I would like to find additional materials that build on our current holdings – additional works by the same authors or about the same topics.||I will be satisfied when I have a tool that lets me explore these ways||3.2 leveraging authorities|
S5 H5 but only if it brings in outside sources
|21||Intelligent term expansion||As a researcher or student|
I would like intelligent term expansion:
This would rely on some of the ideas of authority control #15 – sw
We could do this for authors etc., but how well could we do it by navigating the linked data we have?
|3.2 leveraging authorities||S5 H5 if we have the data|
|22||Find unexpected resources through OPAC searches||As a researcher||I would like a panel in the OPAC that shows more information about a given <author,subject> when I do a search. E.g. "Gettysburg Address" shows panel on the town of Gettysburg, Lincoln, the battle, the tourist destination. Find other items near Gettysburg's geocode and perhaps in decades surrounding the event's time.||So that I can find related resources that fall on the periphery of my search||Facilitates serendipitous discovery, not too near not too far.||3.1 leveraging authorities|
+external data LD
|23||Pivot on works to explore more contexts||As a researcher||I would like a browse interface in the OPAC that is more intuitive and more flexible, and allows for easier pivots, than current (legacy) browse interactions for author, title, subject, shelf location||So that I can get a sense of related works across multiple different dimensions.||3.1 leveraging authorities|
S5 H5 depending on data implications
(Similar to #22)
|24||Tag items in cross-silo ways||As a librarian||I would like to be able to 'tag' items in the OPAC into curated lists, to feed subject guides, course reserves, or reference collections; I'd like these lists to be portable (into Drupal, into LibGuides, into Spotlight! or Omeka, into Sakai, e.g.) and durable; I'd like these lists to selectively feed back into the OPAC without having to modify a MARC record.||So that I can create durable, portable lists of curated resources for many uses, without having to do cataloging.|
aka LD-powered CuLLR for SearchWorks
(Not sure what it implies for the data model)
|25||Work-based discovery||As a researcher or librarian||I'd like to do work-based discovery rather than item-based discovery for musical resources.||because I heard that FRBR would be really useful. </snark>||Hummable front end?||3.2 leveraging authorities||S0 H5 If this means someone could spin up a page at the work level, then 5. (Doing it for music is a nice touch but not essential)|
|26||See usage faceted by funder||As a bibliographer or librarian||I'd like to be able to search all books acquired by a certain fund, and see which fund paid for the acquisition of any given book, and crosswalk that to how many times each book circulated / was used in course reserves / was on a syllabus or reading list / was authored by an institutional faculty member||So that I can do queries to support collection development and management.||5.3 usage||S0 H3 we have this data. Not sure how useful.|
|27||Authorized names via auto-suggest||As a cataloger||I'd like an auto-suggest for fields when doing cataloging that gave me authorized forms of <names, places, subjects, corporate bodies> for data entry – based on pooled data||So data entry is quick, less error-prone, and authority controlled from the get go.||no cluster||S0 H same as #15|
|28||**OUT** Form-fill for faculty deposits to an IR||As a faculty depositor to an IR||I'd like an auto-suggest for fields when doing data deposit that (first) autopopulated my bio-demo data based on a unique ID (ORCID?), and (second) gave me useful, authorized forms of names, places, subjects, keywords, departments, etc. for data entry.||So data entry is quick, less error-prone, and authority controlled from the get go.||3.2 leveraging authorities|
|29||Increase the sophistication of query and display||As a researcher||I'd like to be able to visualize, pivot, do inferencing, and run complex, ad hoc queries across the aggregated store of person, bibliographic, usage and curation data from three institutions|
So I can uncover relationships and knowledge that were otherwise hidden.
Maybe do this more as a specific narrative?
SPARQL-based OPAC + directory + circ reporting?
Great if we can come up with some good UI, or subset UI that does interesting things. Is there an existing UI to drop on top of a SPARQL endpoint? We don't know of any.
Make this use case more specific? And we need some concrete examples.
Can we do inferencing?? Make that its own use case?
|4.2 deeper graph|
S0 H0 This isn't a use case. These capabilities would support use cases, but these capabilities are what the project overall is about, aren't they?
|30||Find which works by an author are used in courses||As a faculty member||I'd like to query the OPAC and/or Syllabus / Reading List portal to see which works of a given author were used in courses||So I can gauge the impact and trends in pedagogy||no cluster|
|31||Compare course usage to holdings||As a bibliographer|
I'd like to be able to query which library resources were used in courses (as noted by syllabi, reading lists or course reserves) and compare that to a collection's holdings. (This may be at my or another institution)
|So I can see the relative strengths of collections to support teaching|
more specific case of #8
combine with 8. drop 8
|32||Finding Aids across institutions||As a researcher||I'd like to be able to search across archival holdings at different institutions, and follow the links from the records of one person or org to those of a collaborator held at another institution.||So I can find relevant archival resources regardless of which institution holds them.||3.3 leveraging authorities|
If we have the data
|33||Integrated search of Finding Aids for archival materials||As a researcher||I'd like to be able to find an emeritus faculty member's publications, teaching materials, reading lists, grant products, advisors, and archival materials in a single search.||So I can trace relevant research and teaching activities.|
What else do we have that is EAD-related?
#14 would also rely upon decomposition of EAD – sw
-how different from a faculty profiler?
|34||Finding works related to course instructor||As a student||I'd like to be able to find all the works authored by, composed by, created by, or performed by any faculty member who has ever taught the Jazz Piano/B&W Photography/Colonial Women's History course that I'm currently taking||So that I can better understand the nature of the course and various approaches to it||no cluster|
-part of faculty profiler?
|35||Finding selected or highlighted works||As a researcher||I would like to find works in a particular subject area (e.g., Civil War photographs) that have been individually selected and curated as part of a public exhibit by a museum, library, or archive||So I can find works that are likely to particularly exemplify some aspect of the subject area||1.1 bib+curation||S5 H5 Great if we have the data. (Expand to include data from virtual collections and lists)|
|37||Tracing archival relationships||As a researcher||Specifying an individual in an archival collection, I would like to identify all the individuals with whom they have corresponded and find out where, in turn, those individuals have archival collections||So that I can trace the potential impact an individual has on the people with whom he/she interacts||no cluster|
|38||Identifying related works||As a researcher||I would like to find all the costume photographs and illustrations for the plays of George Bernard Shaw||So I can see how the characters have been interpreted and visually represented across time|
I like this one as it uses relatedness along different axes combined with type taxa and entity – sw
4.1 deeper graph
+ external data
|39||Identifying publications related to equipment or facilities||As a facility or lab director||I would like to find publications describing research that made use of my equipment or facilities||So that I can justify existing funding or advocate for additional resources|
Where is the data linking publications to equipment or facilities? – sw
|4.4 deeper graph|
|40||Identifying publications related to datasets involving research resources||As a researcher||Lacking a direct dataset citation, I would like to find candidate publications that may have used a particular dataset that involved certain resources or equipment||So that I can track the propagation of errors related to equipment calibration or otherwise determine the impact of particular questionable resources|
How do we infer dataset use where there is no citation? – sw
|41||Exploring the contemporary context of an historical source||As a researcher||I am exploring an artwork or text situated in a particular historical epoch and want to find related materials for a specified period leading up to that time, so that I can see the context of that work. E.g., For the twenty years before GB Shaw's Arms and the Man, what was being reported in England about the Serbo-Bulgarian war?||So that I can further my contextual understanding based on available resources||This could draw upon external data sets. E.g., NYTimes archive||4.3 deeper graph|
|42||Topical intersections of related authors||As a researcher||I would like to see the sets of topics worked on by authors within a particular field of study. E.g., assemble the authors who have written about GB Shaw and show me the other topics each has written about, so I can see over time the change in the domains within which GBS has been considered. (Maybe I'll discover that initially he was treated mainly by people writing about the arts, but during WWI he was taken up by political writers.)||3.2 leveraging authorities|
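As a rough illustration of how a use case like #42 might be expressed as a graph query, here is a minimal sketch in plain Python over a toy set of (work, predicate, value) triples. The predicate names (`author`, `subject`), the identifiers (`work1`, `authorA`, etc.), and the data are all hypothetical placeholders, not part of any project data model. A real implementation would presumably run a comparable pattern as a structured query against the aggregated linked data.

```python
from collections import defaultdict

# Toy triples (work, predicate, value). All identifiers are hypothetical.
TRIPLES = [
    ("work1", "author", "authorA"),
    ("work1", "subject", "GB Shaw"),
    ("work2", "author", "authorA"),
    ("work2", "subject", "Theatre"),
    ("work3", "author", "authorB"),
    ("work3", "subject", "GB Shaw"),
    ("work4", "author", "authorB"),
    ("work4", "subject", "Politics"),
    ("work5", "author", "authorC"),
    ("work5", "subject", "Politics"),
]


def other_topics(triples, focus_subject):
    """Map each author who wrote about focus_subject to their other subjects."""
    authors_of = defaultdict(set)   # work -> authors
    subjects_of = defaultdict(set)  # work -> subjects
    for work, predicate, value in triples:
        if predicate == "author":
            authors_of[work].add(value)
        elif predicate == "subject":
            subjects_of[work].add(value)

    # Step 1: authors with at least one work about the focus subject.
    focus_authors = set()
    for work, subjects in subjects_of.items():
        if focus_subject in subjects:
            focus_authors |= authors_of[work]

    # Step 2: every other subject those authors have written about.
    result = {author: set() for author in focus_authors}
    for work, authors in authors_of.items():
        for author in authors & focus_authors:
            result[author] |= subjects_of[work] - {focus_subject}
    return result


topics = other_topics(TRIPLES, "GB Shaw")
for author in sorted(topics):
    print(author, sorted(topics[author]))
```

The sketch covers only the "authors who wrote about X, and what else they wrote about" pattern. Showing change over time, as the use case asks, would additionally need a publication date on each work, which this toy data omits.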