January 15, 2016
The Linked Data for Libraries: LD4L Labs proposal is a collaboration of Cornell, Harvard, Iowa, and Stanford to continue to advance the use and usefulness of linked data in libraries. Project team members will create and assemble tools, ontologies, services, and approaches that use linked data to improve the discovery, use, and understanding of scholarly information resources. The goal is to pilot tools and services and to create solutions that can be implemented in production at research libraries within the next three to five years. The project team will develop tools and provide direct support for projects within the related LD4P Program as described in a grant proposal from Stanford University Libraries. This proposal seeks $1,500,000 from the Andrew W. Mellon Foundation for the period from April 1, 2016 through March 31, 2018 to support this work.
The LD4L Labs project is focused on developing new tools and approaches that will make it easier for libraries to create and use Linked Open Data that describes their scholarly information resources. The project will develop and support tools for linked data creation and editing, the bulk conversion of existing metadata to linked data, and a common system to support initial work in entity resolution and reconciliation. The project will also explore strategies to use linked data relationships and analysis of the Linked Open Data graph to directly improve discovery and understanding of relevant scholarly information resources. Finally, the project will provide feedback to the library ontology community about the use of BIBFRAME and other relevant ontologies within the tools being developed and in support of discovery and understanding.
In contrast to LD4L Labs, the scope of the Stanford-led LD4P project is the establishment of the foundations of the transition of library technical services operations to ones based in linked open data. As a whole, the six institutions will focus on four main areas of development. First will be the establishment of the ability to create linked open data communally. Second, in collaboration with external standards organizations such as the Program for Cooperative Cataloging and linked data projects such as BIBFLOW, will be the establishment of common procedures and protocols for the creation of library metadata as linked data. Third will be the expansion of the BIBFRAME ontology to better encompass subject domains such as art and music. And last will be the transition of a selection of current library workflows to ones based in linked open data. The projects will make use of a collection of preliminary tools (those developed by LC, Vitro, and eagle-i) and adapt them for production work in their individual environments. The feedback from the use of these tools in a production environment will allow their developers to further refine their tools to meet practical demands. As LD4L Labs enhances the suite of tools, the LD4P partners can take advantage of the enhancements in the transition of their workflows.
The projects propose close collaboration, with frequent joint meetings and two shared staff positions that will devote different parts of their effort to the two grants. In addition, tools being developed by LD4L Labs will be directly used in the metadata production work of some of the LD4P partners. Conversely, the LD4P linked data and use cases will also directly inform the development of LD4L tools. Details of these collaborations are discussed in the individual sections below.
In the initial Mellon-funded Linked Data for Libraries project[1], the LD4L team gathered data, assembled an ontology, and built some of the basic infrastructure to share linked data about scholarly information resources, such as traditional monograph and journal publications, archival materials, research datasets, images, recordings, cultural artifacts, newspapers and magazines, web archives, and much more. This infrastructure included the creation of a shared processing pipeline that converts existing MARC[2] catalog records at the three partner institutions to linked data, together with pre- and post-processing steps to make this linked data more useful and uniform across the partners.
The linked data created for these scholarly information resources included bibliographic data, curation data (e.g., metadata related to the organization and annotation of the resources), data expressing how the resources relate to people and organizations, and usage data. While bibliographic data is typically directly specified by catalogers, curation data is more commonly the byproduct of a scholarly activity. Curation would happen, for example, when a resource is included by a museum or archive as part of a focused exhibit; when a resource is added to a course reading list by a professor; or when resources show up together on a focused bibliography. The data that is captured as a byproduct of these activities is typically represented as a virtual collection or annotation and will be referred to within this proposal as “curation data”.
The project developed a set of use cases[3] focused on creating relationships from those resources to real world entities (represented as linked data URIs) that provided context for, and enhanced understanding of, those resources. Based on those use cases, the project team assembled an ontology[4], drawing together elements from a variety of existing ontologies, including the Library of Congress (LC) BIBFRAME[5], VIVO-ISF[6], Open Annotation[7], OAI-ORE[8], and several others. In the case of BIBFRAME, the project made a number of modifications to the original proposed ontology and provided those changes and the rationale behind them to LC in a report authored by Rob Sanderson with contributions from other LD4L team members. LC has indicated that most of these recommendations will be included in the next revision of BIBFRAME.
In February 2015, the project held an LD4L Workshop[9], assembling fifty participants to provide feedback on the use cases, ontology, and planned demonstration implementations. That workshop provided immediate guidance to the project and also made significant suggestions for follow-on work, many of which are reflected in this proposal. Results from the workshop and from the overall project have been publicly presented in a number of forums[10], including the DLF Forum in Vancouver, BC in October 2015.
The project is finalizing the process to make available Linked Open Data[11] representing approximately 23 million cataloged scholarly information resources from Cornell, Harvard, and Stanford using the LD4L ontology. This public data will be available in 2016 through the ld4l.org website. The initial release will reconcile works based on common OCLC Work IDs[12] and people based on common VIAF (Virtual International Authority File) identifiers[13] across that linked data. It will also provide Stackscore[14]-based usage data on the resources from Cornell and Harvard, where a Stackscore is simply a number from 1 to 100 that represents how relevant an item is to the library’s patrons as measured by how they’ve used it. The number can reflect data from circulation, number of copies held, browse counts, or other library-specific usage information.
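As a rough illustration of what reconciling on a shared identifier involves, the sketch below groups record URIs that carry the same external identifier. It is a minimal example under stated assumptions, not the project's actual pipeline: the record layout, the `oclc_work_id` field name, the `reconcile` function, and the example.org URIs are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample records; the URIs and identifier values are made up
# for illustration and do not come from the LD4L data.
RECORDS = [
    {"uri": "http://example.org/cornell/work/1", "oclc_work_id": "w100"},
    {"uri": "http://example.org/harvard/work/7", "oclc_work_id": "w100"},
    {"uri": "http://example.org/stanford/work/3", "oclc_work_id": "w200"},
    {"uri": "http://example.org/stanford/work/9", "oclc_work_id": None},
]


def reconcile(records, id_field):
    """Group record URIs that share the same external identifier."""
    groups = defaultdict(list)
    for record in records:
        identifier = record.get(id_field)
        if identifier:  # records lacking the identifier cannot be matched this way
            groups[identifier].append(record["uri"])
    # Keep only identifiers shared by at least two records.
    return {ident: uris for ident, uris in groups.items() if len(uris) > 1}


matches = reconcile(RECORDS, "oclc_work_id")
```

The same function could be applied to people by passing a different, equally hypothetical field name holding a VIAF identifier. Records without a shared identifier are left unmatched, which is one reason the proposal treats this as only an initial form of reconciliation.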
The use of standardized and reconciled linked data representations for scholarly information resources across all the partner institutions allows these resources to be treated as one comprehensive collection. Having such a common collection means that tools, relationships with and analysis of the linked data graph, and new linked data sources can and will build on the scholarly information resources from all the partner institutions, not just on resources from a single institution. This approach also means that the tools and approaches being developed will be reusable, and allow inclusion of resources from any institution that makes information available as compatible Linked Open Data. Through linked data, alignment to shared identifiers by any institution also means that references to external entities, including global identifiers for people, organizations, concepts, events, places, and other types, will enrich much more than that single institution's data.
The first Linked Data for Libraries project, funded by the Andrew W. Mellon Foundation in January 2014, presented an extensive discussion of the importance of linked data and semantic web technologies for making scholarly information resources at academic libraries more discoverable, understandable, and usable[15]. That discussion will not be repeated here, but it is important to note that since that proposal was funded, there has been growing interest in and use of linked data by the library, archives, and museum community, as reflected in many workshops, conference presentations, and publications.
Cornell University Library (CUL) is itself pursuing several other linked data efforts in addition to the work proposed here. In particular, CUL, in partnership with the Library of Congress, OCLC, the Program for Cooperative Cataloging, and Harvard University Library, has requested funds from IMLS to hold a national forum on issues concerning local authorities in library metadata, with a focus on name identities. CUL is also part of the NSF-funded EarthCollab project[16], a partnership with the National Center for Atmospheric Research and UNAVCO to use semantic web technologies to better describe the interconnected network of research datasets, publications, researchers, experiments, and research instrumentation. Finally, CUL has long been part of the VIVO project[17], focused on using linked data to describe the full academic context (e.g., publications, teaching, grants, departments and programs, and research projects) for researchers and scholars.
CUL has also recently committed to adopting the Open Library Environment (OLE) as its integrated library system. CUL staff will work with LD4L Labs team members to evaluate the potential to integrate the tools and approaches described in the Project Description section below with the OLE environment. In doing so, CUL will seek to collaborate with and build on the work of UC Davis on BIBFLOW[18]. BIBFLOW is an IMLS-funded project focused on “a research agenda and a set of activities” to help the community understand its resource landscape and develop a roadmap for the coming years to move its technical services workflows into ones based in linked data.
The original LD4L project has led to two separate but closely related follow-on proposals. This proposal focuses on building on and extending the research, tool development, experimental pilots, and infrastructure work of the prior LD4L project. The goal of the LD4L Labs work is to develop solutions in the specific areas described in the Project Description section below that could be piloted within the term of the grant and implemented on a production basis in research library environments within the next 3-5 years.
The goal of the companion Linked Data for Production (LD4P) project being submitted by Stanford University Libraries is:
“to begin the transition of technical services production workflows to ones based in Linked Open Data (LOD). This first phase of the transition focuses on the development of the ability to produce metadata as LOD communally, the enhancement of the BIBFRAME ontology to encompass the multiple resource formats that academic libraries must process, and the engagement of the broader academic library community to ensure a sustainable and extensible environment.”
While the LD4P and LD4L Labs projects have separate agendas and goals, there is a great deal of synergy between the two efforts. In particular, several of the LD4P efforts make use of tools being developed and supported as part of LD4L Labs, and the linked data created through the LD4P projects will be used by several other efforts within LD4L Labs focused on improving discovery, use, and understanding of scholarly information resources.
The participants invited to the LD4L Workshop in February 2015 made a number of recommendations on how to advance the use and usefulness of linked data in the library community. They also identified some specific challenges that the library community must address if it is to successfully move to a linked data infrastructure. Here is a summary of some of the recommendations and related challenges:
The LD4L Labs proposal seeks to make progress in addressing both the recommendations and the challenges identified in the LD4L Workshop, as well as those identified by the project itself and the broader library community. The goal is both to advance the overall linked data agenda and to provide a set of specific tools, solutions, and approaches that can provide production-level value to libraries within three to five years and demonstrate the value of linked data to solve real problems for research libraries, scholars, and students.
[12] https://www.oclc.org/developer/develop/linked-data/worldcat-entities/worldcat-work-entity.en.html
[15] “Why Linked Data?” at https://www.ld4l.org/linked-data