Archived
LD4L 2014, which was the Linked Data for Libraries original grant running from 2014-2016, has been completed. This page is part of the archive for that grant.
The Semantic Web is a term coined by Sir Tim Berners-Lee in a seminal article[1] published in Scientific American in 2001. Berners-Lee articulated a vision of a World Wide Web of data that machines could process independently of humans, enabling a host of new services transforming our everyday lives. While the paper’s vision of most web pages containing structured data that could be analyzed and acted upon by software agents has not materialized, the Semantic Web has emerged as a platform of increasing importance for data interchange and integration through the growing community implementing data sharing using international semantic web standards[2], called Linked Data[3].
There are many current examples of the use of semantic web technologies and Linked Data to share valuable structured information in a flexible and extensible manner across the web. Semantic Web technologies are used extensively in the life sciences to facilitate drug discovery by finding paths across multiple datasets showing associations between drugs and side effects via genes linked to each.[4] The New York Times has published its vocabulary of approximately 10,000 subject headings developed over 150 years as Linked Data and will expand coverage to approximately 30,000 topic tags; they encourage the development of services consuming these vocabularies and linking them with other online resources.[5] The British Broadcasting Corporation uses Linked Data to make content more findable by search engines and more linkable through social media; to add additional context from supplemental resources in domains like music or sports; and to propagate linkages and editorial annotations beyond their original entry target to bring relevant information forward in additional contexts.[6] The home page of the United States data.gov site states, “As the web of linked documents evolves to include the Web of linked data, we’re working to maximize the potential of Semantic Web technologies to realize the promise of Linked Open Government Data.”[7]
Linked Data is a publishing paradigm for making data, and not just human-readable documents, fully accessible and inter-linkable anywhere on the Internet. Linked Data uses the same common Web communications protocols as ordinary browser software to connect machine-readable data across distributed computers. In 2006, with an update in 2010, Berners-Lee described a 5-star rating system for published data to be considered Linked Data[8]:
While Linked Data can be used internally within an institution or across a collaborative group, it becomes much more valuable when it is Linked Open Data, and is publicly shared using an open license such as the Creative Commons CC-BY[10] or CC0[11] licenses, or the United Kingdom’s Open Government License[12]. For our Linked Data for Libraries project, our intention is that all SRSIS instances will share Linked Open Data with the world.
Linked Data conforms to a common data format known as the Resource Description Framework, or RDF[13]. Each unit of Linked Data expressed in RDF has a subject, predicate, and object comparable to the simple sentence structure of human language. All subjects, predicates, and objects (other than simple data values) are encoded or represented as uniform resource identifiers, or URIs[14], intended to be resolvable as uniform resource locators (URLs)[15] on the Internet. Just as typing the URL of an ordinary web page in a browser should produce an HTML (Hyper-Text Markup Language) document in response, a Linked Data request should trigger a response in the form of one or many RDF statements, known informally as triples. Triples may be served from static data files or generated on the fly from data stored as XML or in a relational database; a native semantic web application persists its data in a type of database optimized for storing and querying semantic triples, known as a triplestore.[16]
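The subject, predicate, and object structure described above can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular triplestore or RDF library represents data: the class names `URI`, `Literal`, and `Triple` and the function `to_ntriples` are hypothetical, and pairing the VIAF URI with the FOAF `name` property is an assumption made for the example. The output line follows the N-Triples syntax, one of the standard RDF serializations.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    """A resource identified by a URI (hypothetical helper class)."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A simple data value, such as a name (hypothetical helper class)."""
    value: str


@dataclass(frozen=True)
class Triple:
    """One RDF statement: subject, predicate, object."""
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


def _term(term: Union[URI, Literal]) -> str:
    """Render one term: URIs in angle brackets, literals in double quotes."""
    if isinstance(term, URI):
        return f"<{term.value}>"
    escaped = (
        term.value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_ntriples(triple: Triple) -> str:
    """Serialize a single triple as one line of N-Triples."""
    return f"{_term(triple.subject)} {_term(triple.predicate)} {_term(triple.obj)} ."


# Assumed example statement: "the resource at this VIAF URI has the name Mark Twain".
example = Triple(
    subject=URI("http://viaf.org/viaf/50566653/"),
    predicate=URI("http://xmlns.com/foaf/0.1/name"),
    obj=Literal("Mark Twain"),
)

print(to_ntriples(example))
# <http://viaf.org/viaf/50566653/> <http://xmlns.com/foaf/0.1/name> "Mark Twain" .
```

Only the object position accepts a literal here, which mirrors the statement above that subjects and predicates are always identified by URIs while objects may be simple data values.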
Understanding what these data refer to requires another key component of the Semantic Web – a way to encode meaning. An ontology declares a set of defined types and relationships that are referenced within Linked Data to express what is being referred to and the nature of the relationships involved. Ontologies are shared as broadly as possible to reduce the friction and overhead of interpretation. The Friend of a Friend ontology[17], for example, declares a type foaf:Person that is universally recognized in Linked Data as a reference to a human being. Ontologies are not limited to the simple hierarchical classification structures of a taxonomy or partonomy or to the small set of broader/narrower/related relationships typical among terms in a thesaurus; we can express, for example, that one person knows another person or a work of fiction has as primary source a historical document.
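One way to see how an ontology adds meaning is to let software draw conclusions from its declarations. The sketch below hand-codes a single fragment of the FOAF ontology, namely that the property foaf:knows relates a foaf:Person to a foaf:Person, and uses it to infer the types of the resources involved, in the style of RDF Schema domain and range reasoning. The `ONTOLOGY` dictionary, the `infer_types` function, and the example.org URIs are assumptions for illustration; a real application would load the published ontology rather than transcribe it.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hand-written fragment of an ontology (an assumption for this sketch):
# each property maps to the class of its subjects (domain) and the
# class of its objects (range).
ONTOLOGY = {
    FOAF + "knows": {"domain": FOAF + "Person", "range": FOAF + "Person"},
}


def infer_types(triples):
    """Return the rdf:type triples implied by domain/range declarations.

    Triples already present in the input are not repeated in the result.
    """
    known = set(triples)
    inferred = set()
    for subject, predicate, obj in known:
        declaration = ONTOLOGY.get(predicate)
        if declaration is None:
            continue  # the ontology fragment says nothing about this property
        inferred.add((subject, RDF_TYPE, declaration["domain"]))
        inferred.add((obj, RDF_TYPE, declaration["range"]))
    return inferred - known


# Hypothetical data: one statement that Alice knows Bob.
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"
data = [(ALICE, FOAF + "knows", BOB)]

for triple in sorted(infer_types(data)):
    print(triple)
```

Nothing in the input says that Alice or Bob is a person; that conclusion comes entirely from the shared ontology, which is why publishing ontologies broadly reduces the overhead of interpretation.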
Software applications consume Linked Data from multiple sources – where that Linked Data has been structured to describe known people, places, organizations, events, and associated concepts and relationships – as the foundation for new, dynamic, information-rich services. These services can request Linked Data via the Hypertext Transfer Protocol (HTTP)[18] without knowing anything about the internal storage schema or application programming interface (API)[19] of a remote data source, and the response arrives via HTTP and conforms to a standard data format, RDF – a stark contrast to data silos accessible only to those with the keys and a detailed map to their individualized contents or APIs.
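A request of this kind can be sketched with nothing more than Python's standard library. The client asks for an RDF serialization through the HTTP `Accept` header and needs no knowledge of the server's internals. The function names and the example.org URI below are hypothetical, and whether a given server honors the header and which RDF serializations it offers are assumptions that must be checked per data source.

```python
import urllib.request
from typing import Tuple

# Media types for two common RDF serializations, Turtle preferred.
RDF_MEDIA_TYPES = "text/turtle, application/rdf+xml;q=0.9"


def build_linked_data_request(uri: str) -> urllib.request.Request:
    """Build a GET request that asks the server for RDF rather than HTML."""
    return urllib.request.Request(uri, headers={"Accept": RDF_MEDIA_TYPES})


def fetch_linked_data(uri: str, timeout: float = 10.0) -> Tuple[str, str]:
    """Dereference a URI and return (content type, response body).

    This performs a real network call, so it is defined here but not
    invoked by the sketch itself.
    """
    request = build_linked_data_request(uri)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        content_type = response.headers.get("Content-Type", "")
        return content_type, response.read().decode(charset)


# Hypothetical resource URI, used only to show the request that would be sent.
request = build_linked_data_request("http://example.org/people/alice")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

The same two functions would work against any Linked Data source that supports this style of content negotiation, which is the point of the contrast with source-specific APIs.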
In addition to breaking down silos, Linked Data has also, through its fundamental dependence on ontologies, charted new ground in practices for data description, or metadata. Changes to metadata practice driven by the adoption of Linked Data can best be summarized as making once implicit statements explicit. Declaring the subject of every metadata statement with a URI as its identifier and using defined types and properties (also specified by URIs) for expressing the content of metadata in RDF eliminates much of the ambiguity in what is being referred to, in where the intended meaning has been defined, and in how the information referenced can be directly accessed. The ability to support explicit references also encourages the practice of converting “strings to things.” A string is just a sequence of characters to interpret or match as best one can to other strings; a Linked Data URI is a reference to any amount of structured information with the potential to guide interpretation or feed automated processes. For example, compare the information content of “Twain, M.” to the structured information in the VIAF Linked Data record for Mark Twain associated with the URI http://viaf.org/viaf/50566653/.
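The move from strings to things can be illustrated with a small lookup that replaces variant name strings with a single URI. The index below is a hypothetical, hand-built table and the normalization rule is an assumption; in practice the mapping would come from an authority service, and a short form such as “Twain, M.” might need additional context before it could be matched safely.

```python
from typing import Optional

VIAF_MARK_TWAIN = "http://viaf.org/viaf/50566653/"

# Hypothetical index from normalized name strings to the URI of the "thing".
NAME_INDEX = {
    "twain, m.": VIAF_MARK_TWAIN,
    "twain, mark": VIAF_MARK_TWAIN,
    "mark twain": VIAF_MARK_TWAIN,
}


def normalize(name: str) -> str:
    """Lowercase the string and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def string_to_thing(name: str) -> Optional[str]:
    """Return the URI for a name string, or None if the string is unknown."""
    return NAME_INDEX.get(normalize(name))


for label in ["Twain, M.", "  Mark   Twain ", "Clemens, S."]:
    print(repr(label), "->", string_to_thing(label))
```

Once several strings resolve to the same URI, any structured information published about that URI becomes available to every record that used one of the strings.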
Of course, simply being able to link, request, and assemble data more easily does not by itself produce new insights or guarantee utility. Many challenges remain in discovering the existence of data sources; in interpreting the meanings encoded in the ontologies used for expressing the data and creating mappings among ontologies; and in resolving multiple identifiers that may or may not refer to the same person, place, organization, or thing. Tools are maturing,[20] scalability is improving,[21] and services are developing to resolve co-references to different datasets,[22] but the scale of the Internet is vast.
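Resolving co-references can be pictured as grouping identifiers that have been asserted to denote the same thing. The sketch below does this with a union-find structure over pairs of equivalent URIs, the kind of link often expressed in Linked Data with the owl:sameAs property. It is an illustration under assumptions, not a description of any existing co-reference service: the function name is hypothetical, the example.org and example.net identifiers are invented, and the hard part in practice is deciding which equivalence assertions to trust.

```python
def coreference_clusters(same_as_pairs):
    """Group identifiers into clusters connected by equivalence links."""
    parent = {}

    def find(node):
        """Return the representative of node's cluster (with path halving)."""
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Merge the clusters of every pair asserted to be equivalent.
    for left, right in same_as_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    # Collect the members of each cluster under its representative.
    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())


# Assumed equivalence links; only the VIAF URI comes from the text above.
links = [
    ("http://viaf.org/viaf/50566653/", "http://example.org/authors/twain"),
    ("http://example.org/authors/twain", "http://example.net/id/mark-twain"),
    ("http://example.org/places/hannibal", "http://example.net/id/hannibal-mo"),
]

for cluster in coreference_clusters(links):
    print(sorted(cluster))
```

Because equivalence is treated as transitive, the first and third identifiers end up in the same cluster even though no link connects them directly; a single wrong link would therefore merge two unrelated clusters, which is one reason co-reference resolution remains a challenge.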
For this project we will begin with a constrained set of institutions and highly structured library catalog records linked to authority records and Library of Congress Subject Headings (LCSH)[23]. As other sources are brought into the mix, we can expect higher rates of uncertainty in identifying authors, keywords, place names, and terms from multiple vocabularies. Our premise is not that we will achieve perfect alignment; the goal is to make large bodies of information more discoverable and interoperable than they have been, as a significant step forward benefitting users well beyond the bounds of our three institutions. We expect problems as well as successes, and will work throughout to clarify the remaining challenges as the blueprint for a research agenda looking farther into the future.
Libraries are a natural home for serving and consuming Linked Data and building innovative new services. This proposal envisions a set of software tools, ontologies, and user-facing services capable of representing, discovering, and integrating human knowledge currently outside the confines of traditional library catalogs, web pages, and online information services.
[4] D.J. Wild, et al., Systems chemical biology and the Semantic Web: what they mean for the future of drug discovery research, Drug Discov Today (2012), doi:10.1016/j.drudis.2011.12.019