
About the Stanford Tracer Bullets

Stanford’s linked data production project focuses on technical services workflows. For each of four key production pathways we will examine each step in the workflow, from acquisition to discovery, to determine how best to transition to a linked data production environment. Our emphasis is on following each workflow from start to finish to show an end-to-end linked data production process, and to highlight areas for future work. The four pathways are: copy cataloging through the Acquisitions Department, original cataloging, deposit of a single item into the Stanford Digital Repository, and deposit of a collection of resources into the Stanford Digital Repository.

Stanford Project Proposal

Deliverables

Overview Presentations

Tracer Bullet 1: Conversion of vendor-supplied MARC records to BIBFRAME

Existing MARC workflows

Overview of existing MARC workflows

Firm orders

Approval orders

Receiving

Automatic record enhancements

MARC-to-BIBFRAME conversion workflows

Pilot workflow: parallel processing in MARC and BIBFRAME, MARC record primary

Implementation workflow: operational record in MARC, discovery data in BIBFRAME, BIBFRAME data primary

Conversion workflows by process
 

Dataflow for conversion workflow

Reactive Pipeline for Larger Scale Conversion of MARC Records

Supporting Material

Tracer Bullet 2: Original Cataloging in BIBFRAME

See overview presentations above for screenshots.


  • Workflow for original cataloging

Tracer Bullet 3: Deposit item to digital repository with RDF metadata; deposit by item creator

See overview presentations above for screenshots.

Existing deposit workflows


Metadata created in self-deposit interface (non-SUL users)
 
This diagram covers metadata only (not file management or other administrative tasks) for objects described entirely through the self-deposit tool Hydrus. The boxes under "Self-deposit interface" represent the metadata-related tasks a user can perform through that interface. The left column of boxes represents metadata tasks contained within the self-deposit tool; the right column represents tasks that write data to DOR. Except where otherwise specified, these tasks apply to the description of both collections and items. Currently this model is more commonly used for deposits originating from non-SUL users.

Metadata created in Symphony/MARC (SUL users)
 
This diagram covers metadata only (not file management or other administrative tasks) for objects that are accessioned through the self-deposit tool Hydrus by SUL staff and cataloged in MARC by the Metadata Development Unit (MDU). The boxes under "Self-deposit interface" represent the metadata-related tasks an internal user performs through that interface in this workflow. Currently this model is commonly used for digital files received by Acquisitions or acquired by curators. The self-deposit tool serves as a convenient way to get a file into the SDR, notify MDU staff that cataloging is needed, and pass information such as the catkey and PURL for the object to the MDU. New items are deposited to existing collections that have already been set up with the appropriate rights, permissions, etc.

RDF-based Workflow


The tracer bullet focuses on the metadata flow rather than the file management portion of this scenario, because the upcoming adoption of Hyku will have a significant impact on file management. For the purposes of the tracer bullet, we are working with digital objects already deposited into the SDR and described via Hydrus. The "depositor" (actually metadata staff) will describe the objects in CEDAR based on the records in SearchWorks, but will reformulate the metadata in CEDAR independently rather than doing a simple mapping from MODS. One possible step is to generate an operational MODS record for SDR use from the RDF description, but for the purposes of the tracer bullet this operational record will not actually be written to the repository.


Tracer Bullet 4: Deposit set of items to digital repository with RDF metadata

See overview presentations above for screenshots.

Tracer Bullet 2: Original cataloging in BIBFRAME

  • flowchart(s)

Tracer Bullet 3: Deposit item to digital repository with RDF metadata; deposit by item creator

  • tool: CEDAR to triple store
  • CEDAR templates
  • 3 figures

Tracer Bullet 4: Deposit set of items to digital repository with RDF metadata

  • diagram
  • sample data and mappings?

Modeling authority data

  • Tool recommendations for conversion and original creation of linked data
  • Best practices for pre- and post-conversion enhancements
  • Linked data descriptions of set of Stanford library and digital repository resources


Team

Project Co-Managers

  • Philip Schreur, Assistant University Librarian for Technical and Access Services
  • Tom Cramer, Assistant University Librarian, Chief Technology Strategist, and Director of Digital Library Systems and Services

Acquisitions Department

  • Alexis Manheim, Head of Acquisitions Department
  • Linh Chang, Receiving and Access Librarian

Metadata Department

  • Nancy Lorimer, Head of Metadata Department
  • Joanna Dyla, Head of Metadata Development Unit
  • Vitus Tang, Head of Data Control and E-resources Unit
  • Arcadia Falcone, Metadata Coordinator

Digital Library Systems and Services

  • Darsi Rueda, Head of Library Systems Department
  • Naomi Dushay, Digital Library Software Engineer
  • Joshua Greben, Library Systems Programmer / Analyst
  • Darren Weber, Digital Library Software Engineer

Completed Work

Analysis/Modeling

  • Mapped Stanford's vendor-supplied copy cataloging and original cataloging workflows
  • Mapped workflow for converting vendor-supplied records to linked data
  • Generated requirements for work-based discovery environment, to take advantage of RDF
  • Evaluated BIBFRAME profiles for original cataloging

Linked Data Creation

  • Worked with vendor on improvements to supplied MARC data to enhance conversion to BIBFRAME
  • Tracer Bullet 1: Converted set of 38,000 MARC records from Symphony to BIBFRAME using Library of Congress converter, loaded to Blazegraph triplestore, and indexed to Blacklight Solr environment via automated scripts
  • Tracer Bullet 2: Created original descriptions of 50 items with local instance of BIBFRAME 2.0 Editor
  • Tracer Bullet 3: Created original descriptions of about 30 digital assets using CEDAR RDF editor
  • Tracer Bullet 4:
  • Piloted automated pipeline approach for conversion of MARC records to BIBFRAME, loading to triplestore, and indexing to Solr

Discovery Environment Creation

  • Created Blacklight/Solr instance-based discovery environment with source data a mix of linked data and MARC data
  • Developed a mapping from BIBFRAME 2.0 to Solr document for book materials
  • Developed a mapping from RDF to Solr for digital assets
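
As a rough illustration of what a BIBFRAME-to-Solr mapping involves, the sketch below flattens the title of one BIBFRAME 2.0 Instance into a flat, Solr-style document. It is not the project's actual mapping: the function name `instance_to_solr_doc`, the Solr field names `title_display` and `work_uri_s`, and the sample URIs are all hypothetical, and the graph is represented as plain Python tuples rather than loaded from a triplestore. Only the BIBFRAME namespace and the `bf:title`, `bf:mainTitle`, and `bf:instanceOf` properties come from the BIBFRAME 2.0 ontology.

```python
from collections import defaultdict

# BIBFRAME 2.0 ontology namespace and the RDF type predicate.
BF = "http://id.loc.gov/ontologies/bibframe/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def instance_to_solr_doc(triples, instance_uri):
    """Flatten selected properties of one bf:Instance into a dict.

    `triples` is an iterable of (subject, predicate, object) strings.
    The output field names are hypothetical, not a real Solr schema.
    """
    # Index the graph by (subject, predicate) for simple lookups.
    index = defaultdict(list)
    for s, p, o in triples:
        index[(s, p)].append(o)

    doc = {"id": instance_uri}

    # bf:title points to a title node; bf:mainTitle carries the literal.
    titles = [
        literal
        for title_node in index[(instance_uri, BF + "title")]
        for literal in index[(title_node, BF + "mainTitle")]
    ]
    if titles:
        doc["title_display"] = titles[0]  # hypothetical field name

    # bf:instanceOf links the Instance to its Work.
    works = index[(instance_uri, BF + "instanceOf")]
    if works:
        doc["work_uri_s"] = works[0]  # hypothetical field name

    return doc


# Hypothetical sample graph for a single Instance.
sample_triples = [
    ("http://example.org/instance/1", RDF_TYPE, BF + "Instance"),
    ("http://example.org/instance/1", BF + "title", "_:t1"),
    ("_:t1", BF + "mainTitle", "Moby Dick"),
    ("http://example.org/instance/1", BF + "instanceOf",
     "http://example.org/work/1"),
]

sample_doc = instance_to_solr_doc(sample_triples, "http://example.org/instance/1")
```

The main design point is that BIBFRAME stores a title as a separate node linked from the Instance, so even a single display field requires following two properties before the value can be written to a flat index document.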

Tool Exploration / Requirements Definition

  • Gathered requirements for conversion and editing tools
  • Set up Registry of Tools
  • Evaluated CEDAR template creation and metadata editing tool
  • Developed a validation suite for MARC-to-RDF converters
  • Created local instance of Library of Congress BIBFRAME 2.0 Editor

April 2017 Update
