entity resolution (strings to things)

UNEDITED NOTES

DPLA: placename resolution

Entity recognition

use entire record as context for resolution
points vs. shapes in geo entity resolution
crowdsourcing opportunity?
OCLC - several passes through data, information from multiple sources (ISNI, VIAF, etc.)
need public feedback for last 20%
refine algorithms based on crowdsourcing feedback
machine transformation and confidence rating – mark output as machine-generated, with date
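
A minimal sketch of the last point above: every machine-made match carries a confidence score, a machine-generated flag, and a date. The authority labels and example.org URIs are placeholders, and difflib string similarity is only a stand-in for whatever matching a real pipeline would use.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Resolution:
    source_string: str
    uri: Optional[str]        # None when no candidate clears the threshold
    confidence: float         # 0.0 to 1.0, here plain string similarity
    machine_generated: bool   # always True for output of resolve()
    generated_on: date


# Hypothetical authority list; labels and URIs are placeholders.
AUTHORITY = {
    "Boston (Mass.)": "http://example.org/place/1",
    "Boston (England)": "http://example.org/place/2",
}


def resolve(source_string, today, threshold=0.6):
    """Pick the most similar authority label and record provenance."""
    best_label, best_score = None, 0.0
    for label in AUTHORITY:
        score = SequenceMatcher(None, source_string.lower(), label.lower()).ratio()
        if score > best_score:
            best_label, best_score = label, score
    accepted = best_label is not None and best_score >= threshold
    uri = AUTHORITY[best_label] if accepted else None
    return Resolution(source_string, uri, round(best_score, 2), True, today)
```

Low-confidence or unresolved results (uri is None) are the natural candidates to route to manual or crowdsourced review.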

strings --> things

libraries divide and conquer entity cataloging

post-processing tools

accuracy tools

entity extraction

how to motivate users to take tools/data for a spin?

what if we had no metadata and started only with full text?

challenges

parsing MARC to find translators and role
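
A rough sketch of the easy case, assuming the record has already been parsed into (tag, subfields) pairs; that in-memory shape and the function name are hypothetical. It relies only on 700 added entries carrying a relator term in $e or the MARC relator code "trl" in $4, and it does not handle translators named only in free text such as the 245 $c statement of responsibility, which is where the real difficulty lies.

```python
def find_translators(fields):
    """Return names from 700 fields whose relator marks them as translator.

    fields: list of (tag, [(subfield_code, value), ...]) pairs.
    """
    names = []
    for tag, subfields in fields:
        if tag != "700":
            continue
        # $4 holds a relator code, $e a relator term (often with punctuation).
        codes = [v.strip().lower() for c, v in subfields if c == "4"]
        terms = [v.strip(" .,").lower() for c, v in subfields if c == "e"]
        if "trl" in codes or "translator" in terms:
            name = next((v for c, v in subfields if c == "a"), None)
            if name:
                names.append(name.rstrip(" ,"))
    return names


# Invented sample data, for illustration only.
sample = [
    ("100", [("a", "Author, Ann,"), ("e", "author.")]),
    ("700", [("a", "Doe, Jane,"), ("e", "translator.")]),
    ("700", [("a", "Roe, Richard,"), ("4", "trl")]),
    ("700", [("a", "Poe, Pat,"), ("e", "editor.")]),
]
print(find_translators(sample))
```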

person reconciliation

crowdsourcing

music parsing

image identity

UCSD – mix of auto & manual review

CERL – name, spelling & disambiguation

HBS – URIs provided by authority vendor

Create local auth record/URI for strings with no auth?

Feed into LC or OCLC for needed authorities?

Improve cataloging tools with type-ahead entity resolution
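
One way type-ahead might work, sketched as a prefix lookup over a sorted list of authority labels. The class name, entries, and example.org URIs are all hypothetical; a production tool would more likely query an authority service than hold labels in memory.

```python
import bisect


class TypeAhead:
    """Prefix search over (label, uri) pairs, case-insensitive."""

    def __init__(self, entries):
        # Sort by lowercased label so all matches for a prefix are contiguous.
        self._items = sorted((label.lower(), label, uri) for label, uri in entries)
        self._keys = [key for key, _, _ in self._items]

    def suggest(self, prefix, limit=5):
        p = prefix.lower()
        start = bisect.bisect_left(self._keys, p)  # first key >= prefix
        out = []
        for key, label, uri in self._items[start:]:
            if not key.startswith(p):
                break
            out.append((label, uri))
            if len(out) == limit:
                break
        return out


# Placeholder entries, for illustration only.
index = TypeAhead([
    ("Boston (Mass.)", "http://example.org/place/1"),
    ("Boston (England)", "http://example.org/place/2"),
    ("Bologna (Italy)", "http://example.org/place/3"),
])
print(index.suggest("bos"))
```

The cataloger picks a suggestion and the tool stores the URI rather than the typed string, so the entity is resolved at entry time instead of in post-processing.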
