
Techniques and tools for name disambiguation and entity resolution

Tools already actively in use in the VIVO community

Harvester

URITool

Open Refine

 

Additional tools that may prove useful

Source: http://rawpatentdata.blogspot.com/2013/01/datamining-and-entity-resolutions-some.html

Name Cleaver

Name Cleaver (http://sunlightlabs.com/blog/2011/name-standardization-name-cleaver/) supports three major name types: politicians, individuals, and organizations. Each type has its own class with special features.
The OrganizationNameCleaver class has methods that reduce a name to its "kernel" and that expand all abbreviations known to Name Cleaver. Both are useful for matching tasks.
The Python source code can be downloaded here:  https://github.com/sunlightlabs/name-cleaver
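To make the "kernel" and abbreviation-expansion ideas concrete, here is a minimal standalone sketch in Python. It does not use Name Cleaver itself: the function names, the abbreviation table, and the noise-word list are all hypothetical and chosen only for illustration, so the library's actual behavior may differ.

```python
import re

# Hypothetical lookup tables for illustration only; not Name Cleaver's data.
ABBREVIATIONS = {
    "assn": "association",
    "natl": "national",
    "intl": "international",
    "univ": "university",
}
NOISE_WORDS = {"inc", "llc", "ltd", "corp", "co", "the", "of", "and"}


def tokens(name):
    """Lowercase the name and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", name.lower())


def expand(name):
    """Replace each known abbreviation with its long form."""
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens(name))


def kernel(name):
    """Expand abbreviations, then drop legal suffixes and filler words."""
    return " ".join(t for t in tokens(expand(name)) if t not in NOISE_WORDS)


short_form = kernel("Natl. Assn. of Widgets, Inc.")
long_form = kernel("The National Association of Widgets")
print(short_form == long_form)
```

Two spellings of the same organization reduce to the same kernel, which is what makes the reduced form usable as a matching key.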

OYSTER entity resolution

OYSTER (Open sYSTem Entity Resolution) is an entity resolution system that supports probabilistic direct matching, transitive linking, and asserted linking. To facilitate prospecting for match candidates (blocking), the system builds and maintains an in-memory index that maps attribute values to identities. Because OYSTER has an identity management system, it also supports persistent identity identifiers. OYSTER is distinctive among ER systems in that it is built to incorporate Entity Identity Information Management (EIIM). It supports EIIM by providing methods that enforce unique identifiers across identities, maintain persistent IDs over the life of an identity, and allow false-positive and false-negative resolutions, which matching rules alone cannot fix, to be corrected through assertion, traceability, and other features.
OYSTER is written in Java and can be downloaded from:  http://sourceforge.net/projects/oysterer/
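The three linking modes and the blocking index described above can be sketched in a few lines of Python. This is not OYSTER's code or API: the TinyResolver class, its method names, the record fields, and the example data are hypothetical, and a simple deterministic rule stands in for OYSTER's probabilistic matching.

```python
from collections import defaultdict


class TinyResolver:
    """Toy resolver: blocking index, rule matching, transitive and asserted links."""

    def __init__(self):
        self.records = {}
        self.parent = {}                # union-find forest over record ids
        self.index = defaultdict(set)   # (attribute, value) -> record ids

    def _find(self, rid):
        while self.parent[rid] != rid:
            self.parent[rid] = self.parent[self.parent[rid]]  # path halving
            rid = self.parent[rid]
        return rid

    def _union(self, a, b):
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            keep, drop = sorted((ra, rb))  # smallest id names the identity
            self.parent[drop] = keep

    @staticmethod
    def _block_keys(rec):
        return [("email", rec["email"].lower()), ("last", rec["last"].lower())]

    @staticmethod
    def _matches(a, b):
        # Assumed rule: same email, or same last name and first initial.
        same_email = a["email"].lower() == b["email"].lower()
        same_name = (a["last"].lower() == b["last"].lower()
                     and a["first"][:1].lower() == b["first"][:1].lower())
        return same_email or same_name

    def add(self, rid, rec):
        """Index the record and link it to every matching candidate."""
        self.records[rid] = rec
        self.parent[rid] = rid
        candidates = set()
        for key in self._block_keys(rec):
            candidates |= self.index[key]   # only records sharing a block key
            self.index[key].add(rid)
        for other in candidates:
            if self._matches(rec, self.records[other]):
                self._union(rid, other)

    def assert_same(self, a, b):
        """Asserted link: merge two records that the rule cannot match."""
        self._union(a, b)

    def identity(self, rid):
        return self._find(rid)


resolver = TinyResolver()
resolver.add("r1", {"first": "Jane", "last": "Smith", "email": "jsmith@example.org"})
resolver.add("r2", {"first": "J.", "last": "Smith", "email": "jane.smith@example.edu"})
resolver.add("r3", {"first": "Jane", "last": "Smyth", "email": "jane.smith@example.edu"})
resolver.add("r4", {"first": "John", "last": "Doe", "email": "jdoe@example.org"})
resolver.add("r5", {"first": "Johnny", "last": "Dough", "email": "johnny@example.com"})
```

Here r1 and r3 never match directly, but both match r2, so all three resolve to one identity (transitive linking). Records r4 and r5 stay apart until `resolver.assert_same("r4", "r5")` is called, which mirrors how an asserted link can correct a false negative that no matching rule catches.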

Autotagging

  • Open Calais
  • Agrotagger

 
