
LC's HTML results and Atom results (either as XML or JSON) are serializations of the output from LC's internal search API. There is no daylight, therefore, between the types of searches that can be performed by a human using the HTML UI and those performed by a machine making requests of the service. LC Search supports generic keyword searching with no limits of any kind, but only labels and a few other key data points are actually searched. Notes, for example, are not searchable. It supports limiting searches to specific datasets and date ranges (this is undocumented, which is an oversight). It supports targeted searches for specific labels, tokens (such as LCCNs), or codes. It supports wildcards, boolean keywords, and negation. Results are sorted by relevancy and, depending on the type of search, may be weighted to favor "higher value" resources, such as Names, Subjects, or BF Resources, over resources that play a supporting role in library data, such as regional encodings or broadcast standards. With the use of key search limits/filters and wildcarding, it is theoretically possible to make the generic search service closely mimic the LC Suggest API, at least in terms of results (it is unclear whether the sort order can be manipulated easily). The Search API's output is intentionally generic.
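As a rough illustration of the weighting idea described above, the sketch below reorders relevance-scored hits by a per-dataset weight. Nothing here reflects LC's actual scoring: the `rank` function, the multiplicative formula, the dataset names, and the scores and weights are all hypothetical, chosen only to show how a "higher value" resource could move ahead of a supporting one.

```python
def rank(hits, weights, default_weight=1.0):
    """Sort hits by relevance multiplied by a per-dataset weight.

    hits: list of (label, dataset, relevance) tuples (hypothetical shape).
    weights: dict mapping dataset name to a boost factor (assumed values).
    """
    return sorted(
        hits,
        key=lambda hit: hit[2] * weights.get(hit[1], default_weight),
        reverse=True,
    )


# Made-up hits: the supporting resource has the higher raw relevance.
HITS = [
    ("Some broadcast standard", "broadcast-standards", 0.9),
    ("Some personal name", "names", 0.8),
]

# Made-up boost that favors the Names dataset.
WEIGHTS = {"names": 1.5}

ranked = rank(HITS, WEIGHTS)
```

With these assumed numbers the Names hit sorts first despite its lower raw relevance; with an empty `WEIGHTS` dict the order falls back to plain relevance.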

LC Suggest API

LC's Suggest API has received minor enhancements over the years, but it is largely the same as it was when implemented back in 2009-2010, when the service was still hosted at lcsh.info. Its output format is unchanged. It is formally undocumented (how it has managed this long without formal documentation is a bit of a mystery), but LC readily and willingly shares information about not only its existence but also how it works. LC's Suggest API is a left-anchored, wildcarded search of preferred labels, codes (for example, a search for "deu" will return "German" from the ISO 639-2 dataset), or tokens (such as "sh12345", which is an LCCN). Results are ordered alphabetically (by codepoint). There is some support for limiting by MADS/RDF type (PersonalName, Topic, Geographic, etc.). Notably, it does not search variant/alternate labels. Under the hood, the code is 100% distinct from the code that runs the Search API, mostly so that it can focus on its purpose, which is to return typeahead suggestions as quickly as possible. The service's output format is dated: a form of JSON that is awkward to work with in today's typeahead components. It also carries only the URI and label of each hit.
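The matching behavior described above can be sketched in a few lines. This is an in-memory illustration only, not LC's implementation: the `Entry` type, the `suggest` function, the example.org URIs, and the sample records are all hypothetical, and case-insensitive matching is an assumption. It shows left-anchored matching against preferred labels, codes, and tokens, codepoint ordering of results, the omission of variant labels, and output limited to URI and label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    uri: str
    label: str             # preferred label
    code: str = ""         # e.g. a language code
    token: str = ""        # e.g. an LCCN
    variants: tuple = ()   # variant labels; deliberately never searched


def suggest(entries, query, limit=10):
    """Return (uri, label) pairs whose label, code, or token starts with query."""
    q = query.casefold()  # case-insensitive matching is an assumption
    hits = [
        e for e in entries
        if any(k and k.casefold().startswith(q) for k in (e.label, e.code, e.token))
    ]
    hits.sort(key=lambda e: e.label)  # Python compares str by codepoint
    return [(e.uri, e.label) for e in hits[:limit]]


# Made-up sample data with placeholder URIs.
SAMPLE = [
    Entry("http://example.org/topics/1", "Germanium"),
    Entry("http://example.org/languages/deu", "German", code="deu"),
    Entry("http://example.org/topics/2", "Cooking", variants=("Cookery",)),
    Entry("http://example.org/topics/3", "Example topic", token="sh12345"),
]
```

For instance, `suggest(SAMPLE, "deu")` matches the German entry through its code, `suggest(SAMPLE, "sh123")` matches through the token, and `suggest(SAMPLE, "cookery")` returns nothing because variant labels are not consulted.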