...

Looking at the whole data model at once can be overwhelming, so we'll consider it as a few smaller, simpler systems:

These systems overlap around a few key classes, mainly DataSource, Identifier, LicensePool, and Work.

...

For the sake of simplicity, this document will talk about "books", but the rules are the same for audiobooks and other forms of content.

Bibliographic metadata

Bibliographic information is information about books as opposed to the books themselves. A book's title, its cover image, and its ISBN are all bibliographic information--the text of the book is not. Bibliographic information flows into the circulation manager and metadata wrangler from a variety of sources, mainly OPDS feeds and proprietary APIs. We keep track of all this information and where it came from, and when necessary we weigh it, sort it, and boil it down into a small amount of information that can be used by other parts of the system.

...

Each Subject may be associated with a Genre. When LicensePools are turned into Works, all the related Classifications are gathered together. We then assign the Work to the Genre that showed up the most.

A Genre may also be associated with one or more Lanes -- this is the primary technique we use when choosing how to show books to patrons.
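
The rule for assigning a Work to a Genre can be sketched as a simple vote. This is a minimal illustration, not the project's code: the function name and the input shape (one Genre name, or None, per Classification) are hypothetical, and every Classification counts equally here, which ignores any weighting the real system may apply.

```python
from collections import Counter
from typing import Iterable, Optional


def most_common_genre(genres: Iterable[Optional[str]]) -> Optional[str]:
    """Return the genre that shows up the most.

    `genres` holds one entry per Classification: the Genre of its Subject,
    or None when that Subject has no Genre (hypothetical input shape).
    """
    counts = Counter(g for g in genres if g is not None)
    if not counts:
        return None  # no Classification pointed at any Genre
    # most_common(1) returns a one-element list of (genre, count).
    return counts.most_common(1)[0][0]


# Four Classifications gathered for one Work; one Subject has no Genre.
classification_genres = ["Science Fiction", None, "Fantasy", "Science Fiction"]
chosen = most_common_genre(classification_genres)
```

Here `chosen` is "Science Fiction", because it appears twice and "Fantasy" only once.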


Measurement

A Measurement is a numeric value associated with an Identifier.

...

A Hyperlink represents a connection between an Identifier and a Resource.

It contains two extra pieces of information about the link:

  • A DataSource -- who provided this link?
  • rel -- what is the relationship between the Identifier and the Resource? "There's a link" is very vague; this is more specific. Different rel values are defined for a cover image, a thumbnail image, a review, a description, a copy of the actual book, and so on.



Resource

A Resource represents a document found somewhere on the Internet -- probably either a cover image or a free book. It has a url, and that's basically it -- everything about the document itself is kept in Representation.

  • A Resource that's an image may be chosen by an Edition as the best available cover image for a given book.
  • A Resource that's a textual description may be chosen by a Work as the best description for a given book.
  • A LicensePoolDeliveryMechanism for an open-access book will point to a Resource that represents the book itself.


Representation

A Representation is a local cache of a Resource. It represents our attempt to actually download a Resource and records what happened when we tried.

If everything went well, the Representation will contain a file--binary, text, HTML, or image. Otherwise, the Representation will contain information about what went wrong -- for example, that the server was down.

Circulation managers don't usually create Representations -- they rely on the metadata wrangler to do that.

An image Representation that's a thumbnail of another image Representation is connected to its original through .thumbnail_of.


Putting it all together

Here's how the whole subsystem works together. Let's say one of our data sources claims that the URL http://example.org/covers/my-book.png is a cover image for the ISBN "97812345678". We want to represent this fact in our system.

  1. We'd create an Identifier for the ISBN "97812345678".
  2. We'd create a Resource for http://example.org/covers/my-book.png.
  3. We'd create a Hyperlink with the rel "http://opds-spec.org/image", for "cover image". The .data_source of this Hyperlink would be set to the DataSource that made the original claim.
  4. We don't have to actually download http://example.org/covers/my-book.png, but if we do decide to download it, the binary image will be stored as a Representation. If there's a problem and we can't complete the download, that fact will be stored in the Representation instead.
  5. If we download the image and everything goes well, we may also decide to create a thumbnail out of it. This would be stored as a second Representation, and its .thumbnail_of would point to the original, full-size Representation.
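
The five steps above can be sketched with plain Python dataclasses. This is only an illustration of how the objects relate: the field names other than rel, data_source, url, and thumbnail_of are assumptions, the DataSource is reduced to a made-up name, and no download actually takes place.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Identifier:
    type: str
    identifier: str


@dataclass
class Resource:
    url: str


@dataclass
class Hyperlink:
    identifier: Identifier
    resource: Resource
    rel: str
    data_source: str  # name of the DataSource that made the claim


@dataclass
class Representation:
    # Which Resource a thumbnail belongs to is not stated in the text,
    # so this field is optional in the sketch.
    resource: Optional[Resource] = None
    content: Optional[bytes] = None        # set when the download succeeded
    fetch_problem: Optional[str] = None    # hypothetical field: set on failure
    thumbnail_of: Optional["Representation"] = None


# Steps 1-3: record the claim itself.
isbn = Identifier(type="ISBN", identifier="97812345678")
cover = Resource(url="http://example.org/covers/my-book.png")
link = Hyperlink(
    identifier=isbn,
    resource=cover,
    rel="http://opds-spec.org/image",
    data_source="Example Source",  # made-up DataSource name
)

# Step 4: pretend the download succeeded and cache the bytes.
full_size = Representation(resource=cover, content=b"<png bytes>")

# Step 5: the thumbnail is a second Representation pointing at the original.
thumbnail = Representation(content=b"<smaller png bytes>", thumbnail_of=full_size)
```

If the download in step 4 had failed, the same Representation would carry a description of the problem in place of the content.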

ResourceTransformation

A ResourceTransformation represents a change that was made to one Resource to generate another Resource.

Currently it's used in the circulation manager's "cover image upload" feature. You can upload a background image (the original Resource) and paste the title and author onto it (a ResourceTransformation which results in a second Resource).

Theoretically, thumbnailing could also be handled as a ResourceTransformation, but it's probably not worth making this change.

Licensing

Collection

A Collection represents a set of books that are made available through one set of credentials.

...

  • is associated with an Identifier, representing how the vendor identifies the book.
  • is associated with a DataSource, representing the vendor who provides the book.
  • belongs to one Collection.
  • has one presentation edition, containing the most complete set of metadata available for the book.
  • can have many Loans, Holds, Annotations, and Complaints.
  • can have many CirculationEvents.
  • should have at least one DeliveryMechanism, through LicensePoolDeliveryMechanism.
  • has a RightsStatus, through LicensePoolDeliveryMechanism.

DeliveryMechanism and LicensePoolDeliveryMechanism

A DeliveryMechanism describes what format a book is actually available in. There are two parts to a DeliveryMechanism: 1) the DRM scheme implemented by the distributor, if any, and 2) the format of the book (EPUB, PDF, audiobook manifest, Kindle, and so on).

LicensePoolDeliveryMechanism is a three-way join table: a record of a promise by a vendor (identified by a DataSource) to deliver copies of a book (identified by an Identifier) in a specific format (identified by a DeliveryMechanism).
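
As a rough sketch of the three-way join, each promise can be modeled as an immutable triple. The field names, the vendor name, and the DRM scheme name below are made up for illustration; the real tables are not shown in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeliveryMechanism:
    content_type: str          # the format of the book
    drm_scheme: Optional[str]  # None when the distributor applies no DRM


@dataclass(frozen=True)
class LicensePoolDeliveryMechanism:
    data_source: str  # the vendor making the promise
    identifier: str   # the book being promised
    delivery_mechanism: DeliveryMechanism


epub_no_drm = DeliveryMechanism("application/epub+zip", None)
pdf_with_drm = DeliveryMechanism("application/pdf", "Example DRM")

# Frozen dataclasses are hashable, so a set keeps one row per distinct triple.
promises = {
    LicensePoolDeliveryMechanism("Example Vendor", "97812345678", epub_no_drm),
    LicensePoolDeliveryMechanism("Example Vendor", "97812345678", pdf_with_drm),
    LicensePoolDeliveryMechanism("Example Vendor", "97812345678", epub_no_drm),
}

# Formats in which the vendor has promised to deliver this book.
formats = sorted(p.delivery_mechanism.content_type for p in promises)
```

The repeated EPUB promise collapses into one entry, leaving two distinct promises for the same book from the same vendor.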

RightsStatus

A RightsStatus represents the terms under which a book is being made available to patrons. The most common varieties of RightsStatus are 1) in copyright, 2) public domain, and 3) a Creative Commons license. "In copyright" implies that a book is being made available to patrons by virtue of a licensing agreement between the library and the vendor. The other RightsStatus values imply that a book is being made available to library patrons on the same terms as it would be to the general public.
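
The distinction drawn above can be written down as a small enumeration. This is a sketch only: the member names and the helper function are hypothetical, and it covers just the three common varieties mentioned here.

```python
from enum import Enum


class RightsStatus(Enum):
    IN_COPYRIGHT = "in copyright"
    PUBLIC_DOMAIN = "public domain"
    CREATIVE_COMMONS = "Creative Commons license"


def available_via_license_agreement(status: RightsStatus) -> bool:
    """True when the book reaches patrons through a library/vendor license.

    Every other status means patrons get the book on the same terms
    as the general public.
    """
    return status is RightsStatus.IN_COPYRIGHT


summary = {s.value: available_via_license_agreement(s) for s in RightsStatus}
```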

Complaint

Patrons may lodge one or more Complaints against a specific LicensePool. Complaints indicate problems with specific books. For example, a Patron can lodge a Complaint stating that a book is incorrectly categorized or described, or that there is a problem with checking it out, reading, or returning it.

CirculationEvent

A CirculationEvent is a record of something happening to a LicensePool. A CirculationEvent is recorded when an event takes place within the circulation manager (e.g. a work is checked out or placed on hold), when we notice that an event happened on the distributor's side (such as licenses for a book being added or removed), or when a client app reports an event (e.g. a book having been opened).

...

Site configuration

ExternalIntegration

An ExternalIntegration contains the configuration for connecting to a third-party API. Commonly used third-party APIs include the metadata wrangler, DataSources that require protocols, authentication services, storage services, and search providers.

ConfigurationSetting

A ConfigurationSetting holds information about an extra piece of site configuration. A ConfigurationSetting may be associated with an ExternalIntegration, a Library, both, or neither.
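
The four possible scopes of a ConfigurationSetting (an ExternalIntegration, a Library, both, or neither) can be sketched as a lookup keyed on the setting name plus its scope. The SettingStore class, its methods, and all setting, integration, and library names below are hypothetical; this illustrates the scoping only, not how the project stores or resolves settings.

```python
from typing import Dict, Optional, Tuple

# (integration name, library name); None means "not associated".
Scope = Tuple[Optional[str], Optional[str]]


class SettingStore:
    """Hypothetical in-memory stand-in for ConfigurationSetting rows."""

    def __init__(self) -> None:
        self._values: Dict[Tuple[str, Scope], str] = {}

    def set(self, key: str, value: str,
            integration: Optional[str] = None,
            library: Optional[str] = None) -> None:
        self._values[(key, (integration, library))] = value

    def get(self, key: str,
            integration: Optional[str] = None,
            library: Optional[str] = None) -> Optional[str]:
        # Exact-scope lookup only; no fallback between scopes is assumed.
        return self._values.get((key, (integration, library)))


store = SettingStore()
store.set("site_name", "Example Site")                             # neither
store.set("api_url", "http://example.org/api", integration="search")
store.set("display_name", "Example Library", library="main")
store.set("username", "example-user", integration="search", library="main")
```

Each of the four calls creates a setting in a different scope, and the same key could hold different values in different scopes without conflict.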

Background processes

  • A Timestamp provides a record of when a Monitor was run.
  • A CoverageRecord provides a record of any processes that have been performed on a book (referred to via its Identifier).
  • A WorkCoverageRecord provides a record of any processes that have been performed on a Work (similar to what CoverageRecord does for Identifiers).
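
One way to picture a CoverageRecord is as a note that a named process has been performed on a given Identifier. The sketch below is an assumption-laden illustration: the CoverageLog class, the operation name, and the idea of using the log to find books that still need processing are not taken from this document.

```python
from datetime import datetime, timezone
from typing import Dict, Tuple


class CoverageLog:
    """Hypothetical in-memory stand-in for CoverageRecord rows."""

    def __init__(self) -> None:
        # (identifier, operation) -> when the operation was recorded
        self._records: Dict[Tuple[str, str], datetime] = {}

    def add(self, identifier: str, operation: str) -> None:
        self._records[(identifier, operation)] = datetime.now(timezone.utc)

    def is_covered(self, identifier: str, operation: str) -> bool:
        return (identifier, operation) in self._records


log = CoverageLog()
log.add("97812345678", "fetch-cover")  # made-up operation name

# Identifiers on which the operation has not yet been recorded.
candidates = ["97812345678", "97800000000"]
needs_work = [i for i in candidates if not log.is_covered(i, "fetch-cover")]
```

A WorkCoverageRecord would follow the same pattern, keyed on a Work instead of an Identifier.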

...