...
Looking at the whole data model at once can be overwhelming, so we'll consider it as a few smaller simpler systems:
- Bibliographic metadata
- Licensing
- Works
- Custom lists
- Libraries
- Patrons
- Site configuration
- Background processes
These systems overlap around a few key classes, mainly DataSource, Identifier, LicensePool, and Work.
The code for the data model is in the model package of the server_core project.
...
Bibliographic information is information about books as opposed to the books themselves. A book's title, its cover image, and its ISBN are all bibliographic information--the text of the book is not. Bibliographic information flows into the circulation manager and metadata wrangler from a variety of sources, mainly OPDS feeds and proprietary APIs. We keep track of all this information and where it came from, and when necessary we weigh it, sort it, and boil it down into a small amount of information that can be used by other parts of the system.
DataSource
A
Some examples of
A
|
Identifier
An An
|
Equivalency
An
|
Edition
An An
|
The contributor subsystem
This system basically tracks who wrote which book. There are two classes in this subsystem: Contributor and Contribution.
Contributor
A Contributor is a human being or a corporate entity who is credited with work on some Edition. The credit itself is kept in a Contribution, which ties a Contributor to an Edition.
A Contributor contains basic biographical information about a person or corporation. Most notably, it has both a |
Contribution
A Contribution is a piece of information contributed
|
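The relationship between the two classes can be sketched in a few lines of plain Python. This is only an illustration of the shape described above, not the actual model code: the `name` and `role` fields and the `contributors_for` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Edition:
    title: str


@dataclass(frozen=True)
class Contributor:
    # Hypothetical field; the real class holds more biographical detail.
    name: str


@dataclass(frozen=True)
class Contribution:
    # Ties one Contributor to one Edition.
    contributor: Contributor
    edition: Edition
    role: str  # assumed field, e.g. "Author" or "Illustrator"


def contributors_for(
    edition: Edition,
    contributions: List[Contribution],
    role: Optional[str] = None,
) -> List[str]:
    """Names of everyone credited on an edition, optionally filtered by role."""
    return [
        c.contributor.name
        for c in contributions
        if c.edition == edition and (role is None or c.role == role)
    ]
```

Because the credit lives in Contribution rather than on Contributor or Edition, the same person can be credited on many editions, and in different roles, without duplicating biographical data.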
The classification subsystem
This system tracks how a book might be classified in a card catalog or shelved in a bookstore. There are two classes in this subsystem: Subject and Classification.
Subject
A Subject represents a classification that someone might give a book. Subject handles a variety of classification schemes: Dewey Decimal, LCC, LCSH, BISAC, proprietary systems like Overdrive's, and free-form tags, among others.
Four pieces of information might be derived from the
|
Classification
A Classification is someone's opinion that a book should be filed under a certain Subject.
A
|
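As a rough sketch of how Subject and Classification fit together, the example below models a Classification as a record linking a book's identifier to a Subject. The field names (`scheme`, `identifier`, `data_source`), the scheme labels, and the `subjects_for` helper are all assumptions for illustration; the real classes may differ.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subject:
    scheme: str      # assumed label, e.g. "DDC", "LCSH", "BISAC", "tag"
    identifier: str  # the code or tag within that scheme


@dataclass(frozen=True)
class Classification:
    identifier: str   # stands in for an Identifier, e.g. an ISBN
    subject: Subject
    data_source: str  # assumed field: whose opinion this is


def subjects_for(identifier: str, classifications: List[Classification]) -> Counter:
    """Count how many opinions file the given book under each Subject."""
    return Counter(
        c.subject for c in classifications if c.identifier == identifier
    )
```

Keeping the opinion (Classification) separate from the thing it points at (Subject) lets several sources agree or disagree about the same book without overwriting one another.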
Genre
There are many different data sources which use many different classification schemes for the same books. Rather than expose this chaos to patrons, we have defined about 150 Genres, corresponding to the sections of a large bookstore or branch library: "Romance", "Biography", and so on.
Each A |
Measurement
A Measurement is a numeric value associated with an Identifier. It represents some quality that distinguishes one book from others. The most useful measurements are popularity (a popular book is read/accessed/purchased/accessioned more often) and rating (a highly rated book is considered to be of high quality).
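To make the idea concrete, here is a minimal sketch of a Measurement record and one way several measurements of the same book might be combined. The field names and the choice to take a plain average are assumptions for the example; a real system would likely need to normalize values from different sources before comparing them.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass(frozen=True)
class Measurement:
    identifier: str         # stands in for an Identifier
    quantity_measured: str  # assumed labels: "popularity", "rating"
    value: float
    data_source: str        # assumed field: who reported the value


def summarize(identifier: str, measurements: List[Measurement]) -> Dict[str, float]:
    """Average the reported values per quantity for one book (illustrative only)."""
    by_quantity = defaultdict(list)
    for m in measurements:
        if m.identifier == identifier:
            by_quantity[m.quantity_measured].append(m.value)
    return {quantity: mean(values) for quantity, values in by_quantity.items()}
```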
The linked resources subsystem
This system keeps track of external resources associated with a book. An "external resource" can be pretty much anything, but these are the most common types of resources we track:
- A cover image
- A thumbnailed version of a preexisting cover image
- A textual description
- An EPUB copy of a free book
- A review
Hyperlink
A Hyperlink represents a connection between an Identifier and a Resource. It contains two extra pieces of information about the link:
- A DataSource -- who provided this link?
- rel -- what is the relationship between the Identifier and the Resource? "There's a link" is very vague; this is more specific. Different rel values are defined for a cover image, a thumbnail image, a review, a description, a copy of the actual book, and so on.
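A small sketch may help show why the rel value matters. The classes below are simplified stand-ins, and the rel strings and the `resources_for` helper are placeholders invented for the example rather than the values the project actually defines.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Resource:
    url: str


@dataclass(frozen=True)
class Hyperlink:
    identifier: str   # stands in for an Identifier
    resource: Resource
    data_source: str  # who provided this link
    rel: str          # how the Resource relates to the Identifier


# Placeholder rel values; the real constants may be named differently.
REL_IMAGE = "image"
REL_DESCRIPTION = "description"


def resources_for(identifier: str, rel: str, links: List[Hyperlink]) -> List[Resource]:
    """All resources linked to a book with the given relationship."""
    return [
        link.resource
        for link in links
        if link.identifier == identifier and link.rel == rel
    ]
```

With rel recorded on the link, "find every candidate cover image for this book" becomes a simple filter instead of a guess based on the URL.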
Resource
A Resource represents a document found somewhere on the Internet -- probably either a cover image or a free book. It has a url, and that's basically it -- everything about the document itself is kept in Representation.
- A Resource that's an image may be chosen by an Edition as the best available cover image for a given book.
- A Resource that's a textual description may be chosen by a Work as the best description for a given book.
- A LicensePoolDeliveryMechanism for an open-access book will point to a Resource that represents the book itself.
Representation
A Representation is a local cache of a Resource. It represents our attempt to actually download a Resource and records what happened when we tried. If everything went well, the Representation will contain a file -- binary, text, HTML, or image. Otherwise, the Representation will contain information about what went wrong -- for example, the server may have been down.
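The "record what happened, whether it worked or not" idea can be sketched as follows. Everything here is hypothetical: the field names, the `record_fetch` helper, and the convention that the caller passes in a `fetch` callable returning `(status_code, media_type, content)`. The callable is injected so the example runs without network access.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Representation:
    url: str
    status_code: Optional[int] = None
    media_type: Optional[str] = None
    content: Optional[bytes] = None
    fetch_exception: Optional[str] = None  # what went wrong, if anything

    @property
    def succeeded(self) -> bool:
        return self.content is not None and self.fetch_exception is None


def record_fetch(
    url: str, fetch: Callable[[str], Tuple[int, str, bytes]]
) -> Representation:
    """Try to download a URL and return a record of the outcome either way."""
    try:
        status, media_type, content = fetch(url)
    except Exception as e:
        # The download itself failed, e.g. the server was unreachable.
        return Representation(url=url, fetch_exception=repr(e))
    if status != 200:
        # Treating any non-200 status as a failure is a simplification.
        return Representation(
            url=url, status_code=status, fetch_exception=f"unexpected status {status}"
        )
    return Representation(url, status, media_type, content)
```

A failed attempt still produces a Representation, so later code can tell "never tried" apart from "tried and failed".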
...
Collection
A Collection represents a set of books that are made available through one set of credentials.
...
The books themselves are stored as LicensePools, and the credentials are stored in an ExternalIntegration.
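The arrangement described above, one set of credentials and many license pools, might be sketched like this. The field names (`protocol`, `username`, `password`, `licenses_owned`) and the `collections_holding` helper are assumptions for illustration and are not taken from the actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExternalIntegration:
    # Hypothetical credential fields.
    protocol: str
    username: str
    password: str


@dataclass
class LicensePool:
    identifier: str          # stands in for an Identifier
    licenses_owned: int = 0  # assumed field


@dataclass
class Collection:
    name: str
    external_integration: ExternalIntegration  # the one set of credentials
    licensepools: List[LicensePool] = field(default_factory=list)


def collections_holding(identifier: str, collections: List[Collection]) -> List[str]:
    """Names of the collections that have a LicensePool for the given book."""
    return [
        c.name
        for c in collections
        if any(pool.identifier == identifier for pool in c.licensepools)
    ]
```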
LicensePool
A LicensePool represents an agreement on the part of a book vendor to actually deliver a book to a patron.
...
A Work represents a book in general, as opposed to one specific edition of a book, or a specific licensing agreement to deliver copies of a book. A Work:
About Works | DB Schema |
---|---|
May have copies scattered across many LicensePools | |
May have many Editions, but derives its presentation metadata from one particular Edition, which is known as its “presentation edition.” This special | |
Stores information that has been aggregated from multiple sources and summarized: | |
May be referenced by multiple | |
May participate in many | |
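
The notion of a presentation edition can be illustrated with a short sketch. How the real system chooses that edition is not described here, so the selection rule below (prefer editions from data sources earlier in a caller-supplied list) is purely an assumption, as are the field and method names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Edition:
    title: str
    data_source: str  # assumed field: where this edition's metadata came from


@dataclass
class Work:
    editions: List[Edition] = field(default_factory=list)
    presentation_edition: Optional[Edition] = None

    def choose_presentation_edition(self, preferred_sources: List[str]) -> Optional[Edition]:
        """Pick one edition to supply the Work's presentation metadata.

        Assumed policy: the edition whose data source appears earliest in
        preferred_sources wins; unknown sources rank last.
        """
        def rank(edition: Edition) -> int:
            if edition.data_source in preferred_sources:
                return preferred_sources.index(edition.data_source)
            return len(preferred_sources)

        self.presentation_edition = min(self.editions, key=rank) if self.editions else None
        return self.presentation_edition

    @property
    def title(self) -> Optional[str]:
        # Presentation metadata comes from the presentation edition only.
        if self.presentation_edition is None:
            return None
        return self.presentation_edition.title
```

Whatever the actual policy is, the point is the same: a Work may have many Editions, but the metadata shown to patrons comes from exactly one of them.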
A CustomList is a list of books, typically grouped by a criterion such as genre, subject, bestseller status, etc., which a librarian has compiled in the admin interface. Each CustomList is associated with, and presented to patrons in the front-end as, one Lane. A CustomList has at least one CustomListEntry, each of which refers to a particular Work.
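The chain from Lane to CustomList to CustomListEntry to Work can be sketched as below. These are simplified stand-ins: the `display_name` field and the `add_entry` and `works` helpers are invented for the example, and skipping duplicate works is an assumed behavior.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Work:
    title: str


@dataclass
class CustomListEntry:
    work: Work  # each entry refers to one particular Work


@dataclass
class CustomList:
    name: str
    entries: List[CustomListEntry] = field(default_factory=list)

    def add_entry(self, work: Work) -> CustomListEntry:
        """Add a Work to the list, reusing the existing entry if present (assumed)."""
        for entry in self.entries:
            if entry.work == work:
                return entry
        entry = CustomListEntry(work)
        self.entries.append(entry)
        return entry


@dataclass
class Lane:
    display_name: str
    customlist: CustomList  # one Lane presents one CustomList

    def works(self) -> List[Work]:
        return [entry.work for entry in self.customlist.entries]
```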
...