Table of Contents
This page explores design issues for an efficient ORE ontology gem implementation based off the ActiveTriples framework.
The following is a list of all ontologies used by the Triple Examples.
Ontology Name | Prefix | URL | Details | Comments |
---|---|---|---|---|
RDF | rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | specification | |
RDF Schema | rdfs | http://www.w3.org/2000/01/rdf-schema# | specification | |
Dublin Core | dc | http://purl.org/dc/elements/1.1/ | specification | |
ORE | ore | http://www.openarchives.org/ore/terms/ | specification | Represents both ordered and unordered items using the Aggregation class. |
IANA | iana | http://www.iana.org/assignments/relation/ | specification | ORE uses this ontology for first, last, next, and prev predicates. |
Friend of a Friend | foaf | http://xmlns.com/foaf/0.1 | specification | Uses ld4l-foaf_rdf gem. |
@prefix ore: <http://www.openarchives.org/ore/terms/> . @prefix iana: <http://www.iana.org/assignments/relation/> . @prefix dc: <http://purl.org/dc/elements/1.1/> . <http://localhost:3000/individual/vc155> a ore:Aggregation ; ore:aggregates <http://da-rdf.library.cornell.edu/individual/b3652730> ; ore:aggregates <http://da-rdf.library.cornell.edu/individual/b3652234> ; ore:aggregates <http://da-rdf.library.cornell.edu/individual/b3652543> ; iana:first <http://localhost:3000/individual/vci162> ; iana:last <http://localhost:3000/individual/vci164> ; dc:title "My list" ; dc:description "This is my list of references." . <http://localhost:3000/individual/vci162> a ore:Proxy ; ore:proxyFor <http://da-rdf.library.cornell.edu/individual/b3652730> ; ore:proxyIn <http://localhost:3000/individual/vc155> ; iana:next <http://localhost:3000/individual/vci163> . <http://localhost:3000/individual/vci163> a ore:Proxy ; ore:proxyFor <http://da-rdf.library.cornell.edu/individual/b3652234> ; ore:proxyIn <http://localhost:3000/individual/vc155> ; iana:prev <http://localhost:3000/individual/vci162> ; iana:next <http://localhost:3000/individual/vci164> . <http://localhost:3000/individual/vci164> a ore:Proxy ; ore:proxyFor <http://da-rdf.library.cornell.edu/individual/b3652543> ; ore:proxyIn <http://localhost:3000/individual/vc155> ; iana:prev <http://localhost:3000/individual/vci163> . |
The ORE implementation in the triple store will maintain a doubly linked list using IANA ontologies first and last predicates on the list and next and prev on the proxy. Once loaded into memory, the ORE triples will be converted into a List Header Info hash data structure with an array of items each holding a List Item Info hash data structure. Testing was executed to analyze the efficiency of working with arrays of List Item Info hash data structures.
Tests descriptions:
Environments for testing:
Target production system:
Test code: https://gist.github.com/elrayle/f5f559f8c10243600dc6
Results:
Max Items | array_create | array_move | list_create | list_move | list_insert | list_append | list_find_last | list_find_random | Comments | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Laptop | DEV VM | Laptop | DEV VM | Laptop (16Gb) | DEV VM (2Gb) | Laptop | DEV VM | Laptop | DEV VM | Laptop | DEV VM | Laptop | DEV VM | Laptop | DEV VM | ||
1,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
10,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
100,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
500,000 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
1,000,000 | 0 | 0 | 0 | 0 | 4 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
2,000,000 | 0 | 0 | 0 | 0 | 11 | 13 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
4,000,000 | 0 | 0 | 0 | 0 | 35 | OOM | 0 | OOM | 0 | OOM | 0 | OOM | 1 | OOM | 0 | OOM | |
8,000,000 | 1 | 1 | 0 | 0 | 115 | 0 | 0 | 0 | 2 | 0 | |||||||
10,000,000 | 1 | 1 | 0 | 0 | 167 | 0 | 0 | 0 | 2 | 0 | |||||||
20,000,000 | 1 | 2 | 0 | 0 | OOM | OOM | OOM | OOM | OOM | OOM | |||||||
40,000,000 | 4 | 4 | 0 | 0 | |||||||||||||
80,000,000 | 6 | 8 | 0 | 0 | |||||||||||||
100,000,000 | 8 | 11 | 1 | 1 |
* Time measured in seconds
* OOM - Out of Memory
Converting triples data into an array of items is constrained by memory and the execution time of retrieving data from the triple store and constructing the list. A modest 2Gb of RAM will support holding all data up to approximately 100,000 items and maintain processing efficiency. Implementations supporting a list of greater than 100,000 items or many concurrent lists of smaller item counts will likely require a more sophisticated approach to retrieving and storing items in-memory.
User is browsing the list starting from the beginning, displaying N items at a time with the ability to move forward and back through the display of N items.
Display information is not part of ORE triples. For example, proxyFor is a URI to a book which has a DC.title, DC.description, etc. Would like to be able to add this information to each ListItemInfo structure and pass that to the UI code for display.
User is browsing the list and drags an item to a new position in the list.
User is browsing the catalog and decides to add one or more bibliographic references to the list. Add items defaults to append to end of list.
User chooses to sort list items by a display value that is not stored as part of the triples associated with the ORE list triples. For example, the list is aggregating bibliographic references. The properties of the bibliographic references may be stored in a separate triplestore. Example sort criteria are title, publication date,
The list header info will be a hash. It holds information about the list as a whole and the currently loaded range of items as a sub-structure. The values for the list information stored depends on the type of the loaded items sub-structure. The loaded items sub-structure holds items as a group are in-memory using one of four possible grouping methods: doubly linked lists, doubly linked lists + order index, array, or hash with proxy URI as the key. NOTE: Order index is global to the entire list. The array and hash implementations may also have this global order index information stored with the item info.
example value types | Comments | |||||
---|---|---|---|---|---|---|
key | description of value | doubly linked | doubly linked + order | array | hash | |
first | pointer to first item in list | URI of first proxy | URI of first proxy | URI of first proxy | URI of first proxy | |
last | pointer to last item in list | URI of last proxy | URI of last proxy | URI of last proxy | URI of last proxy | |
count | count of all items in list | X | last.index | last.index | last.index | Count for full list is available for array and hash when order index is stored in triplestore. |
first_loaded | pointer to first item in loaded range of items | item | item | array[0] | URI of first loaded proxy | |
current | pointer to current item | item | item | int cpos | URI of current proxy |
|
last_loaded | pointer to last item in loaded range of items | item | item | array[size of array] | URI of last loaded proxy | |
count_loaded | count of items in currently loaded range | X | =position of last - position of first | size of array | size of hash | |
resume_next_token | pointer to item after last_loaded | URI of proxy | URI of proxy | URI of proxy | URI of proxy | |
resume_prev_token | pointer to item just before first_loaded | URI of proxy | URI of proxy | URI of proxy | URI of proxy | |
loaded_items | pointer to loaded items | use first_loaded | use first_loaded | array | hash |
Each item's info is stored in a hash. It holds information about the individual item. The loaded items as a group are in-memory using one of four possible grouping methods: doubly linked lists, doubly linked lists + order index, array, or hash with proxy URI as the key.
example value types | ||||||
---|---|---|---|---|---|---|
key | description of value | doubly linked | doubly linked + order | array | hash | Comments |
uri | URI of proxy for this item | URI of this proxy | URI of this proxy | URI of this proxy | URI of this proxy | This is the hash key for loaded_items when the list is implemented as a hash. |
prev | pointer to previous item in list | URI of prev proxy | URI of prev proxy | = p - 1 | URI of prev proxy | |
next | pointer to next item in list | URI of next proxy | URI of next proxy | = p + 1 | URI of next proxy | |
prev_loaded | pointer to previous item in loaded range of items | prev item | prev item | = pl - 1 | URI of prev proxy | |
next_loaded | pointer to next item in loaded range of items | next item | next item | = pl + 1 | URI of next proxy | |
proxyFor | URI of list item being aggregated | URI | URI | URI | URI | |
proxyIn | URI of list aggregation | URI of aggregation | URI of aggregation | URI of aggregation | URI of aggregation | |
proxyIn_loaded | pointer to List Header | header | header | header | header | |
position (p) | position in full list | X | p | p | p | Position in full list is available for array and hash when order index is stored in triplestore. |
position_loaded (pl) | position in loaded range of items | X | pl | pl | X |