Info

(question) TODO: How does an admin determine which of the files listed in the directory above map to Fedora nodes? In other words, what is the algorithm for generating these filenames?

(question) TODO: The pasted text in the boxes below needs a detailed description. It is not fully "self-evident".

 

 

 

Walkthrough

The following steps simulate a typical user session. The end result (i.e., a layout of files and directories) is then shown.

...

These generated files contain serialized data about each of the JCR/Fedora nodes. Running the cURL command above, for example, generates binary files such as the two pasted below. After some editing for readability, the object and datastream text is highlighted in bold to show data of interest. The grayed-out text refers to the ModeShape classes responsible for data representation and serialization.


With FileCacheStore, the file names are generated from a hash of the node's UUID. For example, for the root node, a file named -8.. is generated, containing the node data. Although the generated file itself is binary, the data it contains can be read using a tool or the ModeShape API. For example, the data for our root node is:

{
  "properties" : {
    "http://www.jcp.org/jcr/1.0" : {
      "primaryType" : { "$name" : "mode:root" },
      "uuid" : "87a0a8c7505d64/"
    }
  },
  "children" : [
    { "key" : "87a0a8c317f1e7jcr:system", "name" : "jcr:system" },
    { "key" : "87a0a8c7505d646988a17a-ad2f-48f6-aef5-e7d411e184d9", "name" : "chandni" }
  ],
  "childrenInfo" : { "count" : 2 }
}
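Once a node document has been recovered as JSON, its type, UUID, and children can be pulled out programmatically. The following is a minimal sketch that works only on the JSON text shown above; the function name `summarize_node` is hypothetical, and the sketch does not read the binary store file itself.

```python
import json

# The root node document exactly as shown above.
ROOT_DOC = """
{ "properties" : { "http://www.jcp.org/jcr/1.0" : { "primaryType" : { "$name" : "mode:root" } ,
  "uuid" : "87a0a8c7505d64/" } } ,
  "children" : [ { "key" : "87a0a8c317f1e7jcr:system" , "name" : "jcr:system" } ,
                 { "key" : "87a0a8c7505d646988a17a-ad2f-48f6-aef5-e7d411e184d9" , "name" : "chandni" } ] ,
  "childrenInfo" : { "count" : 2 } }
"""

JCR_NS = "http://www.jcp.org/jcr/1.0"


def summarize_node(doc_text):
    """Return the primary type, UUID, and (name, key) pairs of the children."""
    doc = json.loads(doc_text)
    jcr_props = doc["properties"][JCR_NS]
    return {
        "primaryType": jcr_props["primaryType"]["$name"],
        "uuid": jcr_props["uuid"],
        "children": [(c["name"], c["key"]) for c in doc.get("children", [])],
        "childCount": doc.get("childrenInfo", {}).get("count"),
    }


if __name__ == "__main__":
    summary = summarize_node(ROOT_DOC)
    print(summary["primaryType"], summary["uuid"])
    for name, key in summary["children"]:
        print(name, "->", key)
```

The child keys are the identifiers under which the child nodes are stored, so they are the values an admin would follow to locate the corresponding entries.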

The actual binary looks something like this (87a0a8c7505d64 refers to the root node UUID):

87a0a8c7505d64/7org.infinispan.schematic.internal.SchematicEntryLiteral6org.infinispan.marshall.jboss.JBossExternalizerAdapterq externalizer$org.infinispan.marshall.ExternalizerDorg.infinispan.schematic.internal.SchematicEntryLiteral$Externalizer7org.infinispan.schematic.internal.SchematicExternalizer8org.infinispan.schematic.internal.document.BasicDocument;?org.infinispan.schematic.internal.document.DocumentExternalizer;23

...

metadata?id87a0a8c7505d64/contentTypeapplication/jsoncontent>propertiesghttp://www.jcp.org/jcr/1.0FprimaryType$name

...

mode:rootuuid

 87a0a8c7505d64/children0<key87a0a8c317f1e7jcr:systemnamejcr:system1Skey3

...

87a0a8c7505d64

53f49a07-8e14-41a1-bab3-abc59d86846enamechandni

 

The strings in the root node file are described as follows:

87a0a8c7505d64 refers to the root node UUID.

Similarly, for the datastream:

 

87a0a8c7505d64

8e6504dd-1fb8-4011-8d38-4fcd2e46c0f77org.infinispan.schematic.internal.SchematicEntryLiteral6org.infinispan.marshall.jboss.JBossExternalizerAdapterq externalizer$org.infinispan.marshall.ExternalizerDorg.infinispan.schematic.internal.SchematicEntryLiteral$Externalizer7org.infinispan.schematic.internal.SchematicExternalizer8org.infinispan.schematic.internal.document.BasicDocument; ?org.infinispan.schematic.internal.document.DocumentExternalizer;23metadatabid387a0a8c7505d64

8e6504dd-1fb8-4011-8d38-4fcd2e46c0f7

contentTypeapplication/jsoncontentkey

87a0a8c7505d64

8e6504dd-1fb8-4011-8d38-4fcd2e46c0f7

parent387a0a8c7505d64

46c11a1e-b2d9-496c-a950-5bd8cf7f2096

propertieshttp://www.jcp.org/jcr/1.0+primaryType$name nt:resourcedatachandni o meri chandni

lastModified.$date2013-12-05T18:52:00.714-05:00

mixinTypesL0D$name4{http://fedora.info/definitions/v4/rest-api#}binarylastModifiedBy bypassAdminmimeTypeapplication/octet-streamhttp://fedora.info/definitions/v4/rest-api

#NdigestA$uri2urn:sha1:1c63b638cab226a394ea27819c18397dd96687fchttp://www.loc.gov/premis/rdf/v1#hasSize55
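Readable fragments like those in the dumps above can be approximated by pulling runs of printable characters out of a binary file, much as the Unix `strings` utility does. The sketch below is an illustration only: it does not decode the Infinispan/ModeShape serialization format, and the function names and the minimum run length of 4 are assumptions chosen for the example.

```python
import re

# Matches UUID-shaped tokens such as 8e6504dd-1fb8-4011-8d38-4fcd2e46c0f7.
UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def printable_strings(data, min_len=4):
    """Return runs of printable ASCII (space through '~') of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_uuids(data):
    """Return the distinct UUID-shaped tokens found in the data, in order of appearance."""
    seen = []
    for text in printable_strings(data):
        for uuid in UUID_PATTERN.findall(text):
            if uuid not in seen:
                seen.append(uuid)
    return seen


if __name__ == "__main__":
    import sys

    # Usage: python extract_strings.py <path-to-a-file-in-the-data-directory>
    with open(sys.argv[1], "rb") as handle:
        raw = handle.read()
    for line in printable_strings(raw):
        print(line)
    print("UUIDs:", find_uuids(raw))
```

Run against a file from the data directory, this prints the class names, property names, and keys that survive as plain text, which is usually enough to tell which node a file holds.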

...

As with the LevelDB option, when Fedora 4 is started with a FileCacheStore configuration, ModeShape creates several directories on the filesystem:

  1. data
  2. expired
  3. index files

 

Specifying the FileCacheStore option results in the creation of hundreds of binary files in the data directory (e.g. 11333332.. , -2334002.. etc).

The FileCacheStore, which is deprecated as of Infinispan 6.x (our ModeShape is currently on 5.x), is specified via file/infinispan.xml. Using it results in the creation of hundreds of binary files (compared to LevelDB). A hashing algorithm is used to map keys to buckets. The value files contain serialized ModeShape nodes. The key files can be read using org.infinispan.schematic.internal.document.BsonReader. (It does not seem possible to read these files using existing BSON tools, such as MongoDB's bsondump, but further inspection is needed.)
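To make the key-to-bucket mapping concrete, here is a minimal sketch of how a key could be turned into a signed-integer file name. It rests on two assumptions that should be verified against the Infinispan source for the version in use: that the key is hashed with Java's String.hashCode(), and that the bucket is that hash with its low 10 bits masked off (mask 0xfffffc00). The function names are hypothetical.

```python
import struct

BUCKET_MASK = 0xFFFFFC00  # assumption: clears the low 10 bits of the hash


def java_string_hashcode(text):
    """Replicate Java's String.hashCode() as a signed 32-bit integer."""
    encoded = text.encode("utf-16-be")
    # Java hashes UTF-16 code units, not Unicode code points.
    units = struct.unpack(">%dH" % (len(encoded) // 2), encoded)
    h = 0
    for unit in units:
        h = (31 * h + unit) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def bucket_name(key, mask=BUCKET_MASK):
    """Return the hypothetical bucket file name for a key, as a decimal string."""
    masked = (java_string_hashcode(key) & 0xFFFFFFFF) & mask
    signed = masked - 0x100000000 if masked >= 0x80000000 else masked
    return str(signed)


if __name__ == "__main__":
    # Keys taken from the root node document shown earlier in this page.
    for key in ("87a0a8c7505d64/", "87a0a8c317f1e7jcr:system"):
        print(key, "->", bucket_name(key))
```

Under these assumptions, up to 1024 distinct hash values share one bucket file, and negative hashes yield file names with a leading minus sign, which would be consistent with names such as -2334002.. in the data directory.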