The following steps simulate a typical user session. The end result (i.e., the layout of files and directories) is then shown.
The user creates a node ('greetings_en') through the UI and uploads content – in this example it's a simple text file (fcrepo4_greetings.txt) with a string ("hello, world!"):
The default ModeShape configuration specifies a LevelDB backend store.
Fedora will create a directory "fcrepo4-data" in the current working directory. The default directories found in "fcrepo4-data" will be the following:
"fcrepo.ispn.repo.cache" contains the repository metadata as well as binary content files smaller than 4 KB.
Inspecting Generated Data Files
The serialized Fedora nodes can be found in the "fcrepo.ispn.repo.cache/dataFedoraRepository" directory. The files in that directory would look something like this:
These are the LevelDB cache store files. The generated files contain serialized data about each of the JCR/Fedora nodes. ModeShape depends on the Infinispan distributed cache store, which in turn uses LevelDB to persist the data to the filesystem. To learn more about the LevelDB files, see: https://leveldb.googlecode.com/svn/trunk/doc/impl.html.
Some key things to note about the files and their contents:
- The serialization is done by the JBoss Marshalling library, not the JDK's native object serialization machinery. For this reason the generated serialized data looks different from an ordinary JDK serialized file. JBoss Marshalling can be configured to use custom serialization classes that read and write content in the format of the repository's choosing.
- The data is encoded in Binary JSON (BSON). If the file containing the root node (referencing the node 'greetings_en') is opened in a hex editor, you will see \u0002 preceding strings (such as "name", "key"); \u0004 preceding an array (representing sub-nodes); and \u0003 representing the UUID, in accordance with the BSON spec.
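To make the string and array markers concrete, here is a minimal sketch that hand-encodes a tiny BSON document using only the Python standard library and then lists the type byte of each top-level element. The field names ("name", "children") and values are hypothetical illustrations, not ModeShape's actual node schema, and the sketch covers only the string (0x02) and array (0x04) element types from the BSON spec.

```python
import struct


def cstring(text: str) -> bytes:
    """BSON element names are NUL-terminated UTF-8 strings."""
    return text.encode("utf-8") + b"\x00"


def bson_document(elements: bytes) -> bytes:
    """Wrap encoded elements: int32 total size, elements, trailing NUL."""
    body = elements + b"\x00"
    return struct.pack("<i", len(body) + 4) + body


def bson_string(name: str, value: str) -> bytes:
    """Type byte 0x02, name, int32 length (including NUL), UTF-8 bytes."""
    data = value.encode("utf-8") + b"\x00"
    return b"\x02" + cstring(name) + struct.pack("<i", len(data)) + data


def bson_array(name: str, items: list) -> bytes:
    """Type byte 0x04; an array is a document keyed by "0", "1", ..."""
    elements = b"".join(bson_string(str(i), v) for i, v in enumerate(items))
    return b"\x04" + cstring(name) + bson_document(elements)


def top_level_elements(doc: bytes) -> list:
    """Return (type_byte, name) for each top-level element of a document."""
    pos, found = 4, []  # skip the int32 size prefix
    while doc[pos] != 0:
        etype = doc[pos]
        end = doc.index(b"\x00", pos + 1)
        name = doc[pos + 1:end].decode("utf-8")
        pos = end + 1
        (size,) = struct.unpack_from("<i", doc, pos)
        if etype == 0x02:
            pos += 4 + size  # length prefix plus string bytes
        elif etype in (0x03, 0x04):
            pos += size  # embedded size already counts its own prefix
        else:
            raise ValueError(f"type 0x{etype:02x} not handled in this sketch")
        found.append((etype, name))
    return found


# Hypothetical node: a name plus an array of child references.
doc = bson_document(
    bson_string("name", "greetings_en") + bson_array("children", ["a", "b"])
)
for etype, name in top_level_elements(doc):
    print(f"0x{etype:02x} precedes {name!r}")
```

Viewing `doc` in a hex editor shows the same pattern described above: the type byte sits immediately before each field name.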
Binary datastreams larger than 4 KB are stored in the "fcrepo.binary.directory" directory.
Add larger content:
After uploading a binary file (>4 KB), the default directories found in "fcrepo4-data" will be the following:
Inspecting the content of the binary directory:
The contents of these files will match the uploaded file and its content-type:
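One way to confirm the match by hand is to compare checksums of the uploaded file against every file under the binary directory. The sketch below does that with the Python standard library. The function names are hypothetical, the paths in the usage note are placeholders, and nothing here relies on how the store names or nests its files.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=65536):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_stored_copies(uploaded, binary_dir):
    """Return every file under binary_dir whose bytes match the upload."""
    wanted = sha1_of(uploaded)
    return [
        candidate
        for candidate in sorted(Path(binary_dir).rglob("*"))
        if candidate.is_file() and sha1_of(candidate) == wanted
    ]
```

For example, `find_stored_copies("my_upload.bin", "fcrepo4-data/fcrepo.binary.directory")` would return the stored path(s) holding the same bytes, assuming those placeholder paths exist on your system.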
Inspecting ObjectStore Folders
Directories "com.arjuna.ats.arjuna.objectstore.objectStoreDir" and "com.arjuna.ats.arjuna.common.ObjectStoreEnvironmentBean.default.objectStoreDir" are JBoss JTA transaction engine artifacts. The default Fedora Infinispan configuration attempts to find a JBossJTA transaction manager implementation via "org.infinispan.transaction.lookup.GenericTransactionManagerLookup". This configuration uses Arjuna ShadowFileStore as a backend, resulting in several directories within fcrepo4-data such as "object-store" and "object-store-default":
A detailed description of the artifacts maintained by the JBossJTA implementation is beyond the scope of this document, at least for now.
Infinispan Configuration Options
Depending on the configured Infinispan backend, the directory layout and the contents of the binary files will differ. The following sections cover other cache store options.
Currently, the default configuration writes Fedora data to LevelDB (a fast, filesystem-based key-value store). When Fedora 4 is started, ModeShape (in practice, Infinispan and LevelDB in the background) creates several directories on the filesystem. The directories created are:
- fcrepo.ispn.binary.cache (binary data)
- fcrepo.ispn.cache (metadata)
- fcrepo.ispn.repo.cache (repository)
The layout of files in the three directories above is determined by LevelDB. Some of the important files are:
- File .log holds entries for recent transactions. The relevant API for representing these entries is modeshape-schematics (see, e.g., org.infinispan.schematic.SchematicEntry).
- File .sst stores these entries once the .log file reaches a size threshold. A new log file is then generated.
- File MANIFEST.x records info about .sst files (among other things).
- File CURRENT specifies the current MANIFEST file.
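The roles listed above can be checked against a real store directory with a short script. The following is a minimal sketch using only the Python standard library. The function names and role labels are hypothetical, and it assumes that manifest file names begin with "MANIFEST" and that CURRENT is a small text file containing the name of the active manifest.

```python
from pathlib import Path


def classify(name: str) -> str:
    """Map a file name to the role described in the list above."""
    if name == "CURRENT":
        return "current-pointer"
    if name.startswith("MANIFEST"):
        return "manifest"
    if name.endswith(".log"):
        return "log"
    if name.endswith(".sst"):
        return "sorted-table"
    return "other"


def describe_store(store_dir):
    """Return ({file name: role}, active manifest name or None)."""
    store = Path(store_dir)
    roles = {
        entry.name: classify(entry.name)
        for entry in sorted(store.iterdir())
        if entry.is_file()
    }
    current = store / "CURRENT"
    # CURRENT is assumed to hold just the manifest file name plus a newline.
    active = current.read_text().strip() if current.is_file() else None
    return roles, active
```

Calling `describe_store` on one of the cache directories listed above should show one CURRENT file, one active manifest, and some mix of .log and .sst files, depending on how much data has been written.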