
VIVO's Triple Store Options

...

Ingest testing

This test is designed to measure the amount of time taken to ingest a standard data set. The data set used in this test is the published OpenVIVO content found in the vivo-project/sample-data GitHub repository.

...

  1. Total time for the ingest, determined by grepping for "ingest" in the vivo.all.log file(s)
    1. There should be two lines, like the following:


      No Format
      2020-02-26 22:45:18,938 INFO  [RDFUploadController] Start ingest: 2020-02-27T03:45:18.937813Z
      2020-02-27 00:08:27,242 INFO  [RDFUploadController] Stop ingest: 2020-02-27T05:08:27.242238Z, total time: PT1H23M8.304425S


  2. Time for each method invoked on the RDFService implementation
    1. The attached script is run over a concatenation of all vivo.all.log files created during the ingest process
    2. The script produces a report of total times for each RDFService method, like the following:

      No Format
         calls      sec   sec/call               method
      ==================================================
          8502   483.01     0.0568      changeSetUpdate
       1389889  1895.31     0.0014 sparqlConstructQuery
          7056    16.72     0.0024    sparqlSelectQuery
          4261     3.52     0.0008       sparqlAskQuery
            14     0.04     0.0029    isEquivalentGraph
      Total time: 2398.603 sec (~39 mins, or ~0 hrs)
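The first measurement can be cross-checked by recomputing the elapsed time from the two "ingest" log lines. The following is a minimal sketch, not the attached script: the function names and the regular expression are illustrative assumptions, and it relies only on the line format shown in the sample above.

```python
import re
from datetime import datetime, timedelta

# Captures the ISO 8601 UTC timestamp that follows "Start ingest:" or "Stop ingest:".
# The pattern is an assumption based on the two sample log lines shown above.
INGEST_RE = re.compile(r"\[RDFUploadController\] (Start|Stop) ingest: ([0-9T:.\-]+Z)")


def parse_utc(stamp: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so it is rewritten as an explicit UTC offset.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


def ingest_duration(lines) -> timedelta:
    """Return Stop minus Start for the ingest markers found in the given log lines."""
    marks = {}
    for line in lines:
        match = INGEST_RE.search(line)
        if match:
            marks[match.group(1)] = parse_utc(match.group(2))
    return marks["Stop"] - marks["Start"]


sample = [
    "2020-02-26 22:45:18,938 INFO  [RDFUploadController] Start ingest: 2020-02-27T03:45:18.937813Z",
    "2020-02-27 00:08:27,242 INFO  [RDFUploadController] Stop ingest: 2020-02-27T05:08:27.242238Z, total time: PT1H23M8.304425S",
]
print(ingest_duration(sample))
```

For the sample lines, the computed difference should agree with the `total time` value that VIVO itself logs on the "Stop ingest" line.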


...

  1. Update file `$VIVO_HOME/config/developer.properties`, ensuring the following options are enabled/uncommented

    Code Block
    developer.enabled = true
    developer.loggingRDFService.enable = true
    developer.loggingRDFService.queryRestriction = .*
    developer.loggingRDFService.stackRestriction = .*
    


Test results

TDB

Run 1
  1. Total time: 12min 42sec

    No Format
    2020-02-26 21:53:03,478 INFO  [RDFUploadController] Start ingest: 2020-02-27T02:53:03.478638Z
    2020-02-26 22:05:46,016 INFO  [RDFUploadController] Stop ingest: 2020-02-27T03:05:46.015668Z, total time: PT12M42.53703S


  2. Method invocation times

    No Format
       calls      sec   sec/call               method
    ==================================================
        8502    48.97     0.0058      changeSetUpdate
     1406755   380.57     0.0003 sparqlConstructQuery
       10101    11.72     0.0012    sparqlSelectQuery
       12354     2.15     0.0002       sparqlAskQuery
          14     0.83     0.0592    isEquivalentGraph
    Total time: 444.245 sec (~7 mins, or ~0 hrs)


SDB

Run 1
  1. Total time: 1hr 23min 8sec

    No Format
    2020-02-26 22:45:18,938 INFO  [RDFUploadController] Start ingest: 2020-02-27T03:45:18.937813Z
    2020-02-27 00:08:27,242 INFO  [RDFUploadController] Stop ingest: 2020-02-27T05:08:27.242238Z, total time: PT1H23M8.304425S


  2. Method invocation times

    No Format
       calls      sec   sec/call               method
    ==================================================
        8502   483.01     0.0568      changeSetUpdate
     1389889  1895.31     0.0014 sparqlConstructQuery
        7056    16.72     0.0024    sparqlSelectQuery
        4261     3.52     0.0008       sparqlAskQuery
          14     0.04     0.0029    isEquivalentGraph
    Total time: 2398.603 sec (~39 mins, or ~0 hrs)



Fuseki (local, backed by TDB)

Run 1
  1. Total time: 1hr 11min 0sec

    No Format
    2020-02-27 20:58:05,486 INFO  [RDFUploadController] Start ingest: 2020-02-28T01:58:05.486176Z
    2020-02-27 22:09:05,833 INFO  [RDFUploadController] Stop ingest: 2020-02-28T03:09:05.829769Z, total time: PT1H11M0.343593S


  2. Method invocation times

    No Format
       calls      sec   sec/call               method
    ==================================================
        1302   176.50     0.1356      changeSetUpdate
     1387044  2697.63     0.0019 sparqlConstructQuery
        6791   107.68     0.0159    sparqlSelectQuery
        2868    13.65     0.0048       sparqlAskQuery
          14     0.86     0.0615    isEquivalentGraph
    Total time: 2996.323 sec (~49 mins, or ~0 hrs)
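To compare the three runs above directly, the logged `total time` values can be converted to seconds and expressed relative to TDB. This is a rough sketch under stated assumptions: the parser handles only the hour/minute/second form of ISO 8601 durations that appears in these logs, and the helper name `duration_seconds` is hypothetical.

```python
import re

# Handles only durations of the form PT[nH][nM][n.nS], as seen in the logs above.
DURATION_RE = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?")


def duration_seconds(text: str) -> float:
    """Convert a duration such as 'PT1H23M8.304425S' to seconds."""
    match = DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


# Total ingest times copied from the "Stop ingest" log lines of each Run 1.
totals = {
    "TDB": "PT12M42.53703S",
    "SDB": "PT1H23M8.304425S",
    "Fuseki (local, backed by TDB)": "PT1H11M0.343593S",
}

baseline = duration_seconds(totals["TDB"])
for store, text in totals.items():
    secs = duration_seconds(text)
    print(f"{store:30s} {secs:9.1f} sec  {secs / baseline:5.2f}x TDB")
```

Each configuration has a single run here, so the resulting ratios are indicative rather than statistically robust.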


Read testing

This test is designed to measure the amount of time taken to read a fixed data set. The data used in this test is the published OpenVIVO content found in the vivo-project/sample-data GitHub repository, previously ingested into VIVO. For this test, the data is read by the VIVO Scholars application as it populates its dedicated Solr index.

Test procedure

The OpenVIVO test data is first ingested into VIVO as described in the "Ingest testing" procedure above. After the ingest, the VIVO Scholars application is started with a connection to the VIVO data store. During its start-up procedure, VIVO Scholars reads content from VIVO's data store to populate its dedicated Solr index.

These "read tests" measure how long VIVO Scholars takes to update its Solr index.