...

Large Datasets and Out of Memory Errors

When using the SyncTool to transmit data sets with a large number of files (i.e. hundreds of thousands of files or more), users occasionally run into out of memory errors. Users with sufficient memory resources on their machines can usually remedy this problem by increasing the maximum heap space available to the Java VM. We recommend starting with a max heap space setting of at least 1 GB when working with sets of over 100,000 files. If the problem persists, try increasing the memory value until the problem ceases to manifest. To increase the heap space, use the -Xmx java option.
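As a rough illustration of the guidance above, the following sketch counts the files under a directory and prints a starting heap flag when the count exceeds the 100,000-file threshold. The function names count_files and suggest_heap_flag are hypothetical and are not part of the SyncTool; the threshold and the 1 GB starting value are taken from the recommendation above.

```shell
#!/bin/bash
# Sketch only: suggest a starting -Xmx flag based on how many files will be synced.
# count_files and suggest_heap_flag are hypothetical helper names.

# Count regular files under a directory, recursively.
# Note: file names containing newlines would be counted more than once.
count_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Print "-Xmx1g" when the directory holds more files than the threshold;
# print nothing otherwise.
# Usage: suggest_heap_flag <dir> [threshold]
suggest_heap_flag() {
  local dir=$1
  local threshold=${2:-100000}
  local n
  n=$(count_files "$dir")
  if [ "$n" -gt "$threshold" ]; then
    echo "-Xmx1g"
  fi
}
```

For example, suggest_heap_flag /path/to/content prints -Xmx1g only when the directory contains more than 100,000 files. If out of memory errors still occur with that setting, a larger value has to be chosen by hand.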

An alternative solution is to upload files in smaller sets. The prefix option can be used to ensure that files are added to DuraCloud with the preferred ID values.

To run the SyncTool in UI mode with 1 GB of heap memory space, download the Jar version of the SyncTool and execute the following on the command line:

Code Block
#!/bin/bash

# For 1 GB
java -Xmx1024m -jar duracloudsync-{version}.jar [parameters]
# or
java -Xmx1g -jar ...
# For 2 GB
java -Xmx2048m ...
# or
java -Xmx2g ...

To run the SyncTool in command-line mode with 1 GB of heap memory space, download the Jar version of the SyncTool and execute the above command followed by the command line parameter values. 

Prerequisites

Info

As of DuraCloud version 2.2.0, the SyncTool requires Java 7 to run. As of version 3.3.0, DuraCloud is primarily tested using Java 8, and this is the recommended Java version for building and running DuraCloud tools. The latest version of Java can be downloaded from here.
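One way to check the installed Java version against these requirements is sketched below. The helper names java_major and installed_java_version are hypothetical, and the parsing assumes the usual java -version output, where the version string appears in double quotes on the first line.

```shell
#!/bin/bash
# Sketch only: determine the major version of the installed Java runtime.
# java_major and installed_java_version are hypothetical helper names.

# Extract the major version from a version string.
# Legacy strings start with "1." (for example 1.7.0_80 or 1.8.0_292);
# newer strings start with the major version (for example 11.0.2 or 17).
java_major() {
  local v=$1
  case "$v" in
    1.*) v=${v#1.}; echo "${v%%.*}" ;;
    *)   echo "${v%%[.-]*}" ;;
  esac
}

# Read the quoted version string from `java -version` (written to stderr).
installed_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# Example usage (commented out so that the helpers can be sourced safely):
# major=$(java_major "$(installed_java_version)")
# if [ "${major:-0}" -lt 7 ]; then
#   echo "Java 7 or later is required" >&2
# fi
```

Separating the string parsing from the call to java keeps the version logic easy to check without a Java installation.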

...