OutOfMemoryException when ingesting large files
A bug currently appears to cause OutOfMemoryExceptions when ingesting files that are larger than the available heap space with certain Infinispan configurations (e.g. LevelDB). This appears to be an issue in the ModeShape project, which has been reported at: https://issues.jboss.org/browse/MODE-2103
The following test case can be used to reproduce the issue: https://github.com/futures/large-files-test
Workaround
You will need a large heap size for this to work (e.g. -Xmx2048m).
Currently the only known workaround is to use a _file_ configuration for the Infinispan caches, e.g.: https://github.com/futures/fcrepo4/blob/34aab66bc26edfca3a4cbabecc4870bfd81f05da/fcrepo-http-commons/src/main/resources/config/single-file/repository.json
This can be done by setting the following property:
-Dfcrepo.modeshape.configuration=config/single-file/repository.json
Large Files on a Single Node Fedora 4 Installation
Because of https://issues.jboss.org/browse/MODE-2103, only the single-file configuration can be used for large file ingests.
The Java property fcrepo.modeshape.configuration must be set to classpath:/config/single-file/repository.json, and the heap must be allowed to grow up to 2 GB.
Example
Running Fedora 4 for large file ingests in Tomcat7
CATALINA_OPTS="-Xmx1024m -XX:MaxPermSize=256m -Dfcrepo.modeshape.configuration=classpath:/config/single-file/repository.json" bin/catalina.sh run
Using the single-file configuration, ingest and retrieval of files of up to 300 GB through Fedora 4's REST API were tested successfully. The files were ingested sequentially and then retrieved, and a bitwise comparison with the original data was performed. Larger sizes have not been tested because of HDD size limitations.
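The bitwise comparison step can be done without holding either file in memory, which matters at these sizes. The following is a minimal sketch of one way to do it, not the script used for the tests above; the function name `files_identical` and the 1 MiB chunk size are assumptions chosen for illustration.

```python
def files_identical(original_path, retrieved_path, chunk_size=1024 * 1024):
    """Compare two regular files byte for byte, reading fixed-size chunks.

    Memory use stays at roughly two chunks regardless of file size.
    """
    with open(original_path, "rb") as original, open(retrieved_path, "rb") as retrieved:
        while True:
            expected = original.read(chunk_size)
            actual = retrieved.read(chunk_size)
            if expected != actual:
                # Covers differing content and differing lengths.
                return False
            if not expected:
                # Both files reached end-of-file at the same point.
                return True
```

Called as `files_identical("original.bin", "retrieved.bin")` (both paths hypothetical), it returns `True` only when the downloaded copy matches the ingested file exactly.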
Large File Upload/Download Roundtrip via REST API Tests
- Platform: lib-devsandbox1.ucsd.edu (all data on NAS to handle large files)
- Repository Profile: Minimal
- Workflow Profile: Upload/Download Roundtrip
File Size | Upload | Download |
---|---|---|
256GB | 15,488,156ms (16.9MB/sec) | 3,306,756ms (79.3MB/sec) |
512GB | 31,262,610ms (16.77MB/sec) | 5,386,542ms (97.33MB/sec) |
1TB | 59,631,142ms (17.58MB/sec) | 15,120,135ms (69.35MB/sec) |
- Conclusion: Arbitrarily large files can be ingested via the REST API. The only apparent limitations are the disk space available to store the file and a sufficiently large Java heap size (tested with -Xmx2048m).
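The throughput figures in the table above are consistent with binary units (1 MB = 1024 * 1024 bytes, 1 GB = 1024 MB) divided by the duration in seconds. The sketch below reproduces that arithmetic; the function and constant names are illustrative, and the unit interpretation is inferred from the numbers rather than stated in the test notes.

```python
MIB = 1024 ** 2  # bytes per MB, assuming binary units
GIB = 1024 ** 3  # bytes per GB, assuming binary units


def throughput_mb_per_sec(size_bytes, duration_ms):
    """Return throughput in MB/sec for a transfer of size_bytes taking duration_ms."""
    return (size_bytes / MIB) / (duration_ms / 1000.0)


# 256 GB upload row from the table: 15,488,156 ms
print(round(throughput_mb_per_sec(256 * GIB, 15_488_156), 2))  # 16.93
```

The same function applied to the 512 GB download row (5,386,542 ms) gives 97.33 MB/sec, matching the table.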
Large File Upload/Download Roundtrip Tests
- Platform: Linux 3.12.1-1-ARCH #1 SMP PREEMPT x86_64 GNU/Linux 16GB RAM
- Repository Profile: Single-File
- Workflow Profile: Upload/Download Roundtrip
File Size | Upload | Download |
---|---|---|
256GB | 15,488,156ms (16.9MB/sec) | 3,306,756ms (79.3MB/sec) |
512GB | | |
Federated Content Large File Download Roundtrip Tests
- Platform: Linux 3.12.1-1-ARCH #1 SMP PREEMPT x86_64 GNU/Linux 16GB RAM
- Repository Profile: Single-File with an additional external resource:
"externalSources" : {
  "home-directory" : {
    "classname" : "org.modeshape.connector.filesystem.FileSystemConnector",
    "directoryPath" : "/tmp/projection",
    "projections" : [ "default:/projection => /" ],
    "readOnly" : true,
    "addMimeTypeMixin" : true
  }
}
File Size | Projection Directory Request Duration | First Projected Node Request Duration | Download Duration | Throughput |
---|---|---|---|---|
2 GB | 0m35.117s | 0m34.572s | 0m8.236s | 248.66 MB/sec |
10 GB | | | | |
100 GB | | | | |
300 GB | | | | |
10*10 GB | | | | |
- Platform: lib-devsandbox1.ucsd.edu (all data on NAS to handle large files)
- Repository Profile: Minimal, with filesystem federation:
"externalSources" : {
  "filesystem" : {
    "classname" : "org.modeshape.connector.filesystem.FileSystemConnector",
    "directoryPath" : "/mnt/isilon/fedora-dev/federated",
    "projections" : [ "default:/projection => /" ],
    "readOnly" : true,
    "addMimeTypeMixin" : true,
    "contentBasedSha1" : "false"
  }
}
Objects | Datastream Size | Projection Directory | Projected Node Request Duration | Download | Download Throughput |
---|---|---|---|---|---|
1 | 1 GB | 417 ms | 35 ms | 17,333 ms | 59.08 MB/sec |
1 | 2 GB | 528 ms | 219 ms | 26,902 ms | 76.13 MB/sec |
1 | 4 GB | 432 ms | 54 ms | 47,581 ms | 86.08 MB/sec |
1 | 8 GB | 583 ms | 90 ms | 90,705 ms | 90.31 MB/sec |
1 | 16 GB | 691 ms | 452 ms | 176,508 ms | 92.82 MB/sec |
1 | 32 GB | 445 ms | 34 ms | 348,488 ms | 94.03 MB/sec |
1 | 64 GB | 750 ms | 460 ms | 699,937 ms | 93.63 MB/sec |
1 | 128 GB | 800 ms | 90 ms | 1,412,640 ms | 92.79 MB/sec |
1 | 256 GB | 530 ms | 70 ms | 2,768,570 ms | 94.69 MB/sec |
1 | 512 GB | 490 ms | 80 ms | 5,893,420 ms | 88.96 MB/sec |
1 | 1 TB | 420 ms | 40 ms | 11,322,330 ms | 92.61 MB/sec |
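Download durations like those above can be measured by draining the response in fixed-size chunks and timing the loop, so that the client never buffers the whole datastream. The sketch below shows one possible approach; it is not the tool used to produce the table, and `timed_drain` and the commented URL are hypothetical.

```python
import time


def timed_drain(stream, chunk_size=1024 * 1024):
    """Read a binary stream to the end and return (bytes_read, seconds_elapsed)."""
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    return total, time.perf_counter() - start


# Hypothetical usage against a projected node (the URL is an assumption):
# from urllib.request import urlopen
# with urlopen("http://localhost:8080/rest/projection/large.bin") as response:
#     size, seconds = timed_drain(response)
```

Any object with a `read(n)` method works, so the same function can time a local file read as a baseline for the storage backend.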