Most of these comments originate from discussion between Richard Rodgers, Bill Hays and Tim Donohue on 15 April 2010. Feel free to enhance or add your own notes.

DuraCloud Synchronization

How does the "DuraCloud Sync Tool" (which watches a file system folder and synchronizes with the cloud) actually work?

In the current scenario, you need to export all of DSpace as AIPs into a local file system folder and tell the DuraCloud Sync Tool to synchronize that folder into the cloud. This is inefficient, as it requires you to replicate all of your content locally (in the sync folder) before it can be replicated to the cloud via DuraCloud.

  1. Does DuraCloud automatically synchronize all changes in the local folder? For instance, if a file is deleted, is it removed from the cloud storage?
    • Tim's Answer: I talked to Bill Branan from the DuraCloud team. The current implementation of the Sync Tool always mirrors the contents of the local folder, so if you delete a file from that folder, it is also removed from cloud storage. However, after discussing our use cases, Bill agreed it may be necessary to have a way to turn off the auto-delete functionality, so that removing a file locally does not delete it from the cloud (unless you explicitly force the delete).
  2. Would it be possible to perform a "trickle" synchronization for large amounts of content? For example, if your DSpace has 1TB of content, you wouldn't want to export the entire 1TB at once locally (thus doubling your local storage needs). Rather, maybe it would be possible to export 10GB at a time to a local DuraCloud Sync Folder, and have that content "trickle" up into the cloud.
    • Tim's Answer: Again, based on discussion with Bill Branan. Currently, the DuraCloud Sync Tool does not support this sort of "trickle" synchronization. However, it could support it if there were a way to turn off "auto-delete" in the Sync Tool, so that content removed from the local sync folder would no longer be deleted from the cloud.
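To make the "trickle" idea concrete, here is a minimal sketch of how an export could be split into batches that each fit within a local size budget (for example 10GB). It is only an illustration: `plan_batches`, the file names, and the sizes are hypothetical and are not part of DSpace or the DuraCloud Sync Tool.

```python
from typing import Dict, List


def plan_batches(sizes: Dict[str, int], limit: int) -> List[List[str]]:
    """Group file names into batches whose total size stays within `limit` bytes.

    Hypothetical helper for illustration only. A single file larger than
    `limit` ends up in a batch of its own.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name in sorted(sizes):
        size = sizes[name]
        # Close the current batch if adding this file would exceed the budget.
        if current and used + size > limit:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        batches.append(current)
    return batches
```

Each batch would be exported to the sync folder, uploaded, and then cleared locally before the next batch is exported. That last step only works if clearing the folder does not also delete the content from the cloud.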

Adding an option to turn off "auto-delete" in DuraCloud Sync Tool?

In the end, it sounds like our best option may be to ask the DuraCloud Team to create a way to turn off the "auto-delete" option of the DuraCloud Sync Tool.
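The difference between the two behaviors can be modeled with a small sketch. This is not the Sync Tool's code or API; `sync` and its `auto_delete` flag are hypothetical names used only to show what one synchronization pass would leave in the cloud under each setting.

```python
from typing import Dict


def sync(local: Dict[str, str], cloud: Dict[str, str],
         auto_delete: bool = True) -> Dict[str, str]:
    """Return the cloud contents (name -> content) after one sync pass.

    Hypothetical model: `auto_delete=True` mirrors the local folder exactly,
    `auto_delete=False` only adds or overwrites and never removes.
    """
    result = dict(cloud)
    result.update(local)  # new files are added, changed files overwritten
    if auto_delete:
        # Mirror mode: drop anything that is no longer in the local folder.
        for name in list(result):
            if name not in local:
                del result[name]
    return result
```

With `auto_delete=False`, syncing a folder that holds only the second batch of an export keeps the first batch in the cloud. With the default mirroring behavior, the first batch would be removed.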

...

  1. DuraCloud will never remove any content until you explicitly tell it to. This means we may need a "cleanup" or "audit" script that removes files which still exist in DuraCloud but have been removed from DSpace.
  2. DuraCloud will accept updates to files. If you place a file into your sync folder with the name "ITEM-123454678-1.zip", and DuraCloud already has a file of that name in its storage space, DuraCloud will compare the files (via checksum). If they are different, the new file will overwrite the old file.
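A rough sketch of what such an "audit" pass might compute, given a name-to-checksum listing for DSpace and one for DuraCloud. Everything here is an assumption for illustration: `audit` and `md5_hex` are hypothetical helpers, the notes above only say "checksum" (MD5 is assumed), and how the two listings would actually be obtained is not covered.

```python
import hashlib
from typing import Dict, List, Tuple


def md5_hex(data: bytes) -> str:
    """Checksum of a file's bytes (MD5 is an assumption, see lead-in)."""
    return hashlib.md5(data).hexdigest()


def audit(dspace: Dict[str, str],
          cloud: Dict[str, str]) -> Tuple[List[str], List[str], List[str]]:
    """Compare name -> checksum maps and return (to_upload, to_overwrite, orphans).

    to_upload:    in DSpace but not yet in the cloud
    to_overwrite: in both, but the checksums differ
    orphans:      in the cloud but no longer in DSpace (cleanup candidates)
    """
    to_upload = sorted(n for n in dspace if n not in cloud)
    to_overwrite = sorted(
        n for n in dspace if n in cloud and dspace[n] != cloud[n]
    )
    orphans = sorted(n for n in cloud if n not in dspace)
    return to_upload, to_overwrite, orphans
```

The `orphans` list is what a cleanup script would need to review, since those files would otherwise stay in DuraCloud indefinitely.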

AIP Export/Import Implementation