...

  1. Stores: The underlying storage provider over which the service will run
  2. Space containing content items: The DuraCloud space in which the content items to be verified reside
  3. Verify integrity of an item list mode
    1. Input listing name: Name of the content item which contains the listing of items over which to run the service

Bit Integrity Checker - Tools

Description:

The Bit Integrity Checker Tools provide additional bit integrity checking utilities which can be used to perform specific integrity checking tasks.

...

  1. Mode 1 - Generate integrity information for a Space
    1. Get integrity information from...
      1. The storage provider: Determine the file MD5 by asking the storage provider for its stored MD5 value
      2. The files themselves: Determine the file MD5 by retrieving them from the storage provider and computing the MD5
    2. Stores: The underlying storage provider in which the following space resides
      1. Space containing content items: The DuraCloud space in which the content items to be considered reside
  2. Mode 2 - Generate integrity information for an item list
    1. Get integrity information from...
      1. The storage provider: Determine the file MD5 by asking the storage provider for its stored MD5 value
      2. The files themselves: Determine the file MD5 by retrieving them from the storage provider and computing the MD5
      3. Input listing name: Name of the content item which contains the listing of items over which to run the service
    2. Stores: The underlying storage provider in which the following space resides
      1. Space with input listing: The DuraCloud space in which the input listing file resides
  3. Mode 3 - Compare two integrity reports
    1. Input listing name: Name of the first content item which contains a listing of items to be compared to the second listing
    2. Second input listing name: Name of the second content item which contains a listing of items to be compared to the first listing
    3. Stores: The underlying storage provider in which the following spaces reside
      1. Space with input listing: The DuraCloud space in which the first input listing file resides
      2. Space with second input listing: The DuraCloud space in which the second input listing file resides
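
The two ways of obtaining an MD5 and the report comparison in Mode 3 can be illustrated with a minimal Python sketch. It is not the service's implementation. The listing format used here (one tab-separated "content ID, MD5" pair per line) is an assumption made for illustration, and the function names are hypothetical.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a binary stream, reading it in chunks.

    This mirrors the "files themselves" option: the content is read and
    the checksum is computed locally instead of being looked up.
    """
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def parse_listing(text):
    """Parse a listing into {content_id: md5}.

    ASSUMPTION: each non-empty line is "<content id><TAB><md5>".
    The real listing format may differ.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        content_id, md5 = line.rsplit("\t", 1)
        entries[content_id] = md5.lower()
    return entries


def compare_listings(first, second):
    """Report items whose MD5s differ and items present in only one listing."""
    shared = first.keys() & second.keys()
    return {
        "mismatched": sorted(k for k in shared if first[k] != second[k]),
        "only_in_first": sorted(first.keys() - second.keys()),
        "only_in_second": sorted(second.keys() - first.keys()),
    }


if __name__ == "__main__":
    # Compute a checksum from content, then compare two small listings.
    print(md5_of_stream(io.BytesIO(b"example content")))
    first = parse_listing("a.txt\tabc\nb.txt\tdef\n")
    second = parse_listing("a.txt\tabc\nb.txt\txyz\n")
    print(compare_listings(first, second))
```

Reading in chunks keeps memory use flat for large content items, which matters when the checksum is computed from the files rather than requested from the storage provider.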

...

Bit Integrity Checker - Bulk

Description:

The Bulk Bit Integrity Checker provides a simple way to determine checksums (MD5s) for all content items in any particular space by leveraging an Amazon Hadoop cluster. This service is designed for large datasets (larger than 10 GB).

Configuration Options:

  1. Source Space to verify: DuraCloud space where source files are stored
  2. Destination Space: DuraCloud space where report file will be placed
  3. Service Mode
    1. Verify integrity of a Space: Retrieves all items in a space, computes the checksum of each, and compares that value with the MD5 value available from the storage provider
    2. Verify integrity from an item list: Retrieves all items listed in the item list, computes the checksum of each, and compares that value with the MD5 value provided in the item list
      1. Space with input listing: The DuraCloud space in which the input listing file resides
      2. Input listing name: Name of the content item which contains the listing of items over which to run the service
  4. Standard vs. Advanced Bulk configuration
    1. Standard: Allows the user to choose "Optimize for cost" or "Optimize for speed"; the service is then set up to run automatically
    2. Advanced: Allows the user to configure the number and type of servers that will be used to run the job
      1. Number of Server Instances: The number of servers used to perform the MD5 generation task.
      2. Type of Server: The type (size) of server used to perform the task. The larger the server, the faster the processing will occur. Larger servers also cost more than smaller servers to run. For more information, see the Amazon EC2 documentation.
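
The per-item check behind "Verify integrity from an item list" can be sketched in a few lines of Python. This is a single-process illustration only, not the Hadoop job the service runs. The `fetch` callable, the status labels, and the dictionary form of the item list are all hypothetical stand-ins.

```python
import hashlib


def verify_items(expected, fetch):
    """Check each listed item against its expected MD5.

    expected: {content_id: md5 hex string} (ASSUMED shape of the item list)
    fetch:    hypothetical callable returning the item's bytes, raising
              KeyError when the item cannot be found
    Returns {content_id: status}; the status labels are illustrative.
    """
    report = {}
    for content_id, expected_md5 in expected.items():
        try:
            data = fetch(content_id)
        except KeyError:
            report[content_id] = "MISSING"
            continue
        actual_md5 = hashlib.md5(data).hexdigest()
        if actual_md5 == expected_md5.lower():
            report[content_id] = "VALID"
        else:
            report[content_id] = "MISMATCH"
    return report


if __name__ == "__main__":
    # An in-memory dict stands in for the space being verified.
    store = {"a.txt": b"hello"}
    item_list = {
        "a.txt": hashlib.md5(b"hello").hexdigest(),
        "b.txt": "0" * 32,
    }
    print(verify_items(item_list, store.__getitem__))
```

Verifying a whole space follows the same loop, with the expected MD5 taken from the storage provider instead of from the item list.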

...