The Bridge Intake Service gives data pulled down from the Duracloud Bridge App a path into Chronopolis. It currently handles the bagging of snapshots, which includes validating files against the manifest supplied by Duracloud.
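
As a rough illustration of what validating files against a manifest involves, the sketch below recomputes a checksum for each listed file and compares it with the manifest entry. The manifest layout (one "checksum, whitespace, relative path" pair per line), the MD5 default, and the function names are assumptions made for this example; it is not the service's actual code.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "md5") -> str:
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_manifest(snapshot_dir: Path, manifest: Path, algorithm: str = "md5") -> list[str]:
    """Return the relative paths that are missing or whose checksum differs."""
    failures = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Assumed layout: "<checksum>  <relative path>"
        expected, relative = line.strip().split(maxsplit=1)
        target = snapshot_dir / relative
        if not target.is_file() or file_digest(target, algorithm) != expected.lower():
            failures.append(relative)
    return failures
```

An empty result means every file named in the manifest was found with a matching checksum.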

Installation

Prereqs

  • A running Duracloud Bridge App for the service to connect to
  • A staging area for creating bags

Install

  1. Get the latest rpm from http://adaptci01.umiacs.umd.edu/resource/bridge-intake/master/
  2. Install the downloaded rpm with yum install

Installed files are as follows:

  • /usr/lib/chronopolis/bridge-intake.jar
  • /etc/chronopolis/application.yml
  • /etc/init.d/bridge-intake

When running, the service checks for the following directories and creates them, or corrects their permissions, if they do not match what it expects:

  • /var/log/chronopolis/
    • logging data
  • /var/lib/chronopolis/data/
    • journaled data for tokenization
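
The check described above can be pictured with the following sketch. The permission mode and the function name are assumptions for illustration only; the modes the service actually applies are not stated here.

```python
import os
import stat
from pathlib import Path


def ensure_directory(path: Path, mode: int = 0o755) -> bool:
    """Create the directory if needed and fix its permission bits.

    Returns True when something was created or changed.
    The default mode is an assumed value, not the service's setting.
    """
    changed = False
    if not path.is_dir():
        path.mkdir(parents=True)
        changed = True
    current = stat.S_IMODE(path.stat().st_mode)
    if current != mode:
        os.chmod(path, mode)
        changed = True
    return changed
```

Calling it a second time on the same path returns False, since nothing is left to create or correct.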

Configuration

Several configuration options are available, depending on where the data is bound.

Chronopolis

Prefix

Specifying Replicating Nodes

DPN

Sample application.yml with all properties

application.yml
# Cron timer for how often the bridge is polled
bridge:
  poll: 0 0 0 * * *

# Bagging Configuration
bag:
  max-size: 2
  unit: GIGABYTE
  dpn:
    node-address: University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093
    node-contact: Sibyl Schaefer
    node-email:
      - sschaefer@ucsd.edu
      - chronopolis-support-l@mailman.ucsd.edu

# General chron configuration
chron:
  node: chron
  bags: /export/bags
  tokens: /export/tokens
  preservation: /data/preservation
  restoration: /export/restore
  prefix: bridge-
  ingest:
    endpoint: http://localhost:8081/
    username: ingest-admin
    password: replace-me
  replicatingTo:
    - ucsd
    - umiacs

# DPN Configuration - the server to use when creating replications
#                     and connection information for the registry
dpnReplicationServer: dpn-staging.ucsd.edu
dpn:
  endpoint: http://localhost:3000/
  username: chron
  api-key: replace-me

# Duracloud Bridge Configuration
# The storage areas the bridge app writes in to
# Connection information to query the bridge
duracloud:
  snapshots: /export/snapshots/
  restores: /export/restore/
  bridge: 
    username: bridge
    password: replace-me
    endpoint: http://localhost:8080/

# Push settings to decide what networks to push the snapshots into
pushDPN: false
pushChronopolis: false

# Logging configuration
logging:
  file: /var/log/bridgeintake/intake.log
  level:
    org.springframework: ERROR
    org.hibernate: ERROR
    org.chronopolis: debug
    org.chronopolis.intake.duracloud.config: trace

# Extraneous settings
#   automatic cleaning of staging areas (not well tested)
#   only perform dry-runs when the Cleaner runs
#   disable SNI on https connections - false is the default and recommended value
cleanerEnabled: false
cleanDryRun: false
disableSNI: false
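
The sample above uses replace-me for each password and API key. A small pre-flight check such as the sketch below can flag any setting still left at that placeholder before the service is started. It scans the file as plain text rather than parsing YAML, and the function name is hypothetical.

```python
PLACEHOLDER = "replace-me"


def find_placeholders(config_text: str) -> list[tuple[int, str]]:
    """Return (line number, key) for every setting still set to the placeholder."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        stripped = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = stripped.partition(":")
        if sep and value.strip() == PLACEHOLDER:
            hits.append((number, key.strip()))
    return hits
```

For example, passing in the contents of /etc/chronopolis/application.yml and refusing to continue when the returned list is non-empty would catch a forgotten credential.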

Release Notes

Release 2.3.1

07 August, 2018

  • Updated logic for appending a prefix to a depositor so that it is used in every operation

Release 2.3.0

10 July, 2018

  • Code licensed under BSD-3
  • Use separate thread pools when executing Ingestion tasks
    • Long IO operations - bagging, cleaning, etc
    • Short IO operations - HTTP requests
  • Bring tokenization process up to date with Chronopolis Core code
  • Test that a depositor exists before attempting to push a Bag to Chronopolis or DPN
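
The idea behind separate thread pools is that slow, disk-bound work cannot starve quick HTTP calls of threads. The sketch below shows the pattern in Python; the pool sizes, task names, and stand-in functions are assumptions for illustration and do not reflect the service's own implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Pool sizes are illustrative assumptions, not the service's settings.
long_io_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="long-io")
short_io_pool = ThreadPoolExecutor(max_workers=8, thread_name_prefix="short-io")


def bag_snapshot(snapshot_id: str) -> str:
    """Stand-in for a slow disk-bound step such as bagging."""
    return f"bagged {snapshot_id}"


def notify_bridge(snapshot_id: str) -> str:
    """Stand-in for a quick HTTP request."""
    return f"notified {snapshot_id}"


def ingest(snapshot_id: str) -> list[str]:
    """Run the slow step on one pool and the quick step on the other."""
    bagging = long_io_pool.submit(bag_snapshot, snapshot_id)
    request = short_io_pool.submit(notify_bridge, snapshot_id)
    return [bagging.result(), request.result()]
```

Even when every long-io worker is busy bagging, requests submitted to the short-io pool still run without waiting behind them.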

Release 2.2.1

18 May, 2018

  • Fix bug where constraint for bag size was checked incorrectly
  • Fix bug where tokens were not being created because a dependency was missing

Release 2.2.0

14 May, 2018

  • Fix bug where we could potentially create the wrong number of replications
  • Add constraint satisfaction when creating DPN Replications
  • Refactor code for Chronopolis and DPN Ingest to better facilitate working on snapshots concurrently
    • Break up DPN Ingestion steps into multiple classes
    • Use standard lib interfaces instead of spring-batch for tasks

Release 2.1.0

8 February, 2018

  • Actively clean storage once data replication completes
    • For Chronopolis: Ingest reports PRESERVED
    • For DPN: All replications completed
  • Mark staging as inactive in Ingest for completed replications in Chronopolis

Release 2.0.3

29 November, 2017

  • Query on Bags we created when running Tokenization

Release 2.0.2

9 November, 2017

  • Remove creation of /var/log/chronopolis from rpm and instead create it on application startup

Release 2.0.1

8 November, 2017

  • Use correct properties when creating Ingest Requests to Chronopolis

Release 2.0.0

30 October, 2017

  • Integrate Chronopolis 2.0.0 Changes
    • Import tokenization module in order to create ACE Tokens for incoming data
    • Use new properties classes for configuration of storage and api communication
    • Capture basic metrics when creating bags
    • Update ingest flow to create bag and fixity
  • Updates to build and init scripts
    • Support EL6 and EL7
    • Quality of life changes to the sysv init script based on rpmlint

Release 1.5.1

30 June, 2017

  • Fix bug that caused replications to be created for each node in DPN

Release 1.5.0

26 June, 2017

  • Add a prefix for Chronopolis-bound collections
  • When creating replications for DPN, choose replicating nodes in a deterministic manner per snapshot so that all bags in a snapshot end up at the same DPN nodes.
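
One way to get a deterministic, per-snapshot choice is to rank the candidate nodes by a hash of the snapshot id combined with the node name and take the top entries. The sketch below uses that approach purely as an illustration; it is an assumed technique, not necessarily the one the service uses, and the function name is hypothetical.

```python
import hashlib


def choose_nodes(snapshot_id: str, candidates: list[str], count: int) -> list[str]:
    """Pick `count` nodes; the same snapshot id always yields the same nodes."""

    def score(node: str) -> str:
        return hashlib.sha256(f"{snapshot_id}:{node}".encode("utf-8")).hexdigest()

    # Sorting by score makes the result independent of the input order.
    return sorted(candidates, key=score)[:count]
```

Because the ranking depends only on the snapshot id and the node names, every bag created from the same snapshot maps to the same set of nodes.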

