...

The "Change Log Writer" PythonActor is used in the following workflows:

  • FCRepoDateNormalizer
  • FCRepoLatitudeNormalizer
  • FCRepoLongitudeNormalizer
  • FCRepoLatLongSplitter

CSV Datastream Dissemination Actor

...

Screenshot :

Required Packages :
  • JyFedoREST
  • FCRepoKepler - Uses SimpleHTMLFormDialog to display and manage the form.
Source file :

CSVDatastreamDisseminationActor.py

...

The "CSV Datastream Dissemination" PythonActor is used in the following workflows:

  • FCRepoDateNormalizer
  • FCRepoLatitudeNormalizer
  • FCRepoLongitudeNormalizer
  • FCRepoLatLongSplitter

Error Log Writer Actor

This actor writes a log file with a summary of errors encountered during the latest run of the workflow.

...

The "Error Log Writer" PythonActor is used in the following workflows:

  • FCRepoDateNormalizer
  • FCRepoLatitudeNormalizer
  • FCRepoLongitudeNormalizer
  • FCRepoLatLongSplitter

Normalize Date Actor

This actor looks at each column that contains a date value and removes extraneous time data when present. It

...

Required Packages :

...

  • FCRepoKepler - script is based on the RowAnalyzer class in fcrepo.kepler.RowAnalyzer.

...

Source file :

...

NormalizeDateActor.py

...

Input port :

...

  • input : ObjectToken containing a Python tuple or Java array with 2 items :
    1. the current row number as an integer.
    2. an ordered list of columns in the row.

...

Other inputs :

...

The "Normalize Date" script also needs a list of indexes for the columns that contain dates. This can be acquired in one of two ways:
    A string parameter on the PythonActor named 'indexes'
    OR
    A port named 'indexes' containing a StringToken.
In either case, the string must contain either a comma-separated list of column numbers or a formula describing a regular sequence that can be used to generate the list. The format of the formula is START + INCREMENT * COUNT. For example, the formula 7+4*10 means that the list has 10 columns and that dates occur every 4 columns, starting with column 7. This would generate the list 7,11,15,19,23,27,31,35,39,43.
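The two accepted forms of the 'indexes' string can be expanded as in the following minimal sketch. The function name parse_indexes is hypothetical and is not taken from NormalizeDateActor.py or the RowAnalyzer class; the sketch only reproduces the START + INCREMENT * COUNT arithmetic described above.

```python
def parse_indexes(spec):
    """Expand an 'indexes' string into a list of column numbers.

    Hypothetical helper: accepts either a comma-separated list such as
    '3,8,12' or a formula of the form START+INCREMENT*COUNT such as '7+4*10'.
    """
    spec = spec.strip()
    if '+' in spec and '*' in spec:
        # Formula form: START + INCREMENT * COUNT
        start, rest = spec.split('+', 1)
        increment, count = rest.split('*', 1)
        start, increment, count = int(start), int(increment), int(count)
        return [start + increment * i for i in range(count)]
    # List form: comma-separated column numbers
    return [int(item) for item in spec.split(',') if item.strip()]


print(parse_indexes('7+4*10'))  # [7, 11, 15, 19, 23, 27, 31, 35, 39, 43]
print(parse_indexes('3, 8,12'))  # [3, 8, 12]
```

Whether the actual script tolerates spaces inside the string is not stated here; the sketch simply strips them.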

...

Output port :

...

  • output : ObjectToken containing a Python tuple or Java array with 4 items :
    1. the current row number as an integer.
    2. a tuple/array of values for each column in the row.
    3. a tuple/array of changes made.
    4. a tuple/array of errors encountered.

The "Normalize Date" PythonActor is used in the following workflows:

  • FCRepoDateNormalizer

Normalize Latitude Actor

This actor looks at each column that contains a latitude value and ensures that all values are valid and in the same format.

...

Required Packages :

...

  • FCRepoKepler - script is based on the RowAnalyzer class in fcrepo.kepler.RowAnalyzer.

Source file :

NormalizeLatitudeActor.py

...

Input port :

...

  • input : ObjectToken containing a Python tuple or Java array with 2 items :
    1. the current row number as an integer.
    2. an ordered list of columns in the row.

...

Other inputs :

...

The "Normalize Latitude" script also needs a list of indexes for the columns that contain latitudes. This can be acquired in one of two ways:
    A string parameter on the PythonActor named 'indexes'
    OR
    A port named 'indexes' containing a StringToken.
In either case, the string must contain either a comma-separated list of column numbers or a formula describing a regular sequence that can be used to generate the list. The format of the formula is START + INCREMENT * COUNT. For example, the formula 7+4*10 means that the list has 10 columns and that latitudes occur every 4 columns, starting with column 7. This would generate the list 7,11,15,19,23,27,31,35,39,43.

...

Output port :

...

  • output : ObjectToken containing a Python tuple or Java array with 4 items :
    1. the current row number as an integer.
    2. a tuple/array of values for each column in the row.
    3. a tuple/array of changes made.
    4. a tuple/array of errors encountered.
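The relationship between the 2-item input and the 4-item output can be illustrated with a rough sketch. Everything beyond the tuple shapes is an assumption made for illustration: the function name, the use of 0-based positions, decimal degrees as the target format, four decimal places, and the layout of each change and error entry are not taken from NormalizeLatitudeActor.py. The only outside fact used is that a valid latitude lies between -90 and 90 degrees.

```python
def normalize_latitudes(row, indexes):
    """Return (row_number, values, changes, errors) for one input row.

    Hypothetical sketch: 'indexes' holds 0-based positions of latitude
    columns, and values are assumed to be decimal-degree strings.
    """
    row_number, columns = row
    values = list(columns)
    changes, errors = [], []
    for index in indexes:
        raw = values[index]
        try:
            latitude = float(raw)
        except ValueError:
            errors.append((index, raw, 'not a number'))
            continue
        if not -90.0 <= latitude <= 90.0:
            errors.append((index, raw, 'out of range'))
            continue
        formatted = '%.4f' % latitude  # assumed common format
        if formatted != raw:
            values[index] = formatted
            changes.append((index, raw, formatted))
    return (row_number, tuple(values), tuple(changes), tuple(errors))


result = normalize_latitudes((3, ('site A', '45.5', '91', 'x')), [1, 2, 3])
print(result[1])  # ('site A', '45.5000', '91', 'x')
```

Invalid values are left unchanged in the row and reported in the errors tuple, so a downstream actor can decide how to handle them.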

The "Normalize Latitude" PythonActor is used in the following workflows:

  • FCRepoLatNormalizer

Normalize Longitude Actor

This actor looks at each column that contains a longitude value and ensures that all values are valid and in the same format.

...

Required Packages :

...

  • FCRepoKepler - script is based on the RowAnalyzer class in fcrepo.kepler.RowAnalyzer.

...

Source file :

...

NormalizeLongitudeActor.py

...

Input port :

...

  • input : ObjectToken containing a Python tuple or Java array with 2 items :
    1. the current row number as an integer.
    2. an ordered list of columns in the row.

...

Other inputs :

...

The "Normalize Longitude" script also needs a list of indexes for the columns that contain longitudes. This can be acquired in one of two ways:
    A string parameter on the PythonActor named 'indexes'
    OR
    A port named 'indexes' containing a StringToken.
In either case, the string must contain either a comma-separated list of column numbers or a formula describing a regular sequence that can be used to generate the list. The format of the formula is START + INCREMENT * COUNT. For example, the formula 7+4*10 means that the list has 10 columns and that longitudes occur every 4 columns, starting with column 7. This would generate the list 7,11,15,19,23,27,31,35,39,43.

...

Output port :

...

  • output : ObjectToken containing a Python tuple or Java array with 4 items :
    1. the current row number as an integer.
    2. a tuple/array of values for each column in the row.
    3. a tuple/array of changes made.
    4. a tuple/array of errors encountered.

The "Normalize Longitude" PythonActor is used in the following workflows:

  • FCRepoLongNormalizer

Output Prep Actor

This actor sorts through the output created by a RowAnalysis script and routes the data to the proper output writer.

...

Source file :

...

OutputPrepActor.py

...

Input port :

...

  • input : ObjectToken containing a Python tuple or Java array with 4 items :
    1. the current row number as an integer.
    2. a tuple/array of values for each column in the row.
    3. a tuple/array of changes made.
    4. a tuple/array of errors encountered.

...

Other inputs :

...

The "Output Prep" script also needs the character to be used as a separator between columns in the output text string. This can be acquired in one of two ways:
    A string parameter on the PythonActor named 'separator'
    OR
    A port named 'separator' containing a StringToken.
In either case, the string must contain the separator character (for example, ',').

Output ports :
  • output : StringToken containing a string representing the output row in a CSV file. It is constructed by concatenating the values in the columns array using a separator character.
  • changes : ObjectToken containing a tuple with 2 items :
    1. the current row number as an integer.
    2. the tuple/array of changes made received on the input port.
  • errors : ObjectToken containing a tuple with 2 items :
    1. the current row number as an integer.
    2. the tuple/array of errors encountered received on the input port.
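The routing described by these ports can be sketched as follows. The function name prepare_output is hypothetical and is not taken from OutputPrepActor.py; the sketch only shows how the 4-item input splits into the three outputs listed above.

```python
def prepare_output(row_result, separator=','):
    """Split a 4-item row result into values for the three output ports.

    Hypothetical sketch; returns (output, changes, errors) where 'output'
    is the joined row text and the other two pair the row number with the
    changes and errors received on the input.
    """
    row_number, columns, changes, errors = row_result
    output = separator.join(str(value) for value in columns)
    return output, (row_number, changes), (row_number, errors)


row_result = (12, ('2009-04-01', '45.5', '-93.2'), ('date trimmed',), ())
output, changes, errors = prepare_output(row_result)
print(output)   # 2009-04-01,45.5,-93.2
print(changes)  # (12, ('date trimmed',))
print(errors)   # (12, ())
```

Whether the actual actor still emits a token on 'changes' or 'errors' when the corresponding tuple is empty is not stated here; the sketch always returns both.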

The "Output Prep" PythonActor is used in the following workflows:

  • FCRepoDateNormalizer
  • FCRepoLatitudeNormalizer
  • FCRepoLongitudeNormalizer
  • FCRepoLatLongSplitter

Row To Columns Actor

This actor splits a text string representing a 'row' into 'columns' using a separator character such as ','.

...

Source file :

...

RowToColumnsActor.py

...

Input port :

...

  • row: StringToken containing a string representation of a single row in a spreadsheet or other data matrix.

...

Output port :

...

  • columns - ObjectToken containing a Python tuple or Java array with 2 items :
    1. the current row number as an integer.
    2. a tuple containing an ordered list of values for each column in the row.
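A minimal sketch of this split is shown below. The function name and the explicit row_number argument are assumptions made for illustration; how RowToColumnsActor.py actually tracks the row number is not described here. A plain split also does not handle quoted fields that contain the separator.

```python
def row_to_columns(row, row_number, separator=','):
    """Return (row_number, columns) for one line of delimited text.

    Hypothetical sketch: splits on every occurrence of the separator and
    does no quote handling.
    """
    return (row_number, tuple(row.split(separator)))


print(row_to_columns('2009-04-01,45.5,-93.2', 1))
# (1, ('2009-04-01', '45.5', '-93.2'))
```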

...

The "Row To Columns" PythonActor is used in the following workflows:

  • FCRepoDateNormalizer
  • FCRepoLatitudeNormalizer
  • FCRepoLongitudeNormalizer
  • FCRepoLatLongSplitter

Kepler Workflows

Kepler workflows were developed to illustrate how Kepler might be used as an accessioner's workbench.

...

Retrieves a CSV datastream from an object in a Fedora Repository, processes date columns to standardize their format and saves the results to a local file in CSV format.

...

Screenshots :

...


...

Source file :

...

FCRepoDateNormalizer.xml

FCRepoLatitudeNormalizer

Retrieves a CSV datastream from an object in a Fedora Repository, processes latitude columns to standardize their format and saves the results to a local file in CSV format.

...

Screenshots :

...


...

Source file :

...

FCRepoLatNormalizer.xml

FCRepoLongitudeNormalizer

Retrieves a CSV datastream from an object in a Fedora Repository, processes longitude columns to standardize their format and saves the results to a local file in CSV format.

...

Screenshots :

...


...

Source file :

...

FCRepoLongNormalizer.xml