DataStaR : Data Staging Repository

"_The purpose of DataStaR is to support collaboration and data sharing among researchers during the research process, and to promote publishing or archiving data and high-quality metadata to discipline-specific data centers, and/or to Cornell's own digital repository._" (see [DataStaR: An Institutional Approach to Research Data Curation|http://www.iassistdata.org/publications/iq/iq31/iqvol313steinhart.pdf])

Requirements

Accessioner's Workbench Requirements

...

Source file :

ChangeLogWriterActor.py

Input ports :

changes : ObjectToken containing a Python tuple or Java array with 2 items :
1. the current row number as an integer.
2. a list of changes made.
filename : StringToken containing the fully qualified path for the change log file.
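
The following is a minimal sketch of how the two inputs above could be combined, not the actual contents of ChangeLogWriterActor.py. The function names and the "row N: ..." line layout are assumptions made for illustration, and the Kepler token unwrapping is left out so that the logic runs as plain Python.

```python
import os
import tempfile


def format_log_lines(token):
    """Turn a (row number, list of entries) pair into log lines.

    The 'row N: entry' layout is an assumption, not the actor's real format.
    """
    row_number, entries = token
    return ["row %d: %s" % (row_number, entry) for entry in entries]


def append_to_log(filename, token):
    """Append one line per entry to the log file; return the number written."""
    lines = format_log_lines(token)
    if lines:
        with open(filename, "a") as log:
            for line in lines:
                log.write(line + "\n")
    return len(lines)


# Usage: record two changes for row 12 in a temporary log file.
log_path = os.path.join(tempfile.mkdtemp(), "changes.log")
append_to_log(log_path, (12, ["column 7: date reformatted",
                              "column 11: date reformatted"]))
with open(log_path) as log:
    written = log.read()
```

Opening the file in append mode lets the same function be called once per row without overwriting earlier entries.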

...

Other inputs :

...

The "Change Log Writer" script also needs the fully qualified path for the change log file. This can be acquired in one of two ways:
    A string parameter on the PythonActor named 'action'.
    OR
    A port on the PythonActor named 'action' containing a StringToken.

Output ports :

None

The "Change Log Writer" PythonActor is used in the following workflows:

...

CSVDatastreamDisseminationActor.py

...

Input ports :

...

None

Output port :

dissemination : StringToken containing a single row from the CSV datastream.

...

Source file :

ErrorLogWriterActor.py

...

Input ports :

...

error: ObjectToken containing a Python tuple or Java array with 2 items :
1. the current row number as an integer.
2. a list of errors encountered.
filename : StringToken containing the fully qualified path for the error log file.

...

Other inputs :

...

The "Error Log Writer" script also needs the fully qualified path for the error log file. This can be acquired in one of two ways:
    A string parameter on the PythonActor named 'action'.
    OR
    A port on the PythonActor named 'action' containing a StringToken.

Output ports :

None

The "Error Log Writer" PythonActor is used in the following workflows:

...

Source file :

NormalizeDateActor.py

...

Input port :

...

input : ObjectToken containing a Python tuple or Java array with 2 items :
1. the current row number as an integer.
2. an ordered list of columns in the row.

...

Output port :

...

output : ObjectToken containing a Python tuple or Java array with 4 items :
1. the current row number as an integer.
2. a tuple/array of values for each column in the row.
3. a tuple/array of changes made.
4. a tuple/array of errors encountered.
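
To illustrate the shape of the 4-item output, here is a hedged sketch of a date-normalizing step written as plain Python. It is not the code in NormalizeDateActor.py. The accepted input formats, the ISO 8601 target format, the 1-based column numbering, and the wording of the change and error messages are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical input formats; the formats the real actor accepts are not documented here.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize_dates(token, indexes):
    """Return (row number, values, changes, errors) for one input row."""
    row_number, columns = token
    values = list(columns)
    changes, errors = [], []
    for index in indexes:
        position = index - 1  # assumes column numbers are 1-based
        if position < 0 or position >= len(values):
            errors.append("column %d: no such column" % index)
            continue
        original = values[position].strip()
        for fmt in KNOWN_FORMATS:
            try:
                parsed = datetime.strptime(original, fmt)
            except ValueError:
                continue  # try the next known format
            normalized = parsed.strftime("%Y-%m-%d")
            if normalized != values[position]:
                changes.append("column %d: %r -> %r"
                               % (index, values[position], normalized))
                values[position] = normalized
            break
        else:
            # No format matched: leave the value untouched and report it.
            errors.append("column %d: unrecognized date %r" % (index, original))
    return row_number, tuple(values), tuple(changes), tuple(errors)


result = normalize_dates((3, ["site A", "03/05/2009", "not a date"]), [2, 3])
```

Values that cannot be parsed are passed through unchanged, so a downstream actor still receives a complete row alongside the error list.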

...

Other inputs :

...

The "Normalize Date" script also needs a list of indexes for the columns that contain dates. This can be acquired in one of two ways:

...

A string parameter on the PythonActor named 'indexes'
OR

...

A port named 'indexes' containing a StringToken.
In either case, the string must contain either a comma-separated list of column numbers or a formula describing a regular sequence from which the list can be generated. The format of the formula is START + INCREMENT * COUNT. For example, the formula 7+4*10 describes a list of 10 columns in which dates occur every 4 columns, starting with column 7. This generates the list 7,11,15,19,23,27,31,35,39,43.
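
The two accepted forms of the 'indexes' string can be expanded along the following lines. This is a sketch only; the function name parse_indexes is hypothetical, and the real script may validate its input differently.

```python
def parse_indexes(spec):
    """Expand an 'indexes' string into a list of column numbers.

    Accepts a comma-separated list ("7,11,15") or a formula of the
    form START+INCREMENT*COUNT ("7+4*10").
    """
    spec = spec.replace(" ", "")
    if "+" in spec and "*" in spec:
        start, rest = spec.split("+", 1)
        increment, count = rest.split("*", 1)
        start, increment, count = int(start), int(increment), int(count)
        # COUNT values, beginning at START and stepping by INCREMENT.
        return [start + increment * i for i in range(count)]
    return [int(part) for part in spec.split(",") if part]


print(parse_indexes("7+4*10"))
# prints [7, 11, 15, 19, 23, 27, 31, 35, 39, 43]
```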

...

The "Normalize Date" PythonActor is used in the following workflows:

...

Source file :

OutputPrepActor.py

...

Input port :

...

input : ObjectToken containing a Python tuple or Java array with 4 items :
1. the current row number as an integer.
2. a tuple/array of values for each column in the row.
3. a tuple/array of changes made.
4. a tuple/array of errors encountered.

...

Other inputs :

...

The "Output Prep" script also needs the character to be used as a separator between columns in the output text string. This can be acquired in one of two ways:
    A string parameter on the PythonActor named 'separator'
    OR
    A port named 'separator' containing a StringToken.
In either case, the string must contain the separator character.

Output ports :

output : StringToken containing a string representing the output row in a CSV file. It is constructed by concatenating the values in the columns array using a separator character.
changes : ObjectToken containing a tuple with 2 items :
1. the current row number as an integer.
2. the tuple/array of changes made received on the input port.
errors : ObjectToken containing a tuple with 2 items :
1. the current row number as an integer.
2. the tuple/array of errors encountered received on the input port.
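
The fan-out from the single 4-item input to the three output ports can be sketched as follows. The function name prepare_output is hypothetical, and the Kepler token wrapping is omitted. Values are joined as-is, with no quoting or escaping of the separator, which is an assumption based on the description of the output port above.

```python
def prepare_output(token, separator=","):
    """Split the 4-item input into the values for the three output ports."""
    row_number, columns, changes, errors = token
    # 'output' port: column values joined with the separator character.
    output = separator.join(str(value) for value in columns)
    # 'changes' and 'errors' ports: each paired with the row number.
    return (output,
            (row_number, tuple(changes)),
            (row_number, tuple(errors)))


row = (12, ("A1", "2009-03-05", "4.2"), ("column 2: date reformatted",), ())
output, changes, errors = prepare_output(row, ",")
print(output)
# prints A1,2009-03-05,4.2
```

Pairing the row number with both the changes and the errors keeps each log entry traceable to its source row after the three streams diverge.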

...

The "Output Prep" PythonActor is used in the following workflows:

...

Source file :

RowToColumnsActor.py

...

Input port :

...

row: StringToken containing a string representation of a single row in a spreadsheet or other data matrix.

...
