RecordHandler

A tool that helps organize and unify data transactions that take place outside of an RDF model.

This tool stores data before it is put into an RDF model. After the initial fetch, and again after translation, the information is held in this type of structure. By default, the tool lists the record ids contained in the Record Set given to it as input. Optionally, you can inspect or modify a record's contents.

RecordHandler Parameters

wordiness - (optional) sets the lowest level of log messages to be displayed to the console. The lower the log level, the more detailed the messages.

Possible Values:

recordId - (optional) specifies the recordId to inspect/modify. If no value parameter is given, the record's contents will be printed.

Example:

value - (optional - only valid when recordId is provided as well) specifies the new value for the record that is identified by the given recordId

Example:

output-file - (optional - not used when value is set) specifies a file to be the output destination for record listing/contents

Example:

input-config - (optional - at least one of this and/or inputOverride) the configuration file that describes the input record set. The parameters for this config file are described in the Record Sets section below.

Example:

inputOverride - (optional - at least one of this and/or input-config) specifies the parameters for the record set without a config file and/or overrides specific parameters from the given config file. The parameters that can be set/overridden are described in the Record Sets section below.

Example:

Record Sets

rhClass - The class for the handler for this record set

Example Values:

TextFileRecordHandler Parameters

fileDir - the directory in which to store the files for each record

Example Values:

JDBCRecordHandler Parameters

dbClass - the JDBC driver class to use

Example Values:

dbUrl - the JDBC connection url

Example Values:

dbUser - the DB username to use

Example Values:

dbPass - the DB password to use

Example Values:

dbTable - (optional) the name of the table to store data in. If the table does not exist, it will be created.

Example Values:

dataFieldName - (optional) the name of the field to store data in. If the table does not exist, the field will be created along with the table.

Example Values:

JenaRecordHandler Parameters

dataFieldType - (optional) the predicate to use for storing data properties

Example Values:

jenaConfig - (optional - at least one of this and/or full set of params defining a model as described in Models section below) the configuration file that describes the model in which to store data. The parameters for this config file are described in the Models section below.

Example Values:

RecordHandler programmatic view

RecordHandler handles both input and output of records from data repositories. By abstracting CRUD operations for the different data repositories, the system can interact with records in a standardized way.
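As a rough illustration of that abstraction, the sketch below defines a small interface with create, read, update, and delete operations and a helper that copies records between any two backends. Every name in it (SketchRecordStore, InMemoryStore, StoreTools, and their methods) is hypothetical and chosen for this example; it is not the Harvester's API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface: method names are illustrative, not the Harvester API.
interface SketchRecordStore {
    void put(String recId, String data);   // create or update
    String get(String recId);              // read (null if absent)
    void delete(String recId);             // delete
    Iterable<String> ids();                // list record ids
}

// Simplest possible backend, standing in for a file- or database-backed one.
class InMemoryStore implements SketchRecordStore {
    private final Map<String, String> records = new LinkedHashMap<>();
    public void put(String recId, String data) { records.put(recId, data); }
    public String get(String recId) { return records.get(recId); }
    public void delete(String recId) { records.remove(recId); }
    // Return a copy so callers can modify the store while iterating.
    public Iterable<String> ids() { return new ArrayList<>(records.keySet()); }
}

class StoreTools {
    // Works against any backend because it only uses the shared operations.
    static int copyAll(SketchRecordStore from, SketchRecordStore to) {
        int copied = 0;
        for (String id : from.ids()) {
            to.put(id, from.get(id));
            copied++;
        }
        return copied;
    }
}
```

Because copyAll depends only on the interface, swapping the in-memory backend for a file-based or database-based one would not change the calling code.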

The variety of record handlers allows flexibility and customization while using the VIVO Harvester. The most frequently used record handlers are the TextFileRecordHandler and the JDBCRecordHandler.

Record Class

public class Record

Constructors

Mutators

Accessors

Record Meta Data Class

public class RecordMetaData implements Comparable<RecordMetaData>

Contains a transaction log entry for a write or process operation
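A log entry of this kind could look roughly like the sketch below, which keeps a timestamp, an operation, an operator name, and an MD5 of the record data. The class name SketchMetaData, the operation labels, and the newest-first ordering are assumptions made for illustration, not taken from the Harvester source.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in for a metadata entry; not the Harvester's RecordMetaData.
class SketchMetaData implements Comparable<SketchMetaData> {
    final long timeMillis;   // UTC milliseconds since the epoch
    final String operation;  // e.g. "written" or "processed" (assumed labels)
    final String operator;   // name of the class that performed the operation
    final String md5;        // hex MD5 of the record data at that time

    SketchMetaData(long timeMillis, String operation, String operator, String data) {
        this.timeMillis = timeMillis;
        this.operation = operation;
        this.operator = operator;
        this.md5 = md5Hex(data);
    }

    // Assumed ordering: newest entry first, ties broken by operation name.
    @Override
    public int compareTo(SketchMetaData other) {
        int byTime = Long.compare(other.timeMillis, this.timeMillis);
        return byTime != 0 ? byTime : this.operation.compareTo(other.operation);
    }

    // Hex-encodes the MD5 digest of the UTF-8 bytes of the given data.
    static String md5Hex(String data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest is unavailable", e);
        }
    }
}
```

Comparing the MD5 of two entries is a cheap way to tell whether the record data changed between operations.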

Constructors

Accessors

Utilities

Record Handler Abstract Class

public abstract class RecordHandler implements Iterable<Record>

Because RecordHandler implements Iterable<Record>, you can iterate over a record set directly:

  RecordHandler recSet = new ExampleRecordHandler();
  for(Record rec : recSet){
    log.trace("===============================================================");
    log.trace("Record "+rec.getID()+":");
    log.trace("---------------------------------------------------------------");
    log.trace(rec.getData());
  }

Mutators

Accessors

Utilities

Map Record Handler

Stores records using Map<String,String> and metadata using Map<String,SortedSet<RecordMetaData>>
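The two-map layout can be sketched as follows. MapStoreSketch and its nested Entry class are hypothetical simplifications: Entry carries only a timestamp and an operation, and every write is logged as "written", which is an assumed label.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch of a map-backed record store with a per-record log.
class MapStoreSketch {
    // Minimal metadata entry, ordered by timestamp and then by operation.
    static final class Entry implements Comparable<Entry> {
        final long timeMillis;
        final String operation;
        Entry(long timeMillis, String operation) {
            this.timeMillis = timeMillis;
            this.operation = operation;
        }
        @Override
        public int compareTo(Entry other) {
            int byTime = Long.compare(timeMillis, other.timeMillis);
            return byTime != 0 ? byTime : operation.compareTo(other.operation);
        }
    }

    private final Map<String, String> data = new HashMap<>();
    private final Map<String, SortedSet<Entry>> meta = new HashMap<>();

    // Stores the data and appends a log entry for this write.
    void addRecord(String recId, String recData, long timeMillis) {
        data.put(recId, recData);
        meta.computeIfAbsent(recId, k -> new TreeSet<Entry>())
            .add(new Entry(timeMillis, "written"));
    }

    String getData(String recId) { return data.get(recId); }

    // Returns the log for a record, or an empty set if the record is unknown.
    SortedSet<Entry> getMeta(String recId) {
        return meta.getOrDefault(recId, new TreeSet<Entry>());
    }

    // Removes both the data and its log.
    void delRecord(String recId) {
        data.remove(recId);
        meta.remove(recId);
    }
}
```

Keeping the metadata in a SortedSet means the log for each record is always available in order without an explicit sort.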

Constructors

Mutators

Accessors

Utilities

Text File Record Handler

public class TextFileRecordHandler extends RecordHandler

Stores each record in location fileDir (resolved using Apache Commons VFS) as a file named recID and containing recData. This allows access to files located in (S)FTP, Local File System, HTTP(S), Temporary Files, Archive Files (Zip, Jar, Tar, tgz/tbz2, gzip, bzip2), res, ram, and mime. Metadata for each record is stored in the subdirectory fileDir/.metadata in individual files.
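The one-file-per-record layout can be approximated on a local disk with java.nio, as in the sketch below. This swaps in java.nio for Apache Commons VFS, so it covers the local file system only. The class FileStoreSketch and the "millis, tab, operation" metadata line format are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch using java.nio on the local file system only;
// the handler described above resolves fileDir through Apache Commons VFS.
class FileStoreSketch {
    private final Path fileDir;
    private final Path metaDir;

    FileStoreSketch(Path fileDir) throws IOException {
        this.fileDir = Files.createDirectories(fileDir);
        this.metaDir = Files.createDirectories(fileDir.resolve(".metadata"));
    }

    // One file per record: the file name is the record id, the content is the data.
    void addRecord(String recId, String recData, String operation) throws IOException {
        Files.write(fileDir.resolve(recId), recData.getBytes(StandardCharsets.UTF_8));
        // Assumed metadata format: one "millis<TAB>operation" line per entry.
        String line = System.currentTimeMillis() + "\t" + operation + System.lineSeparator();
        Files.write(metaDir.resolve(recId), line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Reads the record data back from the file named after the record id.
    String getData(String recId) throws IOException {
        return new String(Files.readAllBytes(fileDir.resolve(recId)), StandardCharsets.UTF_8);
    }
}
```

The sketch uses the record id directly as a file name, so ids containing path separators would need sanitizing first.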

Constructors

Mutators

Accessors

Utilities

JDBC Record Handler

public class JDBCRecordHandler extends RecordHandler

Records are stored in a single 3-column table:

<tableName>

  Column                   | Type                     | Value
  UNIQUE_AUTOGENERATED_ID  | primary_key:int(10)      | <auto_generated_id>
  recordID                 | unique_key:varchar(100)  | <rec.getID()>
  <dataFieldName>          | longtext                 | <rec.getData()>

Record metadata is stored in a separate 6-column table:

<tableName>_rmd

  Column                       | Type                            | Value
  UNIQUE_AUTOGENERATED_RMD_ID  | primary_key:int(10)             | <auto_generated_id>
  record_autogen_id            | foreign_key(tableName):int(10)  | <UNIQUE_AUTOGENERATED_ID>
  utc_milli_from_epoch         | index:int(25)                   | <rmd.getDate().getTimeInMillis()>
  operation                    | varchar(10)                     | <rmd.getOperation()>
  operator                     | varchar(100)                    | <rmd.getOperator().getName()>
  md5                          | int(32)                         | <rmd.getMD5()>

JDBCRecordHandler connects to a database located at connLine using the jdbcDriverClass, username, and password. It performs the CRUD operations using the tableName, idFieldName, and dataFieldName.
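The read side of those CRUD operations might be sketched with plain JDBC as shown below. JdbcReadSketch and its methods are hypothetical, and the id column is assumed to be named recordID as in the table layout above. The example needs an open java.sql.Connection supplied by the caller.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper, not the Harvester's JDBCRecordHandler.
class JdbcReadSketch {
    // Table and column names cannot be bound as "?" parameters, so they are
    // checked against a conservative pattern before being concatenated.
    static String ident(String name) {
        if (!name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("unsafe identifier: " + name);
        }
        return name;
    }

    // Builds the lookup query; the record id itself is bound as a parameter.
    static String selectSql(String tableName, String dataFieldName) {
        return "SELECT " + ident(dataFieldName) + " FROM " + ident(tableName)
                + " WHERE recordID = ?";
    }

    // Returns the stored data for recId, or null when no such record exists.
    static String getData(Connection conn, String tableName, String dataFieldName,
            String recId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(selectSql(tableName, dataFieldName))) {
            ps.setString(1, recId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```

Binding the record id through a PreparedStatement keeps record ids with quotes or other special characters from breaking the query.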

Constructors

Mutators

Accessors

Utilities

Jena Record Handler

public class JenaRecordHandler extends RecordHandler

Each record is stored as 3 triples:

  Subject                     | Predicate         | Object
  <UNIQUE_AUTOGENERATED_URI>  | rdf:type          | ingestNS:record
  <UNIQUE_AUTOGENERATED_URI>  | ingestNS:idField  | <recID>
  <UNIQUE_AUTOGENERATED_URI>  | <dataFieldType>   | <recData>

JenaRecordHandler connects to a Jena model named modelName located at connLine using the jdbcDriverClass, username, and password.
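To make the three-triple layout concrete, the sketch below renders one record as N-Triples lines using plain strings, with no Jena dependency. The class TripleSketch is hypothetical, the namespace URI http://example.org/ingest# is a placeholder because the real ingestNS value is not given here, and storing recID and recData as plain literals is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch that formats a record's triples as N-Triples text.
class TripleSketch {
    // Placeholder namespace; the real ingestNS URI is not given in this document.
    static final String INGEST_NS = "http://example.org/ingest#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    // Builds the type, id, and data triples for one record subject.
    static List<String> recordTriples(String subjectUri, String recId,
            String recData, String dataFieldType) {
        List<String> triples = new ArrayList<>();
        String s = "<" + subjectUri + ">";
        triples.add(s + " <" + RDF_TYPE + "> <" + INGEST_NS + "record> .");
        triples.add(s + " <" + INGEST_NS + "idField> " + literal(recId) + " .");
        triples.add(s + " <" + dataFieldType + "> " + literal(recData) + " .");
        return triples;
    }

    // Escapes backslashes, quotes, and line breaks inside an N-Triples literal.
    static String literal(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"")
                .replace("\n", "\\n").replace("\r", "\\r") + "\"";
    }
}
```

All three triples share the same subject, which is what ties the record id and the record data together in the model.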

Constructors

Mutators

Accessors

Utilities