...
In essence, Kepler was designed to model a workflow that will be run repetitively without the need for human intervention.
an effective environment for integrating disparate software components
Kepler is a Java-based application that is maintained for the Windows, OS X, and Linux operating systems.
Create sophisticated data analysis pipelines
https://kepler-project.org
Kepler is based on the Ptolemy II system for modeling, simulation, and design of concurrent, real-time, embedded systems.
http://ptolemy.eecs.berkeley.edu/ptolemyII/index.htm
A workflow in Kepler is composed of independent actors communicating through well-defined interfaces. An actor presents parameterized operations that act on an input to produce an output. The execution order and communication mechanisms of the actors in the workflow are defined in a director object.
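
The actor/director split described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Kepler's or Ptolemy II's API: the class names `Actor` and `SequentialDirector`, the `fire` and `run` methods, and the `scale` and `offset` operations are all hypothetical names chosen for the example.

```python
class Actor:
    """A parameterized operation that turns one input into one output.

    Hypothetical illustration only; not a Kepler class.
    """

    def __init__(self, name, operation, **params):
        self.name = name
        self.operation = operation
        self.params = params

    def fire(self, value):
        # Apply the operation to the input using this actor's parameters.
        return self.operation(value, **self.params)


class SequentialDirector:
    """Decides execution order: here, simply one actor after another."""

    def __init__(self, actors):
        self.actors = list(actors)

    def run(self, value):
        # Pass each actor's output to the next actor's input.
        for actor in self.actors:
            value = actor.fire(value)
        return value


def scale(x, factor=1.0):
    return x * factor


def offset(x, amount=0.0):
    return x + amount


workflow = SequentialDirector([
    Actor("scale", scale, factor=2.0),
    Actor("offset", offset, amount=1.0),
])
result = workflow.run(3.0)
print(result)  # 7.0
```

The actors know nothing about each other or about scheduling; swapping in a different director (for example, one that runs actors concurrently) would change how the workflow executes without touching the actors themselves.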
a graphical user interface for composing workflows and editing the workflow environment
a run-time engine that can execute workflows either from within the graphical interface or from the command line. A workflow allows complex tasks to be composed from simpler components and serves as an executable representation of the steps required to generate results.
Originally, Kepler was designed to download data sets into a cache on the machine where it is running, so Kepler actors run as local Java threads. However, in order to provide access to web-based resources, actors have been implemented that spawn distributed execution threads to access distributed resources.
Python scripts run in a Jython interpreter within the same JVM as Kepler. This interpreter is instantiated only once, so it does not need to be reloaded by each Python actor. On the downside, a global variable set in one actor can cause unexpected side effects if it is reset in another actor. We discovered that if a function is defined as a global (i.e. outside of a class), its name cannot be reused: the Jython interpreter always persists the code from the first definition of that function.
R and Matlab actors require that those applications be installed outside of Kepler. When an associated script is run, the code and data are passed to an external process running R or Matlab, and the actor waits for that process to finish before proceeding. Thus workflows with multiple R/Matlab steps incur the application startup overhead for each instance of the actor, which can significantly affect workflow performance.
Kepler is a very good tool for building scientific workflows for those who are familiar with software development. However, it is not particularly friendly to those who are used to . (Google Code project hydrant-kepler discussion of this issue)
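
The global-variable side effect noted above for Python actors can be illustrated outside Kepler. The sketch below is an assumption-laden stand-in: a single dictionary, `shared_namespace`, plays the role of the one interpreter shared by all Python actors, and the two script strings are hypothetical actor bodies. It runs in standard Python and shows only the shared-global problem; it does not reproduce the function-name persistence that was observed in Kepler's Jython interpreter.

```python
# One namespace stands in for the single interpreter shared by all Python actors.
shared_namespace = {}

# Hypothetical script for a first actor: defines a global and a function using it.
actor_a_script = """
threshold = 10

def passes(x):
    return x > threshold
"""

# Hypothetical script for a second actor: resets the same global name.
actor_b_script = """
threshold = 100
"""

exec(actor_a_script, shared_namespace)
before = shared_namespace["passes"](50)  # evaluated with threshold = 10

exec(actor_b_script, shared_namespace)
after = shared_namespace["passes"](50)   # evaluated with threshold = 100

print(before, after)  # True False
```

The first actor's function changes behavior even though its own code was never touched. Keeping such state in local variables or inside a class, rather than in module-level globals, is one way to limit this kind of interference.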
...