What is Kepler?

Kepler is an environment for integrating disparate software components and creating sophisticated data analysis pipelines. In essence, it was designed to model a workflow that will be run repeatedly without the need for human intervention.
Kepler is a Java-based application that is maintained for the Windows, OSX, and Linux operating systems. See https://kepler-project.org
Kepler is based on Ptolemy II, a system for modeling, simulation, and design of concurrent, real-time, embedded systems. See http://ptolemy.eecs.berkeley.edu/ptolemyII/index.htm

How does it work?

Kepler consists of a graphical user interface and a run-time engine that can execute workflows either from within the graphical interface or from the command line. Complex tasks are composed from simpler components, so a workflow is an executable representation of the steps required to generate results.

Key Features

  • Python actor can be used to create domain- or content-specific scripts for accessing, merging and manipulating data.
  • R and Matlab actors can be used to easily perform complex statistical analyses.
  • A WebService actor can be used to access and execute WSDL-defined Web services from within a workflow.
  • An ExternalExecution actor can be used to execute command line applications from within a workflow.
  • A number of grid technology actors provide access to web-accessible data repositories and support for parallel processing. 

Issues

Python scripts run in a Jython interpreter within the same JVM as Kepler. This interpreter is only instantiated once so it does not need to be reloaded by each Python actor. On the downside, a global variable set in one actor can cause unexpected side effects if it is reset in another actor. We discovered that if a function is defined as a global (i.e. outside of a class), the method name cannot be reused. The Jython interpreter will always persist the code from the first instantiation of the method.

R and Matlab actors require that those applications be installed outside of Kepler. When an associated script is run, the code and data are passed to an external process running R or Matlab. The actor waits for that process to finish before proceeding. Thus workflows with multiple R/Matlab steps will incur the corresponding application startup overhead for each instance of the actor. This can significantly affect workflow performance.

Kepler is a very good tool for building scientific workflows for those who are familiar with software development. However, it is not particularly friendly to those who are used to . (Google Code project hydrant-kepler discussion of this issue)
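Returning to the shared Jython interpreter described above: one way to avoid claiming global function names is to keep helper functions inside the actor's class rather than at the top level of the script. The following is a minimal sketch in plain Python and has not been tested inside Kepler. The class name Main and the fire method follow the Ptolemy II Python actor convention as we understand it. The helper _normalize and the raw argument to fire are hypothetical and stand in for real port reads.

```python
# Sketch only: plain Python, no Kepler or Ptolemy classes are used.
# "Main" and "fire" are assumed to match the Python actor convention;
# "_normalize" and the "raw" argument are hypothetical illustrations.

class Main:
    """Actor script whose helpers live on the class, not at the top level."""

    def _normalize(self, value):
        # Defined as a method, so it adds no function name to the
        # global namespace shared by all Python actors.
        return value.strip().lower()

    def fire(self, raw):
        # In a real actor the value would be read from an input port
        # instead of being passed in as an argument.
        return self._normalize(raw)
```

With this layout, two actors can each carry a helper with the same name without either one being defined as a global function.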

For someone with little technical knowledge (i.e. your average scientist), Kepler is hard. As simple as the visual interface for building workflows is, it's still not simple enough. The application is analogous to a programming IDE. It is an excellent tool for building and testing workflows, but when it comes to deploying the workflows to users, simply setting them up with a copy of Kepler and expecting them to run their workflows, which may require tweaking of the actors' properties for different runs, is like expecting a user to modify a program's source code each time they want to run the program.

Another problem is that, with the exception of workflows that use grid job submission actors, all workflow execution is performed on the user's desktop. This is not an ideal environment for jobs that require heavy computation or for long-running jobs.

Of particular concern, input parameters need to be entered as strings in non-intuitive interfaces. Any validation of input parameters must be scripted via a Python or R Actor or included in the Java code of a supported Actor. When dealing with a repetitive workflow such as spreadsheet row/column manipulation, there is no support for conditional interaction during a workflow run.
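As an illustration of the scripted validation mentioned above, a Python actor could check a string parameter before passing it downstream. This is a minimal sketch in plain Python. The class ParameterValidator, its method positive_int, and the parameter name "rows" are hypothetical and are not part of Kepler. Decorators are avoided on the assumption that the script may need to run under an older Jython.

```python
# Sketch only: a hypothetical helper for validating string parameters.
# Nothing here is a Kepler API; wire it into an actor as needed.

class ParameterValidator:
    """Converts string parameters to typed values or raises ValueError."""

    def positive_int(self, text, name):
        # Parameters arrive as strings, so strip whitespace and convert.
        try:
            value = int(text.strip())
        except ValueError:
            raise ValueError("%s must be an integer, got %r" % (name, text))
        # Reject zero and negative values with a readable message.
        if value <= 0:
            raise ValueError("%s must be positive, got %d" % (name, value))
        return value
```

For example, ParameterValidator().positive_int(" 42 ", "rows") returns the integer 42, while an input such as "abc" raises a ValueError that names the offending parameter.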

Kepler 1.0

Python actors use the Jython 2.2.1 interpreter which implements a subset of the Python 2.2 language. See http://www.jython.org/archive/22/index.html for details. The PythonActor is not in the list of available actors. In order to use it you must open one of the Python demo workflows and copy one of those actors to your new workflow.

Make sure you save a copy of all your Jython scripts in a file external to Kepler. Python actors have a tendency to lose their scripts if they contain a coding error when the workflow is saved.
I agree with y'all that it should be possible to save fully scripted Python Actors for reuse. So I did some more research and found a poorly documented feature that sometimes saves scripted Python Actors in the local Kepler component library. I say sometimes because it may take a few tries to overcome the Actor's tendency to lose track of its script along the way. On occasion, I have seen similar behavior when copying Python Actors from one workflow to another. I had read of these problems online and that is what caused me to hedge on the topic of saving fully configured actors (and also why I keep a copy of the scripts outside of Kepler).

These irregularities in behavior apparently occur because the PythonActor hasn't been integrated into Kepler's Actor repository yet. The only reason we can use copied PythonActors is that the base class is available in the underlying Ptolemy II modeling engine.

New Java actors cannot be dynamically loaded into an installed instance of Kepler 1.0. Incorporating a new Java actor requires a rebuild of the entire application and a consequent rebuild of the actor database. Thus, even a small change to an actor under development requires a complete rebuild of Kepler.

Kepler 2.0 (development version)

Kepler 2.0 is very promising, with a lot of features that will help our project, but it has proven to be very buggy at times. Python actors use the Jython 2.5.1 interpreter which implements most of the Python 2.5 language. See http://wiki.python.org/jython/JythonFaq/GeneralInfo#IsJythonthesamelanguageasPython.3F for details. This is a significant improvement over the version of Jython supplied with Kepler 1.0.

Downloads

Kepler 1.0

Download Kepler 1.0 at https://kepler-project.org/users/downloads

Kepler 2.0 (development version)

Download source code from the SVN trunk and use the Kepler build system to run it. See Kepler Build System Instructions in documentation section below for details.

Documentation

Kepler's documentation is somewhat scattered through its web site so it can be difficult to find just the document that you need. Here is a list of key documents to get you started.

Getting Started Guide - PDF
Kepler User Manual - PDF
Kepler Actor Reference - PDF
Kepler Developer's Reference - wiki page
Kepler Build System Instructions - wiki page
Kepler and Eclipse - wiki page

NOTES

Python actors were heavily used to do the proof of concept because they provide a way to rapidly prototype functionality. Python actors use the Jython version of Python (http://www.jython.org). In addition to providing most standard Python functionality, the Jython interpreter provides access to any Java class or class library available to your JVM.

To learn more about Jython:
