where the third method can usually be converted into the second. One consequence of this is a lack of what one would consider full lifecycle semantics: there is no method by which a task could 'clean itself up' after use. This can entail a few gyrations, or at any rate a certain task design discipline, in some circumstances. Let us take a concrete example: a task that needs to write some data to a stream for each object it receives. The simplest apparent way to code this is:

Code Block
public class StreamTaskTake1 implements CurationTask
{
   private OutputStream out;

   public void init(Curator curator, String taskId) throws IOException
   {
       out = new FileOutputStream("somewhere", true);
   }

   public int perform(DSpaceObject dso) throws IOException
   {
       // ...
       out.write(dso.getHandle().getBytes());
       // ...
   }
}

but of course this isn't very satisfactory, since the task never closes the stream it opened. The task has no apparent way of determining when it is called for the last time, so there isn't an obvious way around this. (There are in fact several ways, e.g. the task can annotate itself as @Distributive and have complete control over how it is called, but this can add substantial complexity.) So we are usually led to a solution like this:

Code Block
public class StreamTaskTake2 implements CurationTask
{
   private OutputStream out;

   public void init(Curator curator, String taskId) throws IOException
   {
   }

   public int perform(DSpaceObject dso) throws IOException
   {
       // ...
       out = new FileOutputStream("somewhere", true);
       out.write(dso.getHandle().getBytes());
       out.close();
       // ...
   }
}

This version is formally correct, and indeed exhibits the quite desirable trait of not holding a file descriptor when not in use, but we might chafe at the thought that we are doing fairly inefficient IO by re-opening the file every time if this task is invoked on a collection of 1000 items. Thus the idea of curator resource management: suppose we could simply ask the curation system to manage the issue? Like so:

Code Block
public class StreamTaskTake3 implements CurationTask
{
   private OutputStream out;

   public void init(Curator curator, String taskId) throws IOException
   {
       out = new FileOutputStream("somewhere", true);
       // let the curator worry about this..
       curator.enrollResource(out, "close");
   }

   public int perform(DSpaceObject dso) throws IOException
   {
       // ...
       out.write(dso.getHandle().getBytes());
       // ...
   }
}

That is, the enrollResource method asks the CS to ensure that, when the curator has finished its work, it calls 'out.close()' on the stream. The "close" argument is called the policy, and it is the job of the CS to enforce the policy. Currently, we have only looked at 'close' and 'flush' as policies, but it would not be difficult to imagine others.
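
To make the idea of policy enforcement concrete, here is a minimal sketch of the bookkeeping a curator might do behind enrollResource. It is an illustration under stated assumptions, not the DSpace implementation: the ResourceRegistry and TrackingStream classes and the releaseAll method are hypothetical names invented for this example, and only the enrollResource(resource, policy) call shape and the 'close' and 'flush' policies come from the text above.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.Flushable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the curator's resource bookkeeping (not the DSpace Curator). */
class ResourceRegistry {
    /** One enrolled resource together with the policy to apply to it. */
    private static final class Enrollment {
        final Object resource;
        final String policy;

        Enrollment(Object resource, String policy) {
            this.resource = resource;
            this.policy = policy;
        }
    }

    private final List<Enrollment> enrolled = new ArrayList<>();

    /** Mirrors the enrollResource(resource, policy) call used by the task. */
    void enrollResource(Object resource, String policy) {
        if (!"close".equals(policy) && !"flush".equals(policy)) {
            throw new IllegalArgumentException("Unknown policy: " + policy);
        }
        enrolled.add(new Enrollment(resource, policy));
    }

    /**
     * Assumed to be called once, when the curator has finished its work.
     * Applies each policy, most recently enrolled first, and keeps going
     * if one resource fails so the others are still released.
     */
    void releaseAll() throws IOException {
        IOException first = null;
        for (int i = enrolled.size() - 1; i >= 0; i--) {
            try {
                apply(enrolled.get(i));
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        enrolled.clear();
        if (first != null) {
            throw first;
        }
    }

    private static void apply(Enrollment e) throws IOException {
        if ("close".equals(e.policy) && e.resource instanceof Closeable) {
            ((Closeable) e.resource).close();
        } else if ("flush".equals(e.policy) && e.resource instanceof Flushable) {
            ((Flushable) e.resource).flush();
        } else {
            throw new IOException("Cannot apply policy '" + e.policy + "' to "
                    + e.resource.getClass().getName());
        }
    }
}

/** Illustrative stream that records which lifecycle calls it received. */
class TrackingStream extends ByteArrayOutputStream {
    boolean closed;
    boolean flushed;

    @Override
    public void close() throws IOException {
        closed = true;
        super.close();
    }

    @Override
    public void flush() throws IOException {
        flushed = true;
        super.flush();
    }
}
```

In this sketch the policy string is validated at enrollment time, so a task with a typo in its policy fails in init rather than silently leaking a resource at the end of the run. Releasing in reverse enrollment order is a design choice borrowed from the way try-with-resources unwinds, which helps when one enrolled stream wraps another.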