Wednesday, December 29, 2010

No, labels in Java are not 'evil'... at least not per se!

Last week I got into an argument with some colleagues about the use of labels in Java for escaping nested loops. The general consensus was something along the lines of "using break or continue with a label is evil, because it is a goto". While I feel the construct should be applied with care, and in many instances a refactoring into e.g. a call to a separate method makes more sense, it certainly has its uses and cannot simply be deemed evil.
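To be concrete about the construct under discussion, here's a (completely made-up) example of the typical case: searching a two-dimensional array, where a labeled break exits both loops at once, without needing a boolean flag or an extracted method.

public class LabeledBreakExample {
    // Returns the {row, column} position of target in matrix, or null if it isn't there.
    public static int[] find(int[][] matrix, int target) {
        int[] position = null;
        search:
        for (int row = 0; row < matrix.length; row++) {
            for (int col = 0; col < matrix[row].length; col++) {
                if (matrix[row][col] == target) {
                    position = new int[] { row, col };
                    break search; // Leaves both loops at once - no flag variable, no extra method.
                }
            }
        }
        return position;
    }
}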

I can think of a couple of sources for the misconception:
  • When Dijkstra published his letter "Go To Statement Considered Harmful" in Communications of the ACM way back in 1968, it led to a lot of controversy too, and somehow only the title of the letter has stuck with a lot of people - but not its original contents, nor its true intent.
  • Java has the reserved keyword goto, but doesn't allow its use. James Gosling outlawed it, so it must be bad - nevertheless he did put in the labels.
So I feel that not getting the whole picture is responsible for this misconception in some (or maybe even many) programmers. But hey, don't just take my word for it!

In his book 'Thinking in Java' Bruce Eckel explains that "In Dijkstra’s “goto considered harmful” paper, what he specifically objected to was the labels, not the goto. He observed that the number of bugs seems to increase with the number of labels in a program. Labels and gotos make programs difficult to analyze statically, since it introduces cycles in the program execution graph. Note that Java labels don’t suffer from this problem, since they are constrained in their placement and can’t be used to transfer control in an ad hoc manner. It’s also interesting to note that this is a case where a language feature is made more useful by restricting the power of the statement."

In a (lengthy and by now five-year-old) retrospective of Dijkstra's paper, David Tribble illustrates that goto-like constructs are business-as-usual in modern programming languages without it being apparent all the time, and that the constructs mentioned by Dijkstra include not only labels for exiting loops, but also e.g. exception handling (try-catch-finally blocks).
Furthermore he also reaches the conclusion that "Dijkstra's belief that unstructured goto statements are detrimental to good programming is still true. A properly designed language should provide flow control constructs that are powerful enough to deal with almost any programming problem. By the same token, programmers who must use languages that do not provide sufficiently flexible flow control statements should exercise restraint when using unstructured alternatives. This is the Tao of goto: knowing when to use it for good and when not to use it for evil."

Dustin Marx puts it nicely when he says "The more I work in the software development industry, the more convinced I become that there are few absolutes in software development and that extremist positions will almost always be wrong at one point or another. I generally shy away from use of goto or goto-like code, but there are times when it is the best code for the job. Although Java does not have direct goto support, it provides goto-like support that meets most of my relatively infrequent needs for such support."


Now I'm not saying that the above is conclusive evidence that proves my point. But you may interpret it as an incentive to be a little more open-minded when it comes to certain 'conventional wisdoms' surrounding programming...


Update: If you take a look e.g. at this nice article on Java bytecode, specifically the bit about exception handling, you can see what's happening under the hood. That's right, those are just plain vanilla gotos at work when you use a try-catch block!
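If you want to verify that yourself, compile a trivial class like the (made-up) one below and disassemble it with javap -c; in the bytecode of the method you should find a plain goto instruction that jumps over the catch handler once the try block completes normally.

public class TryCatchDemo {
    public static int parse(String s) {
        int result;
        try {
            result = Integer.parseInt(s);
        } catch (NumberFormatException e) {
            result = -1; // The goto emitted after the try block skips this handler on the happy path.
        }
        return result;
    }
}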

Wednesday, November 17, 2010

Uploading jBPM .par files (from a repository)

Once you've passed the testing cycles during development, you want to make sure that the processes that get deployed onto the production environment are indeed versions that were released according to your formal build procedure - if you have such a procedure in place, of course.

In our case, that means that the officially released processes are available from a Maven repository. Now there's nothing wrong with retrieving a newly released process archive and using e.g. the jBPM console to upload it. That is, if there's just one such .par file to upload.

My current project produces no less than 16 process archives, one of which is referenced in multiple locations - so it is not uncommon to have more than 20 process instances started during the course of a single request we're processing.

Now regardless of the question of whether we chose the right granularity for our processes (which I think we did, of course), this turned into quite some work for each deployment cycle: keeping track of which .par file had been deployed and whether the deployments happened in the correct sequence (we're not using late binding for sub-processes). Performing this task by hand had become too error-prone to allow it for the production environment.

Ant to the rescue?

The user guide states that there are three ways to deploy the process archives (they forget about the jBPM console altogether there):
  • The process designer tool; an Eclipse plug-in that is part of JBoss Tools. This is of course not a real option, since we want to be able to deploy process archives without having to start up an IDE.
  • The org.jbpm.ant.DeployProcessTask; an Ant task available from the regular jBPM jar file. While an Ant build actually is a good option for a command-line alternative, this particular task is simply too much: it starts up a complete jBPM context for uploading the process directly to the database, and as such requires all of the applicable configuration. I prefer to have as little direct database access from external hosts as possible (e.g. for security considerations), and this approach doesn't accomplish that.
  • Programmatically; using the jBPM API directly. That is basically just more complex than using the Ant task, so that's not the way to go either (in this case).
Unfortunately none of these suggestions gives us the ease of use that the jBPM console does - just selecting the .par file and clicking the 'Deploy' button - so we had to search a little further.

Reuse the input method of the designer

A closer look at the GPD designer plug-in shows that its upload functionality is little more than an HTTP client, calling the POST method of the ProcessUploadServlet of the jBPM console. This servlet then uses the functionality of the jBPM API (as mentioned above for the Ant task and the programmatic approach). This entrance into the jBPM deployment is exactly what we need: it's simple in that it only requires the .par file to be posted, any database interaction is taken care of by the servlet, and any security concerns can be addressed in the deployment of the console (see e.g. how that's done in the SOA platform).

So, using Apache's HttpClient library I finally came up with something like the following:
package org.jbpm.par;

import java.io.InputStream;
import java.net.URL;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntity;
import org.apache.http.entity.mime.content.ContentBody;
import org.apache.http.entity.mime.content.InputStreamBody;
import org.apache.http.impl.client.DefaultHttpClient;

public class ProcessUploader {
    public static void main(String[] args) {
        HttpClient client = null;
        try {
            // Get the input parameters: first the file name, then the URL String for its location (in the repo).
            String fileName = args[0];
            URL url = new URL(args[1]);

            // Prepare the request.
            HttpPost request = new HttpPost("http://localhost:8080/jbpm-console/upload");
            ContentBody body = new InputStreamBody(url.openStream(), "application/x-zip-compressed", fileName);
            MultipartEntity entity = new MultipartEntity();
            entity.addPart("bin", body);
            request.setEntity(entity);

            // Execute the request.
            client = new DefaultHttpClient();
            HttpResponse response = client.execute(request);

            // You can examine the response further by looking at its contents:
            InputStream is = response.getEntity().getContent(); // And e.g. print it to screen...
        } catch (Exception ex) {
            ex.printStackTrace();
        } finally {
            if (client != null) {
                // Clean up after yourself.
                client.getConnectionManager().shutdown();
            }
        }
    }
}
While this simple example takes a single URL for a .par file (along with the corresponding file name) on the command line, we'll be using the same principle with a standard properties file that lists all of the URLs for our process archives, looping through that list and executing a request for each file. These URLs will point to our Maven repository, of course, allowing us to configure the correct versions for each release.

Note that the URL for the upload servlet is hard-coded in the example; if you're uploading your .par files from a different host, you'd of course want to make the host on which the jBPM console runs configurable instead of hard-coding 'localhost'.
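Just to give an idea of where we're heading, a sketch of that batch variant could look like the class below. The property names (upload.url and process.1, process.2, etc.) and the file layout are made up for the occasion, so adjust them to whatever conventions you use; the HttpClient handling is the same as in the example above.

package org.jbpm.par;

import java.io.FileInputStream;
import java.net.URL;
import java.util.Properties;

import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.mime.MultipartEntity;
import org.apache.http.entity.mime.content.InputStreamBody;
import org.apache.http.impl.client.DefaultHttpClient;

public class BatchProcessUploader {
    public static void main(String[] args) throws Exception {
        // The properties file is the single command-line argument, e.g.:
        //   upload.url=http://jbpmhost:8080/jbpm-console/upload
        //   process.1=http://repo.example.org/releases/.../process-a-1.2.par
        //   process.2=http://repo.example.org/releases/.../process-b-1.2.par
        Properties props = new Properties();
        props.load(new FileInputStream(args[0]));

        String uploadUrl = props.getProperty("upload.url");
        HttpClient client = new DefaultHttpClient();
        try {
            // Upload the archives in the order in which they are numbered.
            for (int i = 1; props.getProperty("process." + i) != null; i++) {
                URL parUrl = new URL(props.getProperty("process." + i));
                String fileName = parUrl.getPath().substring(parUrl.getPath().lastIndexOf('/') + 1);

                HttpPost request = new HttpPost(uploadUrl);
                MultipartEntity entity = new MultipartEntity();
                entity.addPart("bin", new InputStreamBody(parUrl.openStream(), "application/x-zip-compressed", fileName));
                request.setEntity(entity);

                // Consume the response so the connection can be reused for the next upload.
                client.execute(request).getEntity().consumeContent();
            }
        } finally {
            client.getConnectionManager().shutdown();
        }
    }
}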

Friday, July 30, 2010

Adding task nodes dynamically at runtime

I find that sometimes there's a good reason not to include all possible paths in a process definition at design time. Some of the more generic, non-functional paths can be included dynamically at runtime, in order not to clutter the process definition and be able to focus on the 'real' functionality your process needs to automate.

This goes e.g. for the handling of exceptions, as described here, where an automatic retry is accomplished by adding a transition dynamically from a node in which an exception occurs to itself. You probably don't want to add such 'self-transitions' at design time (that's just butt-ugly).

When you add such a path, it may include a TaskNode at some point. It did for me, and this is how I solved that.

The following code needs to run inside a jBPM context (obviously):

private void createDynamicTaskNode(ProcessInstance procInst, Node originatingNode, Node targetNode) {
    // Add the dynamic task node.
    // - Create the task.
    Task task = new Task("Dynamic task name");
    task.setProcessDefinition(procInst.getProcessDefinition());
    procInst.getTaskMgmtInstance().getTaskMgmtDefinition().addTask(task);
    task.setPooledActorsExpression("Dynamic task executors"); // Or use an actor ID.
    // - Create the node.
    TaskNode taskNode = new TaskNode("Dynamic task node name");
    taskNode.addTask(task); // Adds both ends of the association TaskNode <-> Task.
    procInst.getProcessDefinition().addNode(taskNode); // Adds both ends of the association ProcessDefinition <-> Node.

    // Create the transition between the originating node and the dynamic task node.
    Transition transition = new Transition("Transition to dynamic task node");
    originatingNode.addLeavingTransition(transition);
    taskNode.addArrivingTransition(transition);
    // Create the transition between the dynamic task node and the target node.
    transition = new Transition();
    taskNode.addLeavingTransition(transition);
    targetNode.addArrivingTransition(transition);
}

Basically this follows the same scenario for creating the node and transitions as jBPM does when it parses the JPDL process definition, using a lot of the defaults involved (such as the task being blocking, and ending it signalling the process instance to continue).
If you need any of the non-standard options, you may want to read the manual to see what those options can bring you.
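For completeness, a rough sketch of how the method above could be invoked. The process instance ID and the node names are placeholders, and I'm assuming here that you obtain a context the standard jBPM 3 way (JbpmConfiguration/JbpmContext); if your code already runs inside a jBPM context, as mentioned above, you only need the middle part.

long processInstanceId = 42L; // Placeholder; use the ID of the process instance you want to extend.
JbpmContext jbpmContext = JbpmConfiguration.getInstance().createJbpmContext();
try {
    ProcessInstance procInst = jbpmContext.loadProcessInstanceForUpdate(processInstanceId);

    // Look up the node the dynamic path starts from and the node it should lead to (names are placeholders).
    Node originatingNode = procInst.getProcessDefinition().getNode("node with exception");
    Node targetNode = procInst.getProcessDefinition().getNode("next node");

    createDynamicTaskNode(procInst, originatingNode, targetNode);
} finally {
    jbpmContext.close();
}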

Friday, April 9, 2010

Automatic continuations in a jBPM Node

The normal pattern of using a Node would be to execute some Java code from within an Action directly attached to the Node, but as the documentation states, that means this code will also be responsible for continuing the process execution:

"The nodetype node expects one subelement action. The action is executed when the execution arrives in the node. The code you write in the actionhandler can do anything you want but it is also responsible for propagating the execution."

And you may have a different opinion, but I think it's quite tedious to have to repeat the same kind of boilerplate code in each and every ActionHandler implementation used in Nodes, so I wanted to come up with a way to do it more generically.

This standard node type actually gives you a choice when it comes to its execution:
  • as stated above you add an Action (directly to the Node) and have it execute that, or
  • you don't add an Action and have it leave through the default Transition.
How the latter would be useful is beyond me (but that's another discussion); still, you should be aware of this behavior when you're e.g. attaching Actions to the 'node-enter' event only and not directly to the Node. You'd have a hard time figuring out what happens if you expected to be able to leave the Node through a Transition other than the default one (it's possible, but you'd have to change the order of the Transitions at runtime, which you probably want to stay away from as far as possible).

The simple approach I chose to illustrate this was to provide an abstract base class implementing the ActionHandler interface, which has to be extended by all action handlers in your code base. Well, nearly all, but I'll get back to that later. Surely there are other approaches that would come up with the same result (like annotations or aspects); just knock yourself out.
Such a base class would look something like this:

import org.apache.commons.lang.StringUtils;
import org.jbpm.graph.def.ActionHandler;
import org.jbpm.graph.exe.ExecutionContext;

public abstract class AbstractActionHandler implements ActionHandler {
    protected String transitionName;

    public final void execute(ExecutionContext ctx) throws Exception {
        performAction(ctx);

        // An Action attached directly to a Node has no current event in its execution context.
        if (ctx.getEvent() == null) {
            // When leaving the node we can either have a transition set to be taken or else take the default transition.
            if (StringUtils.isBlank(transitionName)) {
                ctx.getNode().leave(ctx);
            } else {
                ctx.getNode().leave(ctx, transitionName);
            }
        }
    }

    // To be implemented by concrete subclasses; execute the intended Java code and optionally set the transition to be taken.
    public abstract void performAction(ExecutionContext ctx) throws Exception;
}

Now the main 'trick' here is to know when to continue the execution and when not to; as you can tell from the code above, this can be derived from whether an event is available in the execution context. At the basis of this lies knowing at which places in a process definition Actions can be added (and when/how they're executed in each of those cases), and cross-referencing that with the required point of continuation (an Action directly in a Node).

You can add an Action at six different places:
  • Directly to a Node: which is what we're talking about here for having automatic continuation.
  • In an event (e.g. 'node-enter'): most of the time that's an explicit event in the process definition.
  • In a Transition: actually then it's executed from within a 'transition' event.
  • In a timer: here it's executed after the 'timer' event is fired, so not within it.
  • In an exception handler: executed from within GraphElement's raiseException(...) method, which also has no event associated with it, but does put the current Exception in the execution context.
  • Directly to the process definition (highest level): these are just for reference from within other elements, so not an 'extra' type in any sense - so we'll just forget about this one for now.
For the second and third entries of this list the execution context has a current event; for the first, fourth and fifth it doesn't. So the 'trick' as it is used in the above code fragment works for the first three; for the other two (timers and exception handlers) you shouldn't use that particular base class.
It is however possible to extend the 'trick' to these other two cases, by also checking the execution context for the availability of a timer (ctx.getTimer() == null) and/or an exception (ctx.getException() == null) respectively - it depends on which of the cases you want to cover with a base class (or mechanism of your choice) in order to get the automatic continuations I was after.
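To make that a little more concrete: one way of extending the check, so that handlers used in timers and exception handlers can share the same base class without accidentally propagating the execution, could look like the sketch below. The class name is mine, and whether this fits your situation is exactly the design decision mentioned above.

// Same imports as the base class shown earlier (ActionHandler, ExecutionContext, StringUtils).
public abstract class AbstractNodeActionHandler implements ActionHandler {
    protected String transitionName;

    public final void execute(ExecutionContext ctx) throws Exception {
        performAction(ctx);

        // Only propagate the execution when the Action is attached directly to a Node:
        // in that case there is no current event, no timer and no exception in the context.
        if (ctx.getEvent() == null && ctx.getTimer() == null && ctx.getException() == null) {
            if (StringUtils.isBlank(transitionName)) {
                ctx.getNode().leave(ctx);
            } else {
                ctx.getNode().leave(ctx, transitionName);
            }
        }
    }

    // To be implemented by concrete subclasses, as before.
    public abstract void performAction(ExecutionContext ctx) throws Exception;
}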

Credit where credit is due: thanks to Arnoud W. for the hint!