Saturday, January 31, 2009

Seeing a workflow in the database doesn’t mean it’s persisted

At work we are building a large application that uses Windows Workflow. We deliver versions to Test at certain milestones, so we don’t have to build everything before we can see how well we built it.

We ran into an interesting issue I want to share. But first let me give you some technical background. As I said, our application uses Windows Workflow. It receives an order (through a webservice), does some checking and authenticating, then finds the correct workflow to handle the order and starts that workflow. The workflows we make inherit from a base class. This base class contains 2 or 3 dependency properties that every workflow should have (such as the order data) and some methods that every workflow can use. We also have a base class for some activities, but it mainly contains methods that most other activities will need. A workflow also needs to implement a certain interface, so that we can load all the workflows with our Dependency Injection container and determine which interface we need for which order.

So far so good. Aside from some other problems I want to blog about soon, it all seemed to work perfectly. Our first milestone was a couple of simple workflows that started, did some stuff and ended. Nothing special, and they worked perfectly. If you used Workflow Monitor to look in the database, you’d see the workflow. So to us that meant the workflow was successfully persisted.

Then for the second milestone we created a bunch of more complex workflows. These workflows called a webservice and had to wait for the call-back from that webservice. In Windows Workflow this is handled very easily using the ExternalDataExchangeService and the HandleExternalEventActivity. We coded it up and started testing.

When the call-back came in, some stuff happened, like validation, and eventually the call-back data reached the part where we’d ask the workflow runtime to give us a pointer to the workflow instance that’s waiting on the data, and then use the ExternalDataExchangeService to deliver the data to the workflow so it can go on with its process. But instead of just working, the workflow runtime told us that the requested workflow couldn’t be found in the persistence store. We checked with the Workflow Monitor and we saw the workflow. Its status was set to running (or idle). But the runtime insisted it was not there.

Huh??

I did some digging and finally noticed an error in our log files: something about an interface not being serializable, which caused the persistence to fail. It turns out that what we see in the Workflow Monitor is Workflow Tracking, not Workflow Persistence. In fact, workflows are not persisted unless something tells the runtime to do so. You can either explicitly tell the Workflow Runtime to persist your workflow, or use an activity that causes it to be persisted, which (as far as I know) is what most activities that pause your workflow do. So in our first milestone nothing caused the workflow to be persisted, but in our second milestone the workflow paused at the HandleExternalEventActivity, which meant it had to be persisted, and that persistence failed.
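To make the tracking/persistence split concrete, here is a rough sketch of a WF 3.x host that registers the two as separate runtime services and logs when an instance is really persisted. It is not our actual host code: the WorkflowHost class, the connection string parameter, the timing values and the choice of unloadOnIdle = true are assumptions for illustration, and it needs references to the System.Workflow.* assemblies of .NET Framework 3.0/3.5 plus a database with the tracking and persistence schemas installed.

```csharp
using System;
using System.Workflow.Runtime;
using System.Workflow.Runtime.Hosting;
using System.Workflow.Runtime.Tracking;

// Hypothetical host helper, not part of Windows Workflow itself.
public static class WorkflowHost
{
    // connectionString is a placeholder for a database that contains both
    // the tracking schema and the persistence schema.
    public static WorkflowRuntime CreateRuntime(string connectionString)
    {
        var runtime = new WorkflowRuntime();

        // Tracking: this is the data that Workflow Monitor displays.
        runtime.AddService(new SqlTrackingService(connectionString));

        // Persistence: a separate service with its own tables.
        runtime.AddService(new SqlWorkflowPersistenceService(
            connectionString,
            true,                       // unloadOnIdle: persist and unload idle instances
            TimeSpan.FromMinutes(1),    // instanceOwnershipDuration (illustrative value)
            TimeSpan.FromSeconds(5)));  // loadingInterval (illustrative value)

        // Log real persistence events instead of inferring them from tracking data.
        runtime.WorkflowPersisted += (sender, e) =>
            Console.WriteLine("Persisted: " + e.WorkflowInstance.InstanceId);

        // Exceptions that a runtime service could not handle are reported here.
        // Whether a given serialization failure shows up through this event or
        // through WorkflowTerminated is an assumption, so both are logged.
        runtime.ServicesExceptionNotHandled += (sender, e) =>
            Console.WriteLine("Service error: " + e.Exception);
        runtime.WorkflowTerminated += (sender, e) =>
            Console.WriteLine("Terminated: " + e.Exception);

        runtime.StartRuntime();
        return runtime;
    }
}
```

With something like this in place, a workflow that shows up in Workflow Monitor but never produces a "Persisted" line is a hint that you are only looking at tracking data.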

So what was the problem? Well, everything stored in a field or get+set property on a workflow or activity has to be serializable. So the class of the stored object needs the [Serializable] attribute, and so does every class it holds in its own fields. (The attribute cannot be applied to an interface itself, so for a field typed as an interface it is the implementing class that counts.) Since we set up references to some logic classes (Logging Service, Order Service, etc.) in the base class, they would also need to be serializable.
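The failure can be reproduced outside Windows Workflow with plain .NET binary serialization. The sketch below is only an illustration: ILoggingService, FileLoggingService and OrderWorkflowState are made-up names standing in for our service and workflow base class, and it assumes the .NET Framework, where BinaryFormatter is available (newer .NET versions disable or remove it).

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical service contract and implementation.
// Note that the implementation is NOT marked [Serializable].
public interface ILoggingService { void Log(string message); }

public class FileLoggingService : ILoggingService
{
    public void Log(string message) { Console.WriteLine(message); }
}

// Stand-in for a workflow base class: serializable itself,
// but it keeps a service reference in a field.
[Serializable]
public class OrderWorkflowState
{
    public string OrderData = "order-123";
    private ILoggingService logger = new FileLoggingService();

    public void Process() { logger.Log("Processing " + OrderData); }
}

public static class Program
{
    public static void Main()
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            try
            {
                // The formatter walks every field, including the logger.
                formatter.Serialize(stream, new OrderWorkflowState());
                Console.WriteLine("Serialized OK");
            }
            catch (SerializationException ex)
            {
                // On the .NET Framework this branch runs, because
                // FileLoggingService is not marked as serializable.
                Console.WriteLine("Serialization failed: " + ex.Message);
            }
        }
    }
}
```

The workflow class itself being marked [Serializable] is not enough: one non-serializable object anywhere in its fields makes the whole save fail.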

We solved the problem by refactoring all the service fields into read-only properties (‘get’ properties) that no longer store the reference in the workflow or activity, but instead ask our Dependency Injection container for the service each time we need one, through a static method.
We also removed some other classes that were stored as private fields and now just instantiate them within a method as they are needed, and a few classes were made serializable.
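The refactoring looks roughly like the sketch below. The names are hypothetical: ServiceLocator is a tiny stand-in for the static method in front of our real Dependency Injection container, and OrderWorkflowState stands in for the workflow base class.

```csharp
using System;
using System.Collections.Generic;

public interface ILoggingService { void Log(string message); }

public class ConsoleLoggingService : ILoggingService
{
    public void Log(string message) { Console.WriteLine(message); }
}

// Hypothetical static facade; a real container would sit behind Resolve<T>().
public static class ServiceLocator
{
    private static readonly Dictionary<Type, object> services =
        new Dictionary<Type, object>();

    public static void Register<T>(T service) { services[typeof(T)] = service; }

    public static T Resolve<T>() { return (T)services[typeof(T)]; }
}

[Serializable]
public class OrderWorkflowState
{
    // Only real data lives in fields, so only real data gets serialized.
    public string OrderData = "order-123";

    // Get-only property with no backing field: the service is looked up
    // on every access and never becomes part of the persisted state.
    protected ILoggingService Logger
    {
        get { return ServiceLocator.Resolve<ILoggingService>(); }
    }

    public void Process() { Logger.Log("Processing " + OrderData); }
}

public static class Program
{
    public static void Main()
    {
        // Register the service once at startup, then use the workflow state.
        ServiceLocator.Register<ILoggingService>(new ConsoleLoggingService());
        new OrderWorkflowState().Process();
    }
}
```

The trade-off is a container lookup on every access and a hidden dependency on the static locator, in exchange for keeping service objects out of the serialized workflow state.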

The biggest downside is that Windows Workflow demands a very strict versioning scheme. The classes we made serializable are now part of the persisted workflow state, so we have a few more assemblies that require careful versioning.
