Onboarding API Sequence Diagram

State Transitions through Sequence Diagrams

This post is my contribution to F# Advent 2018. For years I’ve contributed here and there to a large number of projects, so it is hard to pick a topic. I decided to choose something that cuts across all my various hobby projects through the years and in which I recently found inspiration and practical value when designing software systems, specifically those portions of software systems that want to expose and/or enforce a correct sequence of user actions.

Motivation

I’ve built many systems where the user interaction design was either missing or delayed, both of which led to user frustration or confusion as to how to correctly interact with the software. I’ve also experienced this from both perspectives as a producer and consumer of software library APIs. Most people think it easier to expose all functionality to a user and provide documentation as to how the software should be used. In many cases, this is a pretty good trade-off. Other cases — especially those where a specific sequence of steps should be followed — can lead to misuse, bugs, and lots of frustration.

Many of these issues can be solved by simply thinking through and documenting the desired flow and then implementing that flow within the software. The latter part is often easier said than done, at least in my previous experience.

While at Open FSharp this year, I read a tweet by Mike Amundsen linking to his talk from RESTFest.

I didn’t get to see or hear the talk, but the slides and linked source code clued me in on the concept, which I found very compelling. Here’s my take:

Use sequence diagrams to design resource state transitions, i.e. workflows.

Mike’s slides show a slightly different direction, using the sequence diagram to identify resources, not states, and the transitions between them (though I could be misinterpreting his slides). I found that using each actor line as a state within a single resource made more sense for what I’ve needed to do at work, and the model is much simpler than writing a full-fledged Open API spec. Mike’s slides indicate he’s thinking along these lines, as well as the potential for generating the full specification, documentation, etc. from a simple sequence diagram.

Design

For this post, I’ll stick with Mike’s example so as to remain consistent and progress the conversation. The example is that of an onboarding workflow. I’m going to identify this as an onboarding resource. The first step, then, is to identify the states the onboarding resource may have at any given stage:

home
WIP
customerData
accountData
finalizeWIP
cancelWIP

We then need to identify all the transitions from state to state that we want to support. This is where we immediately diverge from what most API formats allow you to specify and where you can immediately find value from this process. (Note: the funny syntax for the arrows is the convention used by Web Sequence Diagrams (WSD) for specifying sequence diagrams in text format.)

home->+WIP:
WIP->+customerData:
customerData-->-WIP:
WIP->+accountData:
accountData-->-WIP:
WIP-->+finalizeWIP:
finalizeWIP->-home:
WIP-->+cancelWIP:
cancelWIP->-home:

The snippet above shows, for example, that we cannot navigate directly from home to accountData or WIP to home or customerData to finalizeWIP. There are specific transitions that are allowed from each state. Most libraries and frameworks for building applications give you the ability to specify all possible methods, not a means of limiting actions from a given state. This is sufficient to get our first glimpse of a visual representation.
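To make the restriction concrete, here is a minimal F# sketch that treats the transitions above as plain data and asks which moves are allowed. The names `transitions`, `nextStates`, and `canNavigate` are illustrative assumptions, not part of WSD or of Mike's code.

```fsharp
// Hypothetical sketch: the transitions from the WSD snippet as (from, to) pairs.
let transitions =
    [ "home", "WIP"
      "WIP", "customerData"
      "customerData", "WIP"
      "WIP", "accountData"
      "accountData", "WIP"
      "WIP", "finalizeWIP"
      "finalizeWIP", "home"
      "WIP", "cancelWIP"
      "cancelWIP", "home" ]

/// Lists the states reachable in one step from the given state.
let nextStates state =
    transitions
    |> List.filter (fun (fromState, _) -> fromState = state)
    |> List.map snd

/// True only when the spec contains a direct transition between the two states.
let canNavigate fromState toState =
    List.contains (fromState, toState) transitions

printfn "%A" (nextStates "home")                 // ["WIP"]
printfn "%b" (canNavigate "home" "accountData")  // false
```

Asking for the next states from home yields only WIP, which is the kind of answer a client needs in order to know what it may do next.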

However, we don’t currently have a means of instructing the resource when to move from state to state. We can do this by specifying messages for each transition.

home->+WIP: startOnboarding
WIP->+customerData: collectCustomerData
customerData-->-WIP: saveToWIP
WIP->+accountData: collectAccountData
accountData-->-WIP:saveToWIP
WIP-->+finalizeWIP:completeOnboarding
finalizeWIP->-home:goHome
WIP-->+cancelWIP:abandonOnboarding
cancelWIP->-home:goHome

With these additions, we can see the messages provided to transition from state to state.

Adding parameters provides further context and something we could conceivably call to produce results. The following follows Mike’s formatting from his slides.

home->+WIP: startOnboarding(identifier)
WIP->+customerData: collectCustomerData(identifier,name,email)
customerData-->-WIP: saveToWIP(identifier,name,email)
WIP->+accountData: collectAccountData(identifier,region,discount)
accountData-->-WIP:saveToWIP(identifier,region,discount)
WIP-->+finalizeWIP:completeOnboarding(identifier)
finalizeWIP->-home:goHome
WIP-->+cancelWIP:abandonOnboarding(identifier)
cancelWIP->-home:goHome

Running this specification through the WSD tool produces the following visual rendering, which I find very informative.

[Figure: Onboarding API sequence diagram, rendered by Web Sequence Diagrams]
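The text format is regular enough that each line can be split mechanically. The following is a rough sketch of that idea, assuming only the line shapes used in this post (a source, a solid or dashed arrow with an optional + or - marker, a target, and an optional message with parameters). It does not attempt the full WSD grammar, and `ParsedTransition` and `tryParseLine` are made-up names.

```fsharp
open System
open System.Text.RegularExpressions

/// One parsed line of the spec. The record and its field names are illustrative.
type ParsedTransition =
    { From : string
      To : string
      Message : string
      Parameters : string list }

// source, arrow (-> or -->) with optional +/- marker, target, colon,
// then an optional message name and optional parenthesized parameters.
let linePattern =
    Regex(@"^\s*(\w+)\s*-{1,2}>[+-]?\s*(\w+)\s*:\s*(\w*)\s*(?:\((.*)\))?\s*$")

/// Parses a single line, returning None when it does not look like a transition.
let tryParseLine (line : string) =
    let m = linePattern.Match line
    if not m.Success then
        None
    else
        let parameters =
            if m.Groups.[4].Success then
                m.Groups.[4].Value.Split([| ',' |], StringSplitOptions.RemoveEmptyEntries)
                |> Array.map (fun p -> p.Trim())
                |> List.ofArray
            else
                []
        Some { From = m.Groups.[1].Value
               To = m.Groups.[2].Value
               Message = m.Groups.[3].Value
               Parameters = parameters }

printfn "%A" (tryParseLine "WIP->+customerData: collectCustomerData(identifier,name,email)")
```

Under these assumptions, a line such as finalizeWIP->-home:goHome parses to a transition with an empty parameter list, and anything that is not a transition yields None.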

Implementation

You can get what I’ve presented thus far by reading through Mike’s slides. When I first read through them and visited his code repository, I was hoping to see an implementation of some of the additional directions he proposed. However, to date the repository contains only a CLI for generating the visuals above from the text specifications. With the intent of trying to drive this idea further, I ported the libraries to F#.

In order to show the capabilities of this approach, I planned on implementing several tools:

Unfortunately, life happens and I have not made as much progress as I would like. (Fortunately, I haven’t made too much progress, or you would be settling in for a very long post indeed.) I’m pleased to show a rough proof-of-concept using F# agents that I think demonstrates the utility of this approach nicely.

/// Defines a transition from one state to another state based on a message.
type Transition<'State, 'Message> =
    { FromState : 'State
      Message : 'Message
      ToState : 'State }

/// A resource-oriented agent that transitions the state based on messages received.
type Agent<'State,'Message when 'State : comparison and 'Message : comparison> =
    new : identifier:System.Uri * initState:'State * transitions:Transition<'State, 'Message> list * comparer:('Message * 'Message -> bool) -> Agent<'State,'Message>
    /// Returns the identifier for the agent.
    member Identifier : System.Uri
    /// Retrieves the current state and allowed state transitions.
    member Get : unit -> 'State * Transition<'State,'Message> list
    /// Posts a message to transition from the current state to another state.
    member Post : message:'Message -> unit
    /// Registers a handler to perform a side-effect, e.g. save a value, on a state transition.
    member Subscribe : transition:Transition<'State,'Message> * handler:('Message -> unit) -> unit
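
The signature above does not show how the agent works internally. As a hedged sketch only, here is one way such an agent could be built on F#'s MailboxProcessor. It is not the Agent.fs from the repository. The names `SketchAgent` and `AgentCommand` are made up, Subscribe is left out, unmatched messages are assumed to be ignored, and the constraint is relaxed to equality because that is all this version needs. The Transition type is repeated so the sketch stands alone.

```fsharp
open System

type Transition<'State, 'Message> =
    { FromState : 'State
      Message : 'Message
      ToState : 'State }

/// Internal mailbox protocol; the case names are illustrative.
type AgentCommand<'State, 'Message> =
    | Query of AsyncReplyChannel<'State * Transition<'State, 'Message> list>
    | Apply of 'Message

type SketchAgent<'State, 'Message when 'State : equality>
    (identifier : Uri,
     initState : 'State,
     transitions : Transition<'State, 'Message> list,
     comparer : ('Message * 'Message -> bool)) =

    /// Transitions that may be taken from the given state.
    let allowedFrom state =
        transitions |> List.filter (fun t -> t.FromState = state)

    let mailbox =
        MailboxProcessor<AgentCommand<'State, 'Message>>.Start(fun inbox ->
            let rec loop state =
                async {
                    let! command = inbox.Receive()
                    match command with
                    | Query reply ->
                        reply.Reply((state, allowedFrom state))
                        return! loop state
                    | Apply message ->
                        // Take the first allowed transition whose message matches;
                        // anything else is ignored and the state is unchanged.
                        let next =
                            allowedFrom state
                            |> List.tryFind (fun t -> comparer (t.Message, message))
                            |> Option.map (fun t -> t.ToState)
                            |> Option.defaultValue state
                        return! loop next
                }
            loop initState)

    member this.Identifier = identifier
    member this.Get() = mailbox.PostAndReply(fun channel -> Query channel)
    member this.Post(message : 'Message) = mailbox.Post(Apply message)

// Usage with plain strings for states and messages.
let sketch =
    SketchAgent(Uri "urn:agent:sketch", "home",
                [ { FromState = "home"; Message = "startOnboarding"; ToState = "WIP" } ],
                (fun (expected, actual) -> expected = actual))
sketch.Post("startOnboarding")
printfn "%A" (sketch.Get())
```

Because the mailbox handles messages in the order they arrive, the Get that follows a Post sees the state after the transition has been applied.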

With these definitions, we can translate the WSD syntax into types and values.

type State =
    | State of name:string

type Message =
    | Message of name:string * data:string

let createAgent initState =
    let transitions = [
        // home->+WIP: startOnboarding(identifier)
        { FromState = State "home"
          ToState = State "WIP"
          Message = Message("startOnboarding", "identifier") }
        // WIP->+customerData: collectCustomerData(identifier,name,email)
        { FromState = State "WIP"
          ToState = State "customerData"
          Message = Message("collectCustomerData", "identifier,name,email") }
        // customerData-->-WIP: saveToWIP(identifier,name,email)
        { FromState = State "customerData"
          ToState = State "WIP"
          Message = Message("saveToWIP", "identifier,name,email") }
        // WIP->+accountData: collectAccountData(identifier,region,discount)
        { FromState = State "WIP"
          ToState = State "accountData"
          Message = Message("collectAccountData", "identifier,region,discount") }
        // accountData-->-WIP:saveToWIP(identifier,region,discount)
        { FromState = State "accountData"
          ToState = State "WIP"
          Message = Message("saveToWIP", "identifier,region,discount") }
        // WIP-->+finalizeWIP:completeOnboarding(identifier)
        { FromState = State "WIP"
          ToState = State "finalizeWIP"
          Message = Message("completeOnboarding", "identifier") }
        // finalizeWIP->-home:goHome
        { FromState = State "finalizeWIP"
          ToState = State "home"
          Message = Message("goHome", "") }
        // WIP-->+cancelWIP:abandonOnboarding(identifier)
        { FromState = State "WIP"
          ToState = State "cancelWIP"
          Message = Message("abandonOnboarding", "identifier") }
        // cancelWIP->-home:goHome
        { FromState = State "cancelWIP"
          ToState = State "home"
          Message = Message("goHome", "") }
    ]
    Agent(Uri "urn:agent:1", initState, transitions, function (Message(expected,_)), (Message(actual,_)) -> expected = actual)

The F# translation is a bit more verbose than the WSD text format, but the connection can clearly be seen. It’s important to note that the Message type splits the message name and parameters. I’ve kept the parameters as-is for now, but we could easily extend this to use the query parameter portion of URI Templates to specify an expected schema for the parameters against which to validate and extract arguments.
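
As a much simpler stand-in for URI Templates, the comma-separated parameter string can already act as a small schema. The sketch below is an assumption for illustration: it does not implement URI Templates, and `parameterNames` and `extractArguments` are hypothetical helpers that are not in the repository.

```fsharp
open System

/// Splits a parameter spec such as "identifier,name,email" into names.
let parameterNames (spec : string) =
    spec.Split([| ',' |], StringSplitOptions.RemoveEmptyEntries)
    |> Array.map (fun name -> name.Trim())
    |> List.ofArray

/// Returns the arguments in spec order, or the names that were not supplied.
let extractArguments (spec : string) (supplied : Map<string, string>) =
    let names = parameterNames spec
    let missing = names |> List.filter (fun name -> not (supplied.ContainsKey name))
    if List.isEmpty missing then
        Ok (names |> List.map (fun name -> (name, supplied.[name])))
    else
        Error missing

let supplied = Map.ofList [ "identifier", "42"; "name", "Ada" ]
printfn "%A" (extractArguments "identifier,name,email" supplied)  // Error ["email"]
```

A message whose arguments fail this check could be rejected before the agent ever attempts the transition.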

Testing

With these transitions defined, what do we expect to happen? In most apps and APIs, you have access to every supported method or interaction right away. However, we don’t want to expose everything; we want to expose a workflow and restrict user actions. Will the Agent provide the correct response for a given state?

The following tests verify that an Agent in the home state represents that it is in the home state and can only transition to the WIP state with the startOnboarding message.

test "agent starts in 'home' state" {
    let expected = State "home"
    let agent = createAgent (State "home")
    let actual, _ = agent.Get()
    Expect.equal actual expected "Should start in the home state."
}

test "agent can transition to 'WIP' from 'home'" {
    let expected = [
        { FromState = State "home"
          ToState = State "WIP"
          Message = Message("startOnboarding", "identifier") }
    ]
    let agent = createAgent (State "home")
    let _, actual = agent.Get()
    Expect.equal actual expected "Should have been able to transition only to WIP."
}

So far, so good. What happens if we transition to the WIP state?

test "agent transitions to 'WIP' after receiving a message of 'startOnboarding'" {
    let expected =
        State "WIP", [
            { FromState = State "WIP"
              ToState = State "customerData"
              Message = Message("collectCustomerData", "identifier,name,email") }
            { FromState = State "WIP"
              ToState = State "accountData"
              Message = Message("collectAccountData", "identifier,region,discount") }
            { FromState = State "WIP"
              ToState = State "finalizeWIP"
              Message = Message("completeOnboarding", "identifier") }
            { FromState = State "WIP"
              ToState = State "cancelWIP"
              Message = Message("abandonOnboarding", "identifier") }
        ]
    let agent = createAgent (State "home")
    agent.Post(Message("startOnboarding", ""))
    let actual = agent.Get()
    Expect.equal actual expected "Should transition to WIP state with 4 transitions."
}

The Agent represents that it is in the WIP state and can transition from WIP to four other states, just as we specified in our WSD spec. Here are a few more tests for good measure.

test "agent transitions to 'finalizeWIP' after receiving a message of 'completeOnboarding'" {
    let expected =
        State "finalizeWIP", [
            { FromState = State "finalizeWIP"
              ToState = State "home"
              Message = Message("goHome", "") }
        ]
    let agent = createAgent (State "WIP")
    agent.Post(Message("completeOnboarding", ""))
    let actual = agent.Get()
    Expect.equal actual expected "Should transition to finalizeWIP state with 1 transition to home."
}

test "agent transitions to 'home' from 'finalizeWIP' after receiving a message of 'goHome'" {
    let expected =
        State "home", [
            { FromState = State "home"
              ToState = State "WIP"
              Message = Message("startOnboarding", "identifier") }
        ]
    let agent = createAgent (State "finalizeWIP")
    agent.Post(Message("goHome", ""))
    let actual = agent.Get()
    Expect.equal actual expected "Should transition to home state with 1 transition to WIP."
}

test "agent transitions to 'cancelWIP' after receiving a message of 'abandonOnboarding'" {
    let expected =
        State "cancelWIP", [
            { FromState = State "cancelWIP"
              ToState = State "home"
              Message = Message("goHome", "") }
        ]
    let agent = createAgent (State "WIP")
    agent.Post(Message("abandonOnboarding", ""))
    let actual = agent.Get()
    Expect.equal actual expected "Should transition to cancelWIP state with 1 transition to home."
}

test "agent transitions to 'home' from 'cancelWIP' after receiving a message of 'goHome'" {
    let expected =
        State "home", [
            { FromState = State "home"
              ToState = State "WIP"
              Message = Message("startOnboarding", "identifier") }
        ]
    let agent = createAgent (State "cancelWIP")
    agent.Post(Message("goHome", ""))
    let actual = agent.Get()
    Expect.equal actual expected "Should transition to home state with 1 transition to WIP."
}

All the states return the expected representations.

Starting test execution, please wait...

Total tests: 8. Passed: 8. Failed: 0. Skipped: 0.
Test Run Successful.
Test execution time: 1.3494 Seconds

Conclusion

There are clearly a lot more directions in which we could take this. I’m very interested in writing the parser and generating a representation like this, as well as creating some similar implementations for things like Freya, Service Fabric, Azure Functions, etc. I was also surprised at how few lines were required for this implementation (Agent.fs is 73 lines, including white space). I’ve tried doing similar things in the past only to give up because I was overcomplicating the schema format, implementation, or something else.

I hope you’ll give the Sequence Diagram approach a shot, and I would love to know how it does or doesn’t work for you.

You can find all the code for this post at https://github.com/panesofglass/wsd-gen/tree/wsd-agents. Thanks for reading, and Merry Christmas!



Demand Driven Architecture or REST or Linked Data?

I recently listened to David Nolen’s talk from the QCon London conference back in July, called Demand Driven Architecture. Before continuing, you should have a listen.

Ready?

I really like a lot of things Mr. Nolen has done and really enjoy most of his talks and posts. I was less enthused with this one. I think my main hang-up was his misrepresentation of REST and resources. I get the feeling he equates resources with data stores. If you watched the video and then skimmed that Wikipedia page, you will quickly see that the notion of “joining” two resources is nonsensical. I think Mr. Nolen is really referring to that “pragmatic” definition that means POX + HTTP methods, which really would correlate well to data stores.

Real REST drives application state, so you would not need to join to anything else if using REST. His criticism of performance also misses the mark, for if you are using REST then you should also be carefully planning out and leveraging a caching strategy. REST isn’t appropriate for every application, but not for the reasons Mr. Nolen so casually dismisses it. Think about it this way: if REST is not suitable for performance or mobile devices, then you must also agree that all websites (not apps, just sites like Amazon.com) fail on mobile devices. That’s just absurd.

I can’t imagine anyone is surprised when Mr. Nolen mentions SQL. He’s just mentioned joins, and what else would a developer associate with that term? If you have hung around the web long enough, you may have heard of SPARQL, which is a query language for linked data. Linked Data and SPARQL never seem to have caught on, but they address at least part of the problem Mr. Nolen presents. A big part of Linked Data’s failing is lack of tooling and painful implementation (mostly related to RDF). Perhaps new tools like BrightstarDB will help turn things around.

Interestingly, Mr. Nolen’s solution, SQL, is strong at ad hoc querying but not so great at scalability. If the web, linked data, etc. were already addressed by SQL, then those technologies would not exist. I really don’t get the need for misrepresentation and indirection here. (On a related note, Fabian Pascal recently posted on the “Conflation & Logical-Physical Confusion” surrounding misrepresenting SQL as the relational model, which ties in well to this post.)

The only thing presented here is an alternative to REST where the client specifies dependencies for the server to fulfill. This flips the REST style, in which the server drives application state and provides contracts via media types, into a client-driven approach. This is a perfectly valid approach, and interestingly one where Linked Data could excel. Tools such as Datomic, GraphQL + Relay, and Falcor certainly look interesting and appear to work well for very large projects.

I have no doubt that any of these techniques, done well, provides excellent results. Tooling will likely determine the winner, for better or worse.

The Origin of RESTful URLs

For at least the past year, I have repeatedly found my appreciation for the literal offended by the term “RESTful URLs.” I recently spent a bit of time trying to explain how this term is oxymoronic on the Web API Forums. While URIs are important as a means of identifying unique resources, REST doesn’t specify any other requirements for URIs. I often note that a RESTful URI could be a GUID. While this is certainly not very meaningful to humans, it satisfies the REST constraints.

While pondering nested resources in Web API yet again, I realized where I think this term originates. Ruby on Rails used the term “RESTful routing” to describe their approach to building Controllers along the line of resources and using a convention to route controllers in a hierarchy based on the controller name. The goal, I think, was to correctly model and mount resources at unique URIs. However, you get what I would call “pretty URLs” for free.

If you use the term “RESTful URLs,” please stop. While RESTful Routing makes sense, RESTful URLs are just nonsense. Do use “RESTful Routing.” Do use “Pretty URLs.” Just don’t confuse your terms. Thanks!

New Names for Old Things

[This is the third in a series started long ago on the use of MVC for building web “applications”.]

I’m glad I’m only getting back to this series now. I’ve had an opportunity to build many more web applications and have a much better appreciation for the poor terminology used to define web applications. For starters, this MV? business is silly. We’ll get to that.

I know I’m a bit of an extremist in some things. Specifically, I like things to mean what they mean. When we abuse terms, we don’t communicate well. REST. There, I said it. I feel better. Stop using the term. Most people have a wrong idea of what it means b/c of all the silliness that has been done in its name. I don’t claim to know exactly myself. I don’t think it’s possible to rescue the term from the abuses heaped upon it. There, you see? I’m an extremist.

Now that we’ve covered that, on to MVC. I’m not sure who decided this was an accurate description for what happens on the server-side of the web, but it’s just flat wrong. As noted previously, HTTP uses a functional interface. It’s an IO-bound Request -> Response function. Can you use patterns on either side to help maintainability? Certainly! Just don’t confuse things. Let’s start with Views.

Views

What is a view?

The [view or viewport] is responsible for mapping graphics onto a device.
A viewport typically has a one to one correspondence with a display surface
and knows how to render to it. A viewport attaches to a model and renders
its contents to the display surface. In addition, when the model changes,
the viewport automatically redraws the affected part of the image to reflect
those changes. […] there can be multiple viewports onto the same model and
each of these viewports can render the contents of the model to a different
display surface.

If a view were merely a serialization of a model, this would make sense for building web applications. Unfortunately, there’s a problem. The definition suggests that the view automatically updates whenever the model changes. How do you do that with HTTP? HTTP doesn’t define any mechanism for hooking up observation of a server model. Before you say JavaScript, consider first the current use of View, or even UI. People commonly mean HTML. HTML is not a UI. HTML is a serialization format. The client (normally a browser) must interpret that HTML. Many of you will remember when that wasn’t so standard.

Can we achieve MVC today? Possibly. You might be able to leverage web sockets to reach across a client/server architecture such as that presented by HTTP. However, you are more likely to find that “MVC” on the server is just limiting. You are typically better off building a sort of restricted data access service, a.k.a. a Web API (subtle hint). There’s really no point in trying to enrich a serialization format to make it work more like true MVC across the client and server.

Controllers

This is no different than routing. Instead of calling your Router a Controller, you split them up. However, most frameworks really just use the router as a top level dispatcher and the controller as a lower-level dispatcher. Otherwise, I’d say web frameworks stay a lot closer to the original meaning than a lot of the other MV? patterns. (Hence the ?, of course.)

Models

This really is the crux. HTML is a model. I noted this last time. It’s just a serialization of data you want displayed. It happens to be a lot richer, but it’s still just a data model. HTML is a great way to bootstrap an application that otherwise uses JavaScript as a model serialization format. If you want to disagree, ask why HTML5 removes the presentation elements. Why has layout and style moved to CSS? CSS and the browser define the actual rendering. In a no-script web application, you don’t have to build a view. You get it for free.

Conclusion

So what? Am I just ranting that I don’t like how people abuse terms? Possibly. However, I think this goes deeper. When you allow the slippery slope, you get caught on it, as well. It’s inevitable. The bigger, lurking danger is that we start to confuse useful patterns and use them in the wrong places. Many people use MVC frameworks today to build web APIs. However, that’s not MVC. So if you then switch to a desktop app to write MVC applications, you are either confused or delighted to find that it’s so much richer.

I don’t know what I would call what we build for the web; I know I wouldn’t call it MVC. In my experiments with Frank, I’ve found that writing simple functions and nesting them with rules makes a very easy mechanism for building web APIs. I think that would essentially just be a Command pattern. Simple, elegant, and very easy to understand. YMMV.
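
To illustrate what “simple functions nested with rules” can look like, here is a small F# sketch. It is not Frank's API. The `Request` and `Response` records and the `route` and `choose` helpers are assumptions made up for this example.

```fsharp
type Request = { Method : string; Path : string }
type Response = { Status : int; Body : string }

/// A handler is just a function from request to response.
type Handler = Request -> Response

/// A rule applies a handler only when the method and path both match.
let route (httpMethod : string) (path : string) (handler : Handler) : Request -> Response option =
    fun request ->
        if request.Method = httpMethod && request.Path = path then
            Some (handler request)
        else
            None

/// Tries each rule in order and falls back to 404 when none applies.
let choose (rules : (Request -> Response option) list) : Handler =
    fun request ->
        rules
        |> List.tryPick (fun rule -> rule request)
        |> Option.defaultValue { Status = 404; Body = "Not Found" }

let app =
    choose [
        route "GET" "/" (fun _ -> { Status = 200; Body = "home" })
        route "POST" "/onboarding" (fun _ -> { Status = 201; Body = "WIP" })
    ]

printfn "%A" (app { Method = "GET"; Path = "/" })
```

The whole application is still a single Request -> Response function; the rules only decide which smaller function gets to answer.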

Web Architecture Done Right

I’ll go ahead and confess that a single right way to design for the web doesn’t exist. If someone wants you to believe otherwise, they are just wrong. That said, I do think that you’ll have a hard time going wrong by starting with one simple rule: start with a web api.

Why should you start with a web api rather than just building a web site/app that has HTML and JavaScript all working together? Frankly it’s because you can’t properly decouple the api from the serialization format well enough. When you are designing an api for the web, you (should be) thinking in terms of resources and representations. That’s representations in the plural form. You may not know exactly what forms you’ll need, but you should consider that you will eventually have many. When you start with a web site/app with HTML in mind, you’ve coupled yourself to a single format, and extracting that out later could (will) be difficult. Don’t just take my word for it. Mike Amundsen has an excellent post on the right way to think about these things.
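
A tiny F# sketch of “representations in the plural”: one resource value, several renderers, and a lookup by media type. The `Customer` record, the renderer names, and the exact-match lookup are illustrative assumptions. Real content negotiation and output escaping are left out.

```fsharp
type Customer = { Name : string; Email : string }

// Each renderer turns the same resource into a different representation.
let asHtml (c : Customer) = sprintf "<dl><dt>Name</dt><dd>%s</dd><dt>Email</dt><dd>%s</dd></dl>" c.Name c.Email
let asJson (c : Customer) = sprintf "{\"name\":\"%s\",\"email\":\"%s\"}" c.Name c.Email
let asText (c : Customer) = sprintf "%s <%s>" c.Name c.Email

let representations =
    [ "text/html", asHtml
      "application/json", asJson
      "text/plain", asText ]

/// Picks the representation whose media type exactly matches the requested one.
let represent (accept : string) (customer : Customer) =
    representations
    |> List.tryFind (fun (mediaType, _) -> mediaType = accept)
    |> Option.map (fun (mediaType, render) -> (mediaType, render customer))

printfn "%A" (represent "text/plain" { Name = "Ada"; Email = "ada@example.com" })
```

Adding another format later means adding one entry to the list; nothing about the resource itself has to change.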

There are certainly some instances where this may be overkill, but I hate rewriting software unless it really is a prototype or just an exploratory attempt to get something up. In those cases, go for the quick and dirty. If you are working on something you want to last a long time, however, you owe it to yourself to consider the evolvability of your app by focusing on api design. You’ll then be able to take advantage of a number of client options. Certainly, supporting the growing number of clients is one of the biggest challenges continuing to face developers as we head into 2012.

Let’s suppose you agree with me on this point. How do you go about building a really solid api design? I don’t think I could articulate it better than Darrel Miller already has. His goals for good apis are suitable both for internal teams and external customers. Who wouldn’t love gaining visibility b/c a customer was able to accelerate their business by using your api in an unforeseen way that drives additional business for your own company? How nice is it to knock out not just one project but several b/c you are able to leverage existing platforms for new projects? We’re doing that at Logos, in large part because we moved to MVC and took a more service-oriented approach to building our apps. The number of new projects has grown tremendously, but we are also able to respond much more quickly b/c the services are ready for consumption.

I’ll be continuing to discuss this topic in future posts. In the meantime, check out Mike’s book, Building Hypermedia APIs with HTML5 and Node. It uses HTML5 and Node to illustrate, but the concepts are excellent and portable to other platforms. I also highly recommend REST in Practice as an excellent resource for understanding the fullness of what HTTP offers for building apis. Enjoy!

Re: unREST

Tried commenting on the unREST blog post, but I kept getting an error that I was trying to post illegal content.

  1. You equate the HTTP version of REST with REST itself. That is inaccurate. I think what you don’t like is HTTP.

  2. I’ve never understood why SOAP didn’t drive directly off of TCP. It doesn’t really use HTTP except for transport, which is really useless.

  3. I’m all for different application protocols. Build your own indeed. You are quite right that, in many cases, another person’s uniform interface is not the best. These can be REST or not, that’s really up to you.

  4. I find REST breathtakingly simple. I like it, so I apparently disagree with you, and I’m alright with that.