Handling Noop Infrastructure in AWS Step Functions

Spent a month on this and I think I figured out how to get JSON to “roll over” a Step Function step that is a noop, like say AWS Batch. Noops (no operation) are steps that are basically full of side effects, but don’t have a useful return value beyond “SUCCESS”.

If you code Batch in Go for example, the main function has it right there in the signature: func main() declares no return type. Meaning… there is no return value.

How, then, do you use something like a noop in your mostly Functional Programming style Step Function that assumes you’re composing infrastructure like you’d compose functions? Basically tap. If you’re not familiar, tap is a quick way to debug pure function chains.

If you’ve ever heard of the identity function, it basically gives you back what you give it, like this in Python:

def identity(oh_yeah):
  return oh_yeah

Or this in JavaScript:

const identity = ohYeah =>
  ohYeah

That seems like a ridiculous function, no doubt. But watch what happens when you combine it with composed functions, say, parsing some JSON.

def parse(str):
  return json_parse(str) \
  >> parse_people \
  >> filter_humans \
  >> format_names

Parse composes 4 functions together. If they’re pure, how do you “see” inside that pipeline? You “tap” into it. We can take our identity function, and convert to a tap function like this:

def tap(oh_yeah):
  print("oh_yeah:", oh_yeah)
  return oh_yeah

Then we can “tap into” the pipeline.

def parse(str):
  return json_parse(str) \
  >> tap \
  >> parse_people \
  >> tap \
  >> filter_humans \
  >> tap \
  >> format_names
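Plain Python has no built-in `>>` composition operator, so here is a minimal runnable sketch of the same idea. It assumes a small hypothetical `pipe` helper built on `functools.reduce`; the bodies of `parse_people`, `filter_humans`, and `format_names` and the sample data are made up purely for illustration.

```python
import json
from functools import reduce

def pipe(*funcs):
    """Compose functions left to right: pipe(f, g)(x) == g(f(x))."""
    return lambda value: reduce(lambda acc, func: func(acc), funcs, value)

def tap(oh_yeah):
    # Side effect only: print the value, then hand it back untouched.
    print("oh_yeah:", oh_yeah)
    return oh_yeah

# Hypothetical stand-ins for the real pipeline steps.
def parse_people(data):
    return data["people"]

def filter_humans(people):
    return [person for person in people if person["type"] == "human"]

def format_names(people):
    return [person["name"].title() for person in people]

# Same shape as the pipeline above, with tap between each step.
parse = pipe(json.loads, tap, parse_people, tap, filter_humans, tap, format_names)

sample = '{"people": [{"name": "jesse", "type": "human"}, {"name": "albus", "type": "dog"}]}'
names = parse(sample)
print(names)
```

Because `tap` returns exactly what it was given, you can add or remove it anywhere in the chain without changing the final result.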

Sick, ya? Now the normal way to do that in Step Functions is via a Pass state:

"Tap": {
  "Type": "Pass",
  "Next": "NextStep"
}

You can then see the output in the Step Function console, or tweak the inputs/outputs if you wish.

… but that’s for Lambda, Step Functions, SQS, and SNS steps that not only return a result like a pure function, but do so using your JSON. In JavaScript, it’s pretty easy to merge properties into whatever you were given:

const addYoSelf = json =>
  ({ ...json, name: "Jesse was here" })

Batch, and even Step Functions, don’t always work like this. You typically have to define Parameters manually depending upon the input… meaning you lose your JSON tree. 😢

… UNLESS it’s just 1 branch. You can create siblings whose results are collected in a single spot using the Parallel state.

Think Promise.all in JavaScript or asyncio.gather in Python: multiple results, collected into a single Array. This allows you to keep context in at least 1 spot using a Pass, and you don’t care about the rest.

const a = { some: 'stuff' } // any JSON you want to keep
const identity = stuff => stuff
const addSelf = stuff => ({ ...stuff, name: 'Jesse' })
const destroy = stuff => undefined
const end = a => Promise.all([identity(a), addSelf(a), destroy(a)])

Notice how end will result in an Array like:
[stuff, stuff + name Jesse, undefined]
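For the Python folks, roughly the same thing can be sketched with `asyncio.gather`. The function names mirror the JavaScript above (`add_self` for `addSelf`), and the `{"uno": "yup"}` input is just a placeholder.

```python
import asyncio

async def identity(stuff):
    return stuff

async def add_self(stuff):
    return {**stuff, "name": "Jesse"}

async def destroy(stuff):
    # A noop: side effects only, nothing useful comes back.
    return None

async def end(a):
    # Like a Parallel state: one result per branch, in branch order.
    return await asyncio.gather(identity(a), add_self(a), destroy(a))

results = asyncio.run(end({"uno": "yup"}))
print(results)
```

The first slot still holds the untouched input, the last slot holds `None`, Python's version of `undefined`.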

Undefined is often what we get with noops that return no value. BUT, we need the first part to continue our Step Function with “context”. Using a Parallel + Pass, you can keep that context.
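To make the mechanics concrete, here is a rough Python model of what happens. This is not the Step Functions API, just plain functions standing in for states; `parallel`, `keep_first`, the two branch functions, and the sample `context` are all hypothetical names for illustration.

```python
import copy

def parallel(branches, state_input):
    # A Parallel state hands each branch its own copy of the input
    # and outputs a list with one entry per branch, in branch order.
    return [branch(copy.deepcopy(state_input)) for branch in branches]

def good_branch(state_input):
    # Stands in for a Pass state: the context flows straight through.
    return state_input

def noop_branch(state_input):
    # Stands in for the noop (say, a Batch job): lots of side effects,
    # but no output we actually want to carry forward.
    return "SUCCESS"

def keep_first(parallel_output):
    # Picks the first branch's output, dropping the rest.
    return parallel_output[0]

context = {"people": ["Jesse", "Albus"], "batchId": 42}
branch_outputs = parallel([good_branch, noop_branch], context)
output = keep_first(branch_outputs)
print(output)
```

The noop branch still runs, but only the first branch's output survives into the next state.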

Until Batch gets the .waitForTaskToken feature (I mean… ECS has it…), you can wrap her in a Parallel to keep context. Just watch those catches….

tl;dr: to keep your JSON, wrap the noop in a Parallel, and in the next state go:

"OutputPath": "$[0]"

BTW, I know you can use ResultPath and just attach the result to your existing JSON, but the AWS Batch response is HUGE, and we’re already super close to the JSON size limit thanks to an immense Array that I refuse to read from S3. You could attach the result and then strip it off in a Pass state, but sadly you’ll get a size exception before that happens. Twice now I’ve gotten an exception when the Batch job completes because the JSON is too large, and this technique has worked every time.
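If you want to sanity check how close you are, you can measure the payloads yourself. This sketch assumes a 256 KiB state payload quota (swap in whatever limit applies to your account) and uses made-up data; `payload_size`, `fits`, `big_context`, and `batch_result` are hypothetical names, not anything from AWS.

```python
import json

PAYLOAD_LIMIT_BYTES = 256 * 1024  # assumed quota; verify against the AWS docs

def payload_size(payload):
    # Size of the JSON once serialized as UTF-8.
    return len(json.dumps(payload).encode("utf-8"))

def fits(payload, limit=PAYLOAD_LIMIT_BYTES):
    return payload_size(payload) <= limit

# Made-up data: a context already near the limit, plus a large noop response.
big_context = {"items": ["x" * 100] * 2400}
batch_result = {"Container": {"Environment": ["y" * 100] * 400}}

# ResultPath style: the response gets attached to the existing JSON.
merged = {**big_context, "batchResult": batch_result}

print(payload_size(big_context), fits(big_context))
print(payload_size(merged), fits(merged))
```

With these numbers the context alone squeaks under the assumed limit, while the merged version goes over, which is the situation where dropping the noop's output entirely pays off.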

And here’s the Step Function JSON for the above (supply your own Lambda):

{
  "Comment": "A Hello World example of the Amazon States Language using Pass states",
  "StartAt": "Both",
  "States": {
    "Both": {
      "Type": "Parallel",
      "Next": "Keep JSON",
      "Branches": [
        {
          "StartAt": "Good Function Returns Value",
          "States": {
            "Good Function Returns Value": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:0000000:function:jesse-identity",
              "Parameters": {
                "uno": "yup"
              },
              "ResultPath": "$.datLambdaResult",
              "End": true
            }
          }
        },
        {
          "StartAt": "Bad Noop",
          "States": {
            "Bad Noop": {
              "Type": "Pass",
              "Result": "World",
              "End": true
            }
          }
        }
      ]
    },
    "Keep JSON": {
      "Type": "Pass",
      "OutputPath": "$[0]",
      "End": true
    }
  }
}