Chicago and London TDD Styles for Functional Programming

If you’re a Functional Programmer in an interview & an Object Oriented dev asks “Do you FP peeps use Chicago or London style?” the tl;dr; answer is “Both.”

Dave has great coverage of the 2 styles. Let’s cover the nuances as an FP dev… 🧵

FP devs don’t do Behavioral Verification in unit tests. In pure functional languages, you can’t; there isn’t any state to verify. All we do is state verification. While there “is no state” in FP languages, we’re still testing function return values like our OOP brethren.

While it says “State Verification” on the tin, we’re actually verifying return values from functions, not actual state. In pure FP languages, there is no state to verify. Like OOP devs, though, we both care “Did we get a thing back, and are its properties correct?”

FP devs don’t use Mocks, only Stubs in unit tests. You cannot use Mocks because there are no side effects, nor runtime exceptions, nor internal state to store. There are no expectations for Mocks to run because there are no runtime exceptions that would explode your tests.

However, like our OOP homies, we use Dependency Injection to pass Stubs to our functions. We expect these passed Stubs to change the return values. OOP devs expect these Mocks to report if they were used right. FP devs inspect the return value to accomplish the same.
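Here's a minimal sketch of that idea in Python, assuming a made-up `greeting_for` function and a made-up `fetch_name` dependency (neither comes from a real library). The stub just returns a canned value; the test only looks at what comes back.

```python
from typing import Callable


# Hypothetical function under test: the name lookup is injected as a parameter.
def greeting_for(user_id: int, fetch_name: Callable[[int], str]) -> str:
    name = fetch_name(user_id)
    return f"Hello, {name}!"


# A Stub: returns a canned value. It records nothing and expects nothing.
def stub_fetch_name(user_id: int) -> str:
    return "Ada"


# We inspect the return value instead of asking a Mock how it was called.
result = greeting_for(42, stub_fetch_name)
assert result == "Hello, Ada!"
```

If the stub had been ignored or used wrong, the return value would be wrong, which is how the test finds out.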

FP devs do behavioral validation in integration/e2e tests. This is where the actual side effects happen and we validate the effects of those side effects. Pure languages like Elm and Roc have no side effects, so you can’t do behavioral validation in the language itself.

London School says all dependencies are collaborators, & should be stubbed/mocked. In pure FP, we usually don’t because they’re all pure functions, have no side effects, and are deterministic. We just test the return value. However, we still sometimes stub them to make fixtures easier.

Both OOP and FP still have the challenge of “creating Big Things”, like a large JSON fixture, a big OOP Class Value Object or FP type/Discriminated Union/Record with a ton of properties and values. A Stub whether an OOP Factory or FP function helps; same goal, same solution.
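As a rough sketch of the FP function flavor of this, assuming a hypothetical `Order` record and a hypothetical `stub_order` helper: the stub fills in sensible defaults so each test only spells out the fields it cares about.

```python
from dataclasses import dataclass, replace


# Hypothetical "Big Thing": a record with a bunch of properties.
@dataclass(frozen=True)
class Order:
    id: str
    customer_email: str
    currency: str
    total_cents: int
    shipping_country: str
    gift_wrapped: bool


# Fixture stub: builds a valid Order, letting a test override only what matters.
def stub_order(**overrides) -> Order:
    base = Order(
        id="order-1",
        customer_email="test@example.com",
        currency="USD",
        total_cents=1000,
        shipping_country="US",
        gift_wrapped=False,
    )
    return replace(base, **overrides)


# Hypothetical pure function under test.
def is_free_shipping(order: Order) -> bool:
    return order.total_cents >= 5000 and order.shipping_country == "US"


assert is_free_shipping(stub_order(total_cents=5000)) is True
assert is_free_shipping(stub_order()) is False
```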

London Style is careful about these collaborators for many reasons, 2 being external state & more importantly top down design. FP devs don’t have the state problem, but we have the exact same design problem. Sometimes top down, especially for monolith AWS Lambdas, helps a lot.

Martin Fowler says in his article Mocks Aren’t Stubs that OOP London Style Mockists struggle thinking about implementation ahead of time. An FP dev’s entire program is pure functions wired together; figuring out how it gets that return value is a struggle, like OOP devs thinking about behavior.

This is why Chicago style is how most FP tutorials are written, whether for learning pure FP languages or in impure languages like JavaScript/Python. We hand wave away side effects, all functions are pure, easier to test, and maybe only a few have stubs.

This allows FP devs, like OOP, to build the parts like TDD classics from little pure functions that compose with each other into a larger program, and voila, we arrive at one that finishes our user story/feature.

Chicago Style is the best way to learn unit testing in FP if you’re a n00b. This is because the rules are the same everywhere:

Step 1: call a function with inputs

Step 2: assert return value is what you expect with those inputs.

No forethought into implementation like London.
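Those 2 steps can be sketched like this, using a made-up `add_tax` function purely for illustration:

```python
# Hypothetical pure function: same inputs always give the same output.
def add_tax(price_cents: int, tax_percent: int) -> int:
    return price_cents + (price_cents * tax_percent) // 100


# Step 1: call a function with inputs
total = add_tax(1000, 8)

# Step 2: assert return value is what you expect with those inputs
assert total == 1080
```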

FP devs have the same pros/cons: London Style makes it easier to see “how it works”, but it’s not very clear “where exactly the logic failed”. Chicago makes it very clear which function is incorrect, but has a larger test surface to fix when we change a type.

I’m guilty of writing fewer Chicago style tests, and instead allowing the Elm/ReScript compilers to handle most of what a larger Chicago test suite would handle, just to avoid that “change all the tests later” pain.

This leads to more London style tests that lean on the compiler to find what a Chicago test normally would deep in nested function implementations. For types, great. For logic? Not good.

When Chicago OOP devs say “change in Object”, they’re often talking about mutable state. There is no mutable state in FP languages. However, we’re both interested in the same thing: if I update a Person class/object/type’s Address, did it update the way we expect it to?

In OOP, that’d often be an assertion on the data of the Object itself, using a method of a class to tell us, or a method of the Object instance itself.

In FP, that’d be asserting the data that is returned from the function is what you expect.
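A small sketch of the FP side, assuming hypothetical `Person` and `Address` records and a hypothetical `update_address` function. Nothing is mutated; we get a new Person back and assert on that.

```python
from dataclasses import dataclass, replace


# Hypothetical immutable records (frozen, so no field can be reassigned).
@dataclass(frozen=True)
class Address:
    street: str
    city: str


@dataclass(frozen=True)
class Person:
    name: str
    address: Address


# Pure "update": returns a new Person rather than changing the old one.
def update_address(person: Person, new_address: Address) -> Person:
    return replace(person, address=new_address)


before = Person("Ada", Address("1 Old Rd", "London"))
after = update_address(before, Address("2 New St", "Chicago"))

# Assert on the returned data, not on mutated state.
assert after.address == Address("2 New St", "Chicago")
assert after.name == "Ada"
# The original value is untouched.
assert before.address.city == "London"
```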

Chicago style uses multiple classes from multiple modules. The FP version of that is multiple functions/types from multiple modules. This is where the ambiguity of the “integration” test comes in. To London OOP devs, those other modules/classes should be mocked.

FP devs differ based on language here. Elm, like F#, tends to encourage “a bunch of functions and types in a file”. While Elm supports modules, we don’t really care where it came from; they’re all pure, all deterministic, the compiler tells us if it works.

Thus the idea of stubbing those away for isolation often doesn’t make sense. When testing domain logic, however, like OOP devs who practice Hexagonal/Onion Architecture, absolutely. FP devs have 2 levels of integration; other functions/types & side effects vs London’s definition.

Asking “What is a unit?” to FP devs, you’ll get the following answer: The function I’m calling + whatever types I need. London style will just have more stubbed functions and/or records. This can lead to the same problem OOP devs have: too many function stubs.

This has 2 problems. First, you can have 100% test coverage and passing green tests, but your application doesn’t work. Secondly, the tests can be hard to read because there’s a lot of stub / fixture setup. While Stubs tend to be much smaller, the negative result is the same as lots of messy Mocks in OOP.
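To make the first problem concrete, here's a contrived sketch in a hybrid language (Python), with made-up names throughout. The stub quietly disagrees with the real collaborator about units, so the unit test stays green while the wired-up program is wrong.

```python
from typing import Callable


# Hypothetical real collaborator: returns the price in cents.
def real_parse_price(text: str) -> int:
    return round(float(text) * 100)


# Hypothetical stub written with the wrong assumption: it pretends dollars.
def stub_parse_price(text: str) -> int:
    return 5


# Function under test: formats whatever number the collaborator returns.
def total_label(text: str, parse_price: Callable[[str], int]) -> str:
    return f"${parse_price(text)}"


# Green unit test with the stub...
assert total_label("5.00", stub_parse_price) == "$5"

# ...but the real wiring produces something else entirely.
assert total_label("5.00", real_parse_price) == "$500"
```

In this sketch both functions return a plain `int`, so a type checker wouldn't flag the mismatch either; only a test that uses the real collaborator would.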

London style helps a lot in FP when you want to build a feature, & same as OOP, play w/ideas without worry of how it’ll actually be implemented. Your stubs evolve in your tests over time. Instead of Spies in Acceptance Tests, we just have Stubs that coerce a return value we need.

Like OOP, the London style helps us design “where do we want our behavior to go; what types in what functions, where?” Here, “behavior” just means the “functions and types that return the value we need”. This style gives you the freedom to improve your design in small steps.

In OOP, that’s deciding what class abstracts what, and who handles what logic. In FP, that’s deciding what function abstracts what, who composes what function(s), & what types we need. Like OOP, these tests may show we may want to create more modules to organize larger sections.

While Chicago is easier to learn & start TDD in FP, it can be harder to build programs. London helps by forcing you to write a test for your function that is the program, similar to the Main, top level class in OOP programs.

London works the same for both paradigms.

OOP: “I don’t know what I’m building, I’ll just start with a test that asserts our user story works.”

FP: “I don’t know what I’m building, I’ll just start with a test that asserts our user story works”.

OOP often asserts a method or Mock expectation works. FP asserts the return value is what you want. Both help start us on a path of designing our app, answering high level questions, fleshing out ideas we can play with, and iterate on from there in a red/green/refactor way.
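A hedged sketch of what that first top-down FP test might look like, assuming a hypothetical signup user story. `handle_signup`, `validate_email`, and `build_welcome` are invented names; the collaborators don't exist yet, so the test passes in stubs and asserts on the return value.

```python
from typing import Callable


# Hypothetical "program" function: wires collaborators together, returns a value.
def handle_signup(
    email: str,
    validate_email: Callable[[str], bool],
    build_welcome: Callable[[str], str],
) -> dict:
    if not validate_email(email):
        return {"status": "rejected", "message": "invalid email"}
    return {"status": "accepted", "message": build_welcome(email)}


# Stubs that coerce the return values we need for each scenario.
def always_valid(_email: str) -> bool:
    return True


def never_valid(_email: str) -> bool:
    return False


def canned_welcome(_email: str) -> str:
    return "welcome!"


# The user story, asserted purely through return values.
assert handle_signup("a@example.com", always_valid, canned_welcome) == {
    "status": "accepted",
    "message": "welcome!",
}
assert handle_signup("nope", never_valid, canned_welcome) == {
    "status": "rejected",
    "message": "invalid email",
}
```

The real `validate_email` and `build_welcome` can then be built and tested on their own, and the parameter list of `handle_signup` shows how many stubs the design currently demands.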

The challenge w/London in OOP is abstracting those boundaries. In FP, you end up with the same problems of “5 stubbed functions to dependency inject my main function”. Unlike Dave, I don’t have a rule of thumb of number of stubs yet.

Even in pure, no side effect languages like Elm & Roc, we try to push side effects to the side, like Gary Bernhardt’s Functional Core, Imperative Shell.

This is way harder in FP languages that do allow side effects, like Haskell/Elixir/F#.

Harder still in hybrid languages like JavaScript, Python, and Ruby. To Dave’s point, the increase in stubs for pure, no effect languages & stubs for side effectual things in hybrid ones, really makes us constantly revisit our design, and think about the implications.
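For a hybrid language, the Functional Core, Imperative Shell split mentioned above could look roughly like this. It's a sketch with made-up names (`summarize`, `run`), not a prescribed layout: the core is pure and unit tested by return value, while the shell only moves data in and out.

```python
import json
from typing import Callable


# Functional core: pure, deterministic, easy to unit test.
def summarize(raw_json: str) -> str:
    data = json.loads(raw_json)
    total = sum(item["price"] for item in data["items"])
    return f"{len(data['items'])} items, total {total}"


# Imperative shell: all side effects live at the edge and are passed in.
def run(read_text: Callable[[], str], write_text: Callable[[str], None]) -> None:
    write_text(summarize(read_text()))


# Unit test for the core: inputs in, return value out.
assert summarize('{"items": [{"price": 2}, {"price": 3}]}') == "2 items, total 5"

# In the real program the shell would get real effects, e.g. reading a file
# and printing. Here a canned reader stands in for the file.
run(lambda: '{"items": []}', print)
```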

In conclusion, remember that pure FP languages don’t have behavior verification, so acceptance tests are of utmost importance to get early, unlike OOP which can use Mocks to get moving. This means white box Cypress tests in Elm, and pre/post hooks in ReScript AWS Lambdas.

The benefit of FP here is you get to focus less on weird type/state bugs, and more on correctness. The downside is it’s more upfront work on your CI/CD pipeline. However, you need both white box and black box tests to clearly distinguish those boundaries.

White box test good? Your good code is good. Black box test fail? An interface outside your world changed. OOP would just update their mocks; for us, same thing, we update our stubs or types. The good news is our FP unit tests aren’t flaky, but our acceptance tests can be.

For Elm and Roc, your correctness tests, often written in Chicago style, are the most important to write. However, the London style ensures you aren’t violating YAGNI too much. Once your language has side effects, stubs help a lot here.

If you’re using a hybrid language, everything in Dave’s video applies. Just because you’re using “simpler stubs vs. mocks with built in expectations” doesn’t mean you’re safe from a bad design and getting tests that are unmanageable. Be diligent in your design iterations.