The REGAL Architecture

Introduction

The REGAL Architecture is a tech stack brought together using industry best practices and modern tooling for building web applications. If you want to build, deploy, and host a full stack web application with little to no runtime exceptions, minuscule downtime, and confident, yet reasonably fast iteration, REGAL is for you. Nothing is invented here, it’s just a combination of existing technology brought together in a cohesive tech stack.

REGAL stands for the core technologies used in the tech stack:

It’s early days, but we’ve successfully deployed 2 applications to production at my job, so there is enough information and repeatable patterns that will help other teams. Additionally, there are some exciting technologies coming that can improve the architecture in the future.

In this post, we’ll cover:

  1. what the REGAL stack is
  2. why each technology was chosen
  3. how each fits together in a cohesive architecture
  4. what options you have for changing certain parts and possible pro/con effects
  5. Comparison to other tech stacks to help with context & understanding options
  6. Future improvements

While you may not adopt, or even agree with the entire stack, I guarantee you that you’ll find either some technology or practices within that you can adopt to make your current software projects better.

The REGAL 5 Philosophy

Despite what you read on social media, blogs, and books saying programming is trying to be like engineering, programming is actually an art. As such, even the most talented programmers I’ve met have philosophies about how they approach programming, and these drive all their technical choices. These beliefs can also change over time. We have 8,000+ programming languages, existing ones are modified, and new ones are continually created. People want to create in a way that feels right to them. Keyword “feels”. As we change, programming changes, and in turn it changes us, and the cycle repeats.

While qualitative data is legit, so too should we take a rigorous, scientific engineering approach to what we do as best we can. It’s hard to get good data, and have that data be corroborated, in programming. Meaning there aren’t many studies that are conclusive and corroborated enough that everyone in the programming community goes “Oh, that’s a fact, we should do that”. But we DO have those doing good studies and sharing that data. We should try to incorporate what we can.

REGAL combines both my philosophy on programming as well as industry best practices based on what data we have. These are, in order:

  1. Fast Iteration: Speed is key
  2. Correctness: Code when compiled should be as correct as possible
  3. Strictly Typed Functional: we practice soundly typed functional programming and apply that thinking to our architectures vs. procedural, object oriented, and lack of types
  4. Serverless First: we architect with serverless architecture in mind over statefull, serverfull architectures
  5. Trunk Based CICD: we believe in trunk based continuous integration with heavy automated testing to ensure we can continuously deliver to production multiple times a day, confidently.

Let’s cover each concept and how it applies to the technologies chosen, and not chosen.

Fast Iteration

One of the most effective ways to develop software well is fast feedback loops. Write some code, test it, learn, repeat. The slower it takes to see 1 of line code work both locally and on a development environment, the worse your ability to write good code becomes. “Feedback Looks” are well discussed here in Tim Cochran’s article on Maximizing Developer Effectiveness from Thoughtworks.

Nothing makes a code base more horrible to work with than a slow feedback loop, regardless of any other positive. As such, I view it as the most important, and thus it’s number 1.

Functional languages like Haskell, Rust, Elixir, and Scala tick the Correctness, Functional first philosophies. However, they’re build times & deployment methodologies are are slow. They’re technically “fast” if you speak to a developer experienced in those technologies, but they’re too slow for me. Haskell and Rust have notoriously slow compile times. It’s extremely common to deploy all 4 to servers vs. serverless architectures, which often uses Docker which is also slow. Utilizing languages like TypeScript in a functional way can be done, but TypeScript compilation is notoriously slow on larger code bases or when using types over interfaces, hence the rise of native alternatives like Vite, or built-in support such as Deno and Bun.

ReScript uses OCAML under the hood which is the fastest compiler in the world. It also has prior art of positively impacting communities, such as the MTASC compiler in the ActionScript of days past. As long as you ensure all your functions eventually have type definitions on top, Elm is also near instant. Although CloudFormation is notoriously slow, you can utilize the AWS SDK, whether in various frameworks like Serverless’s deploy function, or AWS SAM’s Accelerate to deploy your code to a QA environment in seconds. The ramp up time for new Lambda functions to be operational in an eventually consistent way is also near instant to seconds. This includes being linked to an existing AppSync setup. Finally, Amplify’s cache busting for deploying a new UI change is near instant as well.

EC2’s take minutes to spin up, ECS/EKS take many minutes to bring new containers/pods online, and rollbacks are slow as well.

Finally, utilizing Node.js with limited libraries means your Lambda deployment artifacts are small and fast to deploy and do not slow down your Lambda starts. Admittedly, this is a harder thing to do in the Node.js ecosystem as libraries are both it’s strength (there are many) and weakness (you tend to use many).

Correctness

There are basically 3 types of errors:

  • syntax
  • null pointers
  • logic

Syntax errors are notorious in dynamic languages since there is no compiler. Thus you write code, have no idea if it’s right, and just run it to see if it is. The pro’s here is dynamic languages tend to be fast so this process of “does it work? does it work? does it work?” can be done many times a minute. As code bases grow, however, this starts to get tedious and error prone. Many languages have linting tools to help with this like ESLint for JavaScript and Black for Python which adds tooling complexity, but with the pro meaning you may keep the same speed of iteration.

If your code as a definite shape, a compiler can help you. Things like Haskell, Scala/Gleam, TypeScript, Python Typings with mypy, and Rust have really good compilers to get you “close” to correctness. Meaning, you’ll no longer have syntax errors when you run your code.

Null pointers, however, are trickier. These are just the status quo for most languages, and many either don’t even try (Python/JavaScript/Lua), make it your problem (Go, Java, C#), try to provide alternatives to null (OCAML, F#, ReScript, Rust), or just don’t include them at all in the language making them impossible (Elm, Haskell). You only get a correctness guarantee when you compile and the code works. Null pointers surprising you does not count in our philosophy of working and being correct.

Elm offers that for the UI. ReScript doesn’t offer that for the server, but is close, and way safer than TypeScript/Python/Elixir. Rust could work but we’re building BFF’s or serverless architectures, not requiring any of the low-level features Rust provides. It’d be better to use Haskell or Roc. We’ll explain more on why those weren’t chosen later.

Lastly, if you remove syntax errors and null pointers, the only thing left is logic errors. Examples include “if we arrive at 5:00pm and the plane’s final boarding is at 5:00pm, why is it saying we’re late?” This is because the function doesn’t include inclusive numbering; it uses greater than (>) instead of greater than or equal (>=). Simple mistake, but the entire application has no syntax errors, no null pointers, but also doesn’t work correctly. These are what we should be spending our time testing, both automated and manually, because they’re the most difficult to get right, and what we hopefully often iterate on to get better at. Elm satisfies that and ReScript mostly satisfies that.

Remember, regardless of tooling, a code base is only as correct as a programmer is capable of articulating what “correct” means, and we’re horrible at that. Best to focus on those problems, and let tooling fix the other 2 (syntax and null pointers) so they’re no longer a problem anymore and we can focus on the important stuff.

Strictly Typed Functional

Pure functions are predictable; they either work or they don’t, and their straightforward to unit test. We like those and we like testing.

Immutable data is predictable, removes a whole set of mutation bugs, and aligns with how you architect things in serverless/reactive architectures. We like that alignment.

Types allow you to remove a whole set of common bugs and provide a language to model things given there are no classes in functional programming. Functional types, however, are more likely to provide correctness guarantees compared to ones like in Go, Java, C#, or TypeScript. We like a host of bugs never happening and modelling data.

Obviously Go, Java, C#, Python, and TypeScript are obviously out, being based around allowing things like untyped Object and null pointers. TypeScript has some wonderful functional types, but there are too many escape hatches implicit in the language, resulting in things being unsafe. TypeScript’s also made wonderful in-roads in supporting record types as well as sum types with reasonable strictness checking when you pattern match use them in a switch statement with strict on. However, TypeScript overall is very OOP focused, and the types are quite verbose for basic modelling when compared to Elm or ReScript.

ReScript, Elm, Rust, Haskell, Idris, and Scala using something like Zio/Cats/Salaz all offer wonderful sound (as opposed to strict) typing capabilities. Elixir is functional, but has no types (yet, there is a proposal for it), but Gleam does. F# and OCAML have extremely good types, but still allow classes and null pointers and exceptions “if you want too”. bruh. Sadly so does ReScript.

Elm and ReScript, as well as GraphQL, allow an extremely similar looking syntax to define both records (and) and sum types (or). This means your types are safe, as well as readable across the tech stack layers with not as much context switching. Remember, types are also a burden.

We like “when it compiles, it works”. By “works” we mean, any bugs that crop up are logic ones, not avoidable syntax or null pointers.

Serverless First

You either like mucking around with infrastructure, or you don’t. I don’t. I enjoy outsourcing uptime to AWS so I can focus on coding. To benefit most from that, it helps to learn how best to architect using serverless technologies. REGAL’s core philosophy is “you’re deploying to AWS so it can manage your UI and API”.

Trunk Based CICD

Feature Branches works for distributed teams who don’t know, and/or trust each other. Trunk based works for those that do know and trust each other. PR’s slow down building code and create many sources of truth. Trunk based means you aren’t waiting on anyone, git source control eases merging, and all developers have 1 source of code truth.

This also means you should be making many frequent commits a day. Elm & ReScript’s compilers, while fast and good, both encourage small, incremental changes. Elm popularized fearless refactoring, while ReScript certainly comes close. Combine this with automated unit, integration, and end to end tests in your pipeline, and you have a more confident way of getting to production many times a day. Practices like Test Driven Development & Pair Programming are encouraged, but not required.

It’s assumed the basics of CICD are followed:

  • many commits by a developer per day
  • all code quality checks are automated
  • this includes linting, testing, security, and deploying to various environments
  • rollbacks and/or green/blue or canary builds should be a simple, single button push
  • developers are continuously merging and pushing code, and the main branch is the source of truth
  • its ok if a production push requires a manual approval step
  • There are no Peer Review/Merge Requests; code is continually reviewed by developers on their own as well as in pair programming and mob sessions

The REGAL Technology Stack Choices

Let’s go in order of importance vs. letter position.

GraphQL

There are many choices creating client server architectures. REST, gRPC, etc. GraphQL was chosen for the following reasons:

  1. it uses a modern typing system consisting of ands (records) and ors (enums & unions) types allowing for rich domain modelling in a functional way.
  2. Modelling your business domain using the real language of the business borrows ideas from Domain Driven Design (not the copious amounts of code and abstraction) to meet the “Correctness” philosophy. The should be 1 source of truth for “what a thing is”, and using GraphQL ensures both the client and the server agree.
  3. Domain Modelling is important, but so too are compiler and runtime type checking errors. GraphQL ensures that the types on the client are the same as the server. Many mistakes are easy to making when you go “outside your type system”, and REST is notorious for this. GraphQL ensures the types are the same, whether on the client, on the server, or over the wire.
  4. AWS has a managed hosting GraphQL offering that aligns with the Serverless First Philosophy
  5. It’s atop REST so we can still use browser based REST tools to debug some of our queries. GraphQL is easier to read than encoded Protobuf.

AWS AppSync

AWS AppSync is their managed hosting of a GraphQL API. Instead of writing some kind of server like Apollo, then wrapping in a container and deploying so ECS/K8’s, you can instead just “let AWS handle all that”.

AppSync does a lot, but the valuable things are as follows:

  1. serverless GraphQL server
  2. hosts your schema and validates it for incoming and outgoing requests with common errors (no code is required, this just works)
  3. provides 3 basic forms of authentication including token, “whatever your Lambda says”, and JSON Web Token with automatic Authorization header parsing
  4. options to link your schema’s queries and mutations to Lambda(s)
  5. options to link your schema’s individual fields to various data sources including Lambda, DynamoDB, HTTP, and others.
  6. Automatically includes a CloudFront distribution with optional Route53 integration.
  7. All Lambdas get enough query information in their event Object to determine any context you need about the query, how to respond, and with what data.
  8. Built-in data or session base caching with ability to invalidate.

Not having to host and babysit a server, manually wire up CloudFront/CloudFlare, and writing various authentication and routing code in Apollo is amazing. Having it be managed by AWS, and serverless by default is even more amazing. This is truly the BFF (back-end for front-end) of the future.

AWS Amplify

If you have a BFF, that means you have a front-end. If you have a front-end, you gotta host it somewhere. Amplify is an AWS managed service built for hosting Single Page Applications. It abstracts away all the existing serverless tech you’d traditionally use on AWS into a single place, automating most of it. S3 static assets, cache busting on deploy, and it even abstracts its own build pipeline using CodeDeploy sourced right from your code repository. Like AppSync, it creates a CloudFront distribution for you, and optionally provides automatic Route53 creation if you want at full URL.

There are other various features, but the only one that really matters is “I need to host a website on AWS and want it to be easy to setup”. If you’ve ever reached the limit of public assets on S3 and had to go through ALB’s and EC2’s… Amplify is a massive breathe of fresh air.

If you’re not in a highly regulated environment, the Amplify CLI allows you to create a monorepo of both Amplify and AppSync using the commandline.

Some caveats on CodeDeploy & AppSync below.

AWS Lambda

Following the Serverless First philosophy combined with Functional Thinking, Lambda ticks both boxes. AWS Lambda manages your code; you just turn a bunch of scaling and configuration knobs. No servers to babysit, no containers to manage, just code.

And by “just code” we mean “just functions”. Functional Programming is all about worshiping at the “Church of Pure Functions”, and we do our best to make sure our Lambdas have a functional core with an imperative shell. The AWS Lambda contract is as follows:

  • the input is whatever the trigger is (in this case, most are AppSync giving it a GraphQL query)
  • the output is either the GraphQL query type response, error(s), or both.
  • An exception

AWS handles runtime errors different ways depending on the trigger, but they’re encouraged in most services (AppSync is kind of the exception to the rule here). Traditionally, things like SQS, SNS, or Kinesis use a runtime exception as a signal to retry. API Gateway/ALB’s use them as a signal for a 500 response. Step Functions have the most freedom.

AppSync, however, is interesting because it’ll eventually become an HTTP response so you’ll most likely get a 500. However, the GraphQL spec allows both a GraphQL query data response and an error in the response.

In Functional Programming, you don’t have exceptions, but a Result or an Either being returned from the function. While returning both might not make sense, it’s possible by returning an Array or Tuple containing both, similar to how Go and Lua can return both data and error in a single function call.

The FP thinking philosophy works here because we do NOT want to ever throw exceptions, and instead return results that AppSync can understand. This is much easier in FP languages. However, ReScript has caveats which we’ll discuss later. Despite this, Lambda being based on “Lambda calculus”, a pure function with a single input with a single output and no side-effects, is like a religious calling to Functional Programmers to get into Serverless.

This ensures all of your back-end calls aren’t in a giant Express.js style codebase, but rather in a small set of microservices of reasonably sized functions that can be in a monorepo, but individually deployed & tested.

ReScript

ReScript is probably the only part of the REGAL tech stack I’m not fully content with. I’ve written and spoken about why I’ve chosen ReScript over other alternatives. However, it is first in the name, and forms the 2nd most important position in the tech stack, so I’m committed.

ReScript is a language and compiler. You write in ReScript and it compiles to JavaScript, much like you write in TypeScript and it compiles to JavaScript. Like TypeScript, ReScript supports integration with existing, or new, JavaScript code at runtime. ReScript was chosen for the following reasons:

  1. next to OCAML, it’s the fastest compiler on the planet; this follows the Fast Iteration philosophy.
  2. It’s soundly typed functional programming; this follows the Functional Programming and Correctness philosophies.
  3. It compiles to JavaScript which allows us to use Node.js based Lambdas which, next to Python, are some of the fastest to run (for low-latency, short time based functions; for batch or workers, I’d much prefer Go/Rust/F#). This also allows us to leverage the swath of JavaScript libraries out there including the AWS SDK for Node.js. This greatly helps the Serverless First philosophy.

Scala uses the JVM and while it is natively supported, the JVM is too slow & heavyweight for small functions. Rust’s compiler is too slow and we are mostly doing I/O calls with simple text parsing so not using Rust’s low-level capabilities. F# is amazing and .NET is natively supported, but again, .NET isn’t as fast as Node.js, the F# documentation for both the language and the AWS SDK is horrible, and the AWS SDK for F# is basically bindings on C# as far as I can tell. Haskell requires a custom runtime (serverless first, remember, no containers). Same for OCAML. TypeScript has many FP facilities & libraries, but its compiler is slow and the types are more verbose for less guarantees.

Elm

Elm is a language, compiler, and package manager for building web applications. You write in Elm, and it compiles to JavaScript and HTML. If I could use Elm on the back-end, I would, but I can’t, so I use ReScript there.

Elm was chosen for the following reasons:

  1. Fast compiler. I could use TypeScript with FP libraries, but it’s slow. As long as you use type definitions on all your functions, Elm remains fast. This follows the Fast Iteration philosophy.
  2. Soundly typed functional language. This ticks the Functional Thinking philosophy.
  3. No runtime errors or null pointer exceptions. This ticks the Correctness philosophy.
  4. The compiler is so good, it enables “fearless refactoring”. While you should do small, incremental changes, you certainly don’t have too. This works well with the Trunk Based CICD philosophy because many developers can make copious changes to the code base and the compiler has your back.
  5. The elm-graphql library enables code generation. This enables us to extend the “fearless refactoring” across the entire stack. As we build, learn, and eventually change our domain model from those learning’s, we can regenerate the front-end code needed, and the compiler lets us know what to modify (I still haven’t found something this powerful for ReScript yet).
  6. Elm has no side effects. This makes practicing Test Driven Development a lot easier because no Mocks/Spies are needed in unit tests because all functions are pure. You should still avoid using String and utilize fuzz/property tests where you can.

Cohesive Architecture

Building a full-stack web application in the REGAL stack is typically 3 steps, often done in various orders, multiple times.

Domain Modelling

You model your domain in GraphQL by creating queries, mutations, and types in your shcema.graphql file. This is not set in stone and will change as you learn.

Build & Deploy Your BFF

You deploy your schema to AppSync using Serverless Framework. You write your Lambdas in ReScript to take in typed queries using Jzon, and export out typed GraphQL responses using Jzon. You link these Lambdas to your GraphQL queries and/or mutations via AppSync. Your UI makes HTTP GraphQL calls to your AppSync’s CloudFront URL or Route53 URL. These calls are authenticated by a header token, JSON Web Token, or another Lambda.

Build & Deploy Your UI

In your UI, run elm-graphql to generate all the GraphQL Elm code needed to interface with AppSync in a typed way. You make your Elm GraphQL calls that return data you need to populate your Elm Model. You deploy your working UI multiple times a day by checking into Github/Gitlab. This triggers Amplify to start a build using CodeDeploy. CodeDeploy has a built in Cypress container. This runs your elm test, elm review, and Cypress end to end tests and shows the results of passed tests and downloadable videos. If successful, it’ll deploy to S3 and cache bust CloudFront. You can access this UI via a CloudFront URL or Route53 URL.

Schema Change

GraphQL ensures the types on your UI and API are the same. This is a good and bad thing in compiled languages. The GraphQL spec says GraphQL API’s do not have versions, and GraphQL should be backwards compatible. This is insanely hard to accomplish, and I don’t agree with this philosophy, but have made the compromise because I love the Correctness philosophy gains more then some of the CICD sins we commit. This means as long as you add fields, and add queries and mutations, you’re fine.

As soon as you need to change the name or something, delete a field, or change a field… you’ve broken the API. This means you first need to change your Lambda(s) that use this new data, and ensure your unit tests (using ReTest) pass, and your integration tests (using Mocha & JavaScript) which invoke your Lambdas directly on a QA server still work.

This inevitably will then require you to fix the UI. You do this by running elm-graphql again against your updated schema.graphql file, and the Elm compiler will let you know what you need to update.

This “change GraphQL, fix tests, generate Elm code, fix compiler errors” is a common iteration pattern as you build and learn. This is much easier to do in a monorepo, but may require staggered deployments as your domain model changes calm down. Yes, using staggered deployments like this does NOT follow the CICD philosophy. You can either get your GraphQL right the first time (good luck with that), utilizing a monorepo using Amplify CLI, or Serverless multiple-deploy apps (more on this below), or just suffer through it knowing the compilers have your back (exception for the integration & e2e UI tests… those are in JavaScript, and types really won’t help you here). We recommend a monorepo to solve these problems (I can’t at my job yet, more on that below).

Deployment

UI deployments are a pretty simple affair. Write some code, check it in, and the Amplify CodeDeploy setup will run your tests, and if they pass, deploy. While Amplify can independently deploy to multiple environments based on branches, if we’re following Trunk Based CICD, we can’t do that. Instead, we make a completely separate Amplify stack, 1 for QA, 1 for Stage, and 1 for production on a separate AWS account. At my job, we have QA, Stage, and Production stacks and all pull from the same Gitlab repo. QA and Stage run and deploy at the same time. Production does not have Amplify’s auto-build turned on, and must be manually deployed via a single click in Gitlab pipeline (it just makes a curl call to Amplify to start a build). Amplify CLI handles this all differently and assumes you’re doing feature branches. While you get a monorepo, we violate the Trunk Based CICD philosophy.

API Deployments can use whatever build system you want; CodeDeploy, Jenkins, whatever. We use Gitlab pipelines at work. It has the same setup as the UI; lint, test, integration test. The difference, however, is we deploy to QA first, run integration tests on it, and if those pass, only then do we deploy to Stage. This ensures that Stage remains stable for Product Owners who wish to show the UI or test it themselves without asking a developer “Is there a stable environment I can use?”. Since UI changes rarely break anything, and API’s do, we have a more stringent setup here.

Rollback

While Amplify encourages a fail forward methodology, we use Gitlab pipelines to rollback. All the .gitlab-ci.yml file does is “serverless deploy”; since we only ever change Amplify in the beginning of the project, this Gitlab pipeline doesn’t really do anything. However, you CAN redeploy an older pipeline, and we use this in case a deploy does break something in production and we need to rollback easily.

For API, same thing; each deployment uses a Gitlab pipeline on a commit, so if one is bad, we can rollback. However, we encourage the built-in Lambda green/blue deployments so you never need to rollback.

Options

Below are various options you use to modify how the REGAL architecture can work.

Monorepo

You have a few options here. We use 2 repos at work, 1 for AppSync deployed via Gitlab pipeline and 1 for Amplify where Gitlab just sls deploys and CodeDeploy handles the rest. Staggered deployments do happen if you don’t know your domain model yet, so it encourages you to figure it out early. A monorepo would alleviate this concern and allow you to deploy late into the project while still reserving the right to change your GraphQL API’s.

The Amplify CLI uses this methodology and keeps your schema.graphql in the same code base so both your UI code and API code a single source of truth AND can share code, making the colocation problem of DRY in microservices much easier. We cannot use Amplify CLI at my job because it requires creation of permission admin IAM Roles.

Serverless Framework and AWS SAM both have the capability of using nested applications, yet deploying them as a unit in a single CloudFormation deployment. This would allow both your UI code and API code to co-exist. I abandoned this with AWS SAM because I couldn’t get it to play nice with the ReScript compiler where in Serverless Framework “it just worked” (sls deploy function doesn’t, though).

CSS Framework

We use Tailwind at work because since 90% of your time in Elm views is building HTML tags, the Tailwind classes go right on the HTML, and this makes it super convenient to work and design in Elm. Also, the Tailwind v3 compiler is milliseconds fast which follows the Fast Iteration philosophy for design.

Elm makes no judgements on how you do CSS, or what framework you use. Some don’t even use CSS.

Route53 vs CloudFlare

In the beginning of the project, I’ll just use CloudFront URL’s for both the UI and API until I’m ready to integrate them. I try to integrate them as soon as possible, even if just using Mocks because I believe All Up Testing is important. As soon as possible, I’ll setup Route53 URL’s to avoid CORS and make it easier to share URL’s with teammates and Product. Eventually, however, we’ll move everything to CloudFlare, and ignore the Route53. This is reasons I’m not smart enough to describe; the tl;dr; is CloudFlare is awesome at routing traffic securely.

AppSync Queries Only

Almost every tutorial on AppSync talks about “only requesting the data you need”. They’ll show a GraphQL query that returns a JSON Object that has 11 fields, and they’ll only fetch 3 of them.

However, when you’re building a BFF for a UI you’re also building… you know the data you need. You’re the one who built the actual GraphQL schema. You’re the one who defined the types coming back from those queries and mutations. So you’ll notice that your AppSync is usually just linking queries and/or mutations to Lambda functions… and that’s it. You don’t really care about “linking the firstName field of a Person type to this DynamoDB over here…” because… that’s not what happens.

Your UI makes a GraphQL call “getPerson” and passes in a string ID… and your Lambda gets that call, calls some microservice/database directly to get the JSON, and massages it to match what the GraphQL says a Person looks like, and returns it. There’s no need for all that selection syntax. Yes, in Elm, you’re still required to write selection syntax… but typically you either automatically succeed with the types you’ve already generated, or you make your own UI type and map to that.

Again, we’re treating GraphQL like it’s REST: making GET’s or POSTS. The only difference is is GraphQL ensures all the types match up and we don’t need to worry about conversion bug nonsense. All that “selecting only a subset of fields” stuff is for when you’re accessing a GraphQL API built by some other team, or it’s a database with a Hasura API and you’re getting only what you need. That’s not what we’re doing here. We’re building a back-end that gives exactly the data our UI needs.

However, it’s AppSync. You’re more than welcome to configure it out however you wish. Just don’t get weirded out when your queries give you data, your UI uses, and people go, “Dude, why are you always getting all the data?” You can respond “Because my UI needs it. I’m a Full-Stack Dev, beeootchhhh!!!”

Elm GraphQL Type Mapping

The elm-graphql library has done a wonderful job keeping the GraphQL selection syntax in a type safe way. However, it’s super common to duplicate the types you’re requesting in Elm so you can modify them later for UI concerns the back-end doesn’t have. Sometimes they’ll never change, and that’s not duplication or a violation of YAGNI… it’s ok.

For example, if you have a type Person in your GraphQL, then generate code in elm-graphql, you’ll probably end up with a package like Api.Object.Person. However, your UI may need to add like `selected : Bool` to it if it’s in a List, or some kind of statefull variable like RemoteData if you’re updating that particular person. So you’ll have your own person defined in Elm:

type alias Person = { id: String, firstName: String }Code language: JavaScript (javascript)

Then you’ll write the mapping code for it… and suddenly get confused and ask yourself “Why am I not just using the Api.Object.Person itself? Why do I need to select data off of it? You CAN do that, for sure, it’s not very well documented. Again, this is the bias of GraphQL being used by front-end developers accessing a mess of data on the back-end. The marketing doesn’t cater to people like us building our own API’s. If you’re familiar with Domain Driven Design, think of Elm being its own bounded context. Yes, it’s got a clear type defined what a Person is on the back-end, but it’s creating it’s own just in case it needs to add UI concern data to its version of Person. It can then map back to a Api.Object.Person if you need to make a mutation call.

Other Tech Stack Comparisons

The following talks about other popular web application tech stacks and why REGAL doesn’t use their tech.

ReScript React

ReScript is quite popular with the React crowd. ReScript’s pattern matching syntax works quite well when determining what state to draw in your React’s JSX. To me, given type definitions are mostly invisible in ReScript, it’s also much more readable than the verbose TypeScript alternatives like Relay.

However, 2 problems here. First, React comes from OOP roots. Their components are class based. This isn’t functional first. Second, React hooks aren’t really pure functions and are super hard to test, have wild foot guns (like useEffect running twice), and result in lots of nested closures. ReScript can’t fix those React specific problems.

Phoenix Framework

Elixir doesn’t have types. Elixir yearns for the Erlang days of being a super fast, live deployment of real code, nowadays on EC2’s or at the very least ECS/K8. I’m not a state and server fan. Elixir also has runtime exceptions. Yes, the “let it crash” philosophy is baked into the UI, but… what if you never had to crash? That tech exists. It’s called Elm. What if you could also deploy that UI and never worry about the server going down “because Jeff Bezos handles it?” (i.e. Amplify)

Svelte/Vue/Angular

All are heavily procedural or Object Oriented, have runtime exceptions, and are hard to test. Additionally, all have framework fatigue on what router and state management solution/library you use. In Elm, you “just use the Elm architecture”.

Future Improvements

A few things, both unknown, and known, coming in the future.

Monorepo Improvements

Having a recommended way of doing monorepos would be nice. While having the GraphQL schema in 2 repos isn’t the worst thing in the world, you do waste time if you forget to copy paste when you make a change. Amplify CLI will have to figure something out if it wants those of us who can’t create God Powered IAM Roles and grant those powers to 3rd party CLI’s. Serverless has made headway here using multi-service deployments. We should be able to change a GraphQL schema with a breaking change months into a project, and leverage the super powerful, fast, and awesome Elm and ReScript compilers to help us… and not have to do a staggered deployment.

Code Deploy Cypress Mystery Failures

CodeDeploy will run your Cypress tests using it’s built-in Docker container, show all tests as passed, then say they failed. This blocks deployments. We have to basically comment out the test phase to get code to QA/Stage/Production. This has been going on for months. Sometimes it’ll last a day, magically remediating itself the next morning.

Changing the R from ReScript to Roc-Lang

I want Elm on the server. Roc-lang delivers. Maybe by this fall, I’ll test out some custom runtime Lambda deployments and see how it goes.

ReScript GraphQL Generator

Having the ability to generate Jzon encoder/decoders with types for GraphQL would hit a massive productivity sweet spot we’re missing on the server. For Elm, it’s unbelievable how trivial updates have become with code gen + compiler powers whereas with ReScript, you have to run manual integration tests ON a deployed Lambda to see if it works or not. Most of the generators I’ve found are generating queries, not actual code, decoders, encoders, and types like elm-graphql does. I’m sloooowwwlly getting smart enough about how shadow types work in ReScript to attempt this, but I’m a bit loathe to do so knowing Roc-lang would possibly be a better investment of my time.

Conclusions

The REGAL tech stack has allowed us to build and deploy 2 full stack web applications in under a year. We’ve had zero reported runtime exceptions for the UI in production, and all unintentional runtime exceptions for ReScript Lambdas have been microservice data changes which Jzon has some pretty good runtime parsing exceptions much more readable than Elm’s, connection issues that weren’t our fault, or JavaScript integration that was… well, both my + JavaScript’s fault… because JavaScript.

Allowing us to quickly iterate, with confidence, has significantly improved my self-esteem, lessened change anxiety, and allowed me to learn more about CICD and testing in a safer way while still delivering business value.

ReScript’s compiler has allowed me to quickly make data changes throughout our BFF as the microservices data model changes or as my understanding improves. Elm’s compiler allows the entire team to make sweeping data and functionality changes, all with the confidence that if something breaks it’s “always the BFF’s fault”. Amplify & AppSync have simplified hosting a web application on AWS, allowing us to focus on code vs. infrastructure. GraphQL has allowed us to model with confidence, and as we learn and change things, we know the compilers have our back to make those changes without worry we’ll break something unintentionally.

The REGAL philosophy of fast iteration, correctness, functional thinking, serverless first, and trunk based CICD has really shaped how I build software now and made things more enjoyable over all.

Remember, you don’t have to adopt all of the REGAL tech stack or philosophies to have something benefit you; just pick 1 to play with and see how it goes.

2 Replies to “The REGAL Architecture”

  1. Amazing article. Thank you.

    I’d love to know more about how Haskell or Rust would affect this stack and why you didn’t chose them.

    1. My issues with Haskell are super small, and shouldn’t dissuade people from choosing Haskell over ReScript:
      1. `no native Lambda support`. This requires a custom runtime for Lambda. Not the worst thing as I’ve tested Haskell for Lambda in the past and like F#, it totally works.
      2. `low-level` Haskell is like Rust in that it’s _mostly_ for lower-level stuff. Like Rust, you can get some terse, fast code if you’re “just” loading JSON and parsing it which one would think would be a great target for Web API’s in Lambda.
      3. `server prejudice` Haskell, like any lower level language, is typically viewed as an EC2 target, meaning, it’s assumed you’ll be writing some kind of server or logic thing that is run in parallel on something like ECS or K8; compile, docker build, test, deploy. The idea of having “many little Haskell” programs hasn’t permeated the tooling setup like it has in JavaScript/Python. Cabal and other tools are for “building your app” not “building 1 or many of your little functions”. I suck at Cabal so maybe there has been progress there when I last tried 2 years ago.
      4. `different runtime`; having “everything be Node.js” simplifies the “whole stack is Node.js”. Yes, Elm and ReScript aren’t JavaScript, but they all run on Node.js, so having the ability to check out a project and go `npm i` is huge. Node.js is so far, the easiest to install on people’s machines with little to no configuration (_maybe_ like 1 entry into your .npmrc) so you avoid the whole “works on my machine” or the horrible experiences of setting up your development environment (a la Python). This pays dividends, too, on integration. AWS SDK and libraries heavily support JavaScript like Metrics and X-Ray support with little to no configuration in code, so I get a lot for free here despite not actually using JavaScript. At my job, we prefer Dynatrace and StatsD to X-Ray, and for that I can swap out using npm’s hot-shots for example. Stuff like that that leverages the community of libraries is just super productive. Same for ReScript’s ability to integrate with these libraries if need be.
      5. `category theory` I thought I’d enjoy learning about Monads, Functors, etc, but I didn’t. While I like Math, I like Engineering more, and a lot of Haskell’s core revolves around CT. I’ve seen a lot of Haskell code that has zero to do with those things, and looks almost super similar to Elm “just for a simple Lambda”, but that’s sadly not the norm. I just want to write code, not suddenly glean my map function isn’t obeying functor laws, or I need a State Monad to emulate how I can just keep passing back tuples via Promise.all in JavaScript which is insanely easier. Maybe someday that’ll change, but the learning curve didn’t seem to have that much of a payoff. Yet?

      Both Haskell and Rust have notoriously slow compilers. Someone once told me the older I get, especially with kids, I’ll learn patience. The opposite has happened; I don’t have patience for slow compilers.

      My issue with Rust is it’s, again, super low-level. I work for a fintech, sure, but JavaScript basically has 1 number type (not including BigInt) whereas Go has 11, and Rust has 4, ReScript has 2 (again, excluding BigInt). This flexibility can make your code super fast, AND super correct. Creating types for Annual percentage rates, for example, is tough when the UI shows one way, but the database stores another, and you need to format back and forth. Much easier in a language that supports all those low-level integer math things, I’m sure. However, it also makes more crap you gotta think about, and having just 2 number types is just simpler. Again, we’re doing simple calls to other microservices, JSON parsing, and a teency bit of if/then things; I’m sure Rust would be faster, but 90%+ of our time is spent in I/O land, not running code.

      If, however, you’re good at Rust, I could totally see it working the same way we have many Go lambdas doing super fast serverless processing things. I just like correctness over speed and that passion hasn’t really slowed my app enough to kill UX yet.

Comments are closed.