Type Driven Development: Parse, Don’t Validate

Part 14 – Parse, Don’t Validate

This is a series of posts I’m writing about using types as another tool in software development, Continuous Delivery, & keeping LLMs honest. Types are also a design & refactoring tool, a communication tool, and they reduce how many tests you have to write.

Parse, Don’t Validate is a best-practice pattern for avoiding Shotgun Parsing. Shotgun Parsing is what happens when data in your domain cannot be trusted, so you validate it just in case before using it, & code throughout your codebase also does its own ad-hoc validation:

if(!person || validEmail(person.email) === false) {
  throw new Error('person is either undefined or email is invalid')
}

// safely use email here

Other code just checks if the accountType is what it needs:

if(typeof person.accountType === 'string'
  && person.accountType !== 'primary' ) {
    throw new Error(...)
}

Sometimes the validations are the same, hence validEmail being a re-usable function. Sometimes they’re unique, e.g. the ad-hoc check of accountType, which assumes person is not null. Was this intentional? Was there an assumed order of operations, where earlier code had already validated that person wasn’t null, creating coupling? Most of these validations just check the part of the data their code cares about, ignoring the other parts. This spreads into multiple functions/methods & files.

This results in 2 new problems. First, the program can crash at any time because some of the data may be invalid, but only for a particular piece of code. Second, when it fails, you don’t know if any of the other data is ok, even for code that may be 2 lines below; the error only talks about the data that code cared about. A huge percentage of ALL code, domain + infra, new code & old, has this data validation.

Solution? You parse the data you’re getting into the data you need.

type ParsePerson = (data:unknown) =>
  Result<Person, Error | ZodError>

It’s easiest to do this using schemas, & we use unknown instead of any so the compiler forces us to narrow the type + convert to safe values.
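A schema library like Zod does most of this work for you, but a hand-rolled sketch shows what the parse step amounts to. Everything here is an assumption for illustration: the Person fields, the shape of Result, the looksLikeEmail helper, and the choice to return a list of strings as the error instead of Error | ZodError.

```typescript
// Hypothetical domain type: every field is required and already checked.
type Person = {
  name: string
  email: string
  accountType: 'primary' | 'secondary'
}

// Minimal Result type, standing in for whichever one your codebase uses.
type Result<T, E> =
  | { ok: true; value: T }
  | { ok: false; error: E }

// Deliberately simple email check, for illustration only.
const looksLikeEmail = (value: string): boolean =>
  /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)

const isRecord = (data: unknown): data is Record<string, unknown> =>
  typeof data === 'object' && data !== null

// unknown in, Person out. Every problem is collected, not just the first one.
const parsePerson = (data: unknown): Result<Person, string[]> => {
  if (!isRecord(data)) {
    return { ok: false, error: ['person must be an object'] }
  }
  const { name, email, accountType } = data
  const errors: string[] = []

  const parsedName =
    typeof name === 'string' && name.length > 0 ? name : undefined
  if (parsedName === undefined) errors.push('name must be a non-empty string')

  const parsedEmail =
    typeof email === 'string' && looksLikeEmail(email) ? email : undefined
  if (parsedEmail === undefined) errors.push('email is invalid')

  const parsedType =
    typeof accountType === 'string' &&
    (accountType === 'primary' || accountType === 'secondary')
      ? accountType
      : undefined
  if (parsedType === undefined) {
    errors.push("accountType must be 'primary' or 'secondary'")
  }

  if (
    parsedName === undefined ||
    parsedEmail === undefined ||
    parsedType === undefined
  ) {
    return { ok: false, error: errors }
  }
  return {
    ok: true,
    value: { name: parsedName, email: parsedEmail, accountType: parsedType },
  }
}
```

Call parsePerson once where the JSON comes in; everything after that takes a Person and never looks at unknown again.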

This solves a bunch of problems:

no more Shotgun Parsing; all code now has valid data once successfully parsed
no code needs to “check if the data is valid”; if it is using a Person type, it’s valid
your program can no longer crash at some arbitrary point because of invalid data; it fails once, at the parse step
validation happens through parsing, done in 1 place, not ad-hoc all over the code base
defensive code no longer needs to be written nor maintained
when it fails, you get details about what parsed correctly & what didn’t (e.g. Zod / schema parsing errors)
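One caveat with “if it is using a Person type, it’s valid”: TypeScript is structurally typed, so a plain string field can still be filled in by hand without going through the parser. A branded type is one way to close that gap. This is a sketch of that technique, not something the schema approach requires, and the Email, parseEmail, and emailDomain names are made up for the example.

```typescript
// The brand exists only at compile time; at runtime an Email is a string.
type Email = string & { readonly __brand: 'Email' }

// The only place that is allowed to turn a string into an Email.
const parseEmail = (value: unknown): Email | undefined =>
  typeof value === 'string' && /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)
    ? (value as Email)
    : undefined

// Downstream code asks for Email, so it needs no defensive check.
const emailDomain = (email: Email): string =>
  email.slice(email.indexOf('@') + 1)

const parsed = parseEmail('ada@example.com')
if (parsed !== undefined) {
  console.log(emailDomain(parsed)) // prints "example.com"
}

// emailDomain('ada@example.com') // compile error: string is not an Email
```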

Design the types you want, then convert unknown data into those types. Optional values driving you crazy? Make them required. Data not ready? Make it a Discriminated Union, and only access it when it’s ready. Gone are the days of “Can I trust this input?” all over. Ask that in 1 place, at the boundary of unknown -> YourType. Anytime you start wanting to validate data, think about whether the JSON/file/database parsing part of your code could do more to make the data valid by the time it arrives in your code.
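Here is a rough sketch of the “data not ready” case as a Discriminated Union. The RemoteProfile states and the greeting function are hypothetical; the point is only that the profile field does not exist until the status says it does, so there is nothing to null-check.

```typescript
type Profile = { displayName: string }

// Hypothetical states for data that may not have arrived yet.
type RemoteProfile =
  | { status: 'loading' }
  | { status: 'failed'; reason: string }
  | { status: 'ready'; profile: Profile }

const greeting = (remote: RemoteProfile): string => {
  switch (remote.status) {
    case 'loading':
      return 'Loading...'
    case 'failed':
      return `Could not load profile: ${remote.reason}`
    case 'ready':
      // profile is only reachable on this branch; the compiler enforces it.
      return `Hello, ${remote.profile.displayName}`
  }
}
```

Trying to read remote.profile in the loading or failed branch is a compile error, which is the “only access it when it’s ready” part.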
