Preemptive Pluralization is (Probably) Not Evil
What if we just assumed we might have two of everything? #reflections #programming #advice
Read time: 5 minutes Published:
Before you write any code — ask if you could ever possibly want multiple kinds of the thing you are coding. If yes, just do it. Now, not later.
A few common examples to illustrate:
- You assume that one team has many users, and correspondingly, one user belongs to one team.
- Eventually, you find that a user may need multiple teams.
- This is actually fantastic for your business!
- But you are depressed because you now have to spend 2 months refactoring every line of code and database schema that assumed the one-to-one mapping 😱 More examples here.
- You assume that loading state only has two states — true / false — so you make a boolean
- Then you realize you need to track error state, so you do, with `isError`. You do some work to make sure the 4 combinations of states behave intuitively. You write tests for each of them like the good developer you are. Of course.
- Then you realize you need an `isCanceled` state. 8 and counting...
- Eventually you realize every addition or modification takes exponentially longer to account for each edge case. It's hard to even tell if you've covered them all. Most don't.
- The solution is explicit state machines - but at this point you're too far in to justify a rewrite.
- Internationalization. If you winced at that, you know the pain.
- Pagination. To quote Simon Willison, co-creator of Django, "Refactoring an existing non-paginated API to support pagination will break everything. Better to fake pagination but only ever return a single page, just in case".
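To make the boolean example above concrete, here is a minimal sketch of the "explicit state machine" alternative, written as a TypeScript discriminated union. All names (`LoadState`, `describeLoadState`, the individual status strings) are hypothetical and chosen for illustration. Three independent booleans allow 2 × 2 × 2 = 8 combinations, most of them meaningless; the union below lists only the states that can actually occur.

```typescript
// Hypothetical names throughout; a sketch, not a full state machine library.
type LoadState<T> =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; data: T }
  | { status: "error"; message: string }
  | { status: "canceled" };

// Adding a state means adding one union member. The `never` check in the
// default branch makes the compiler flag any switch that forgot to handle it.
function describeLoadState<T>(state: LoadState<T>): string {
  switch (state.status) {
    case "idle":
      return "Not started";
    case "loading":
      return "Loading...";
    case "success":
      return `Loaded: ${JSON.stringify(state.data)}`;
    case "error":
      return `Failed: ${state.message}`;
    case "canceled":
      return "Canceled";
    default: {
      const unreachable: never = state;
      return unreachable;
    }
  }
}
```

The point is less the specific shape and more that "which state am I in" is one plural-friendly field from the start, so a sixth state is an addition and not a rewrite.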
You can listen to Ben Orenstein of Tuple discuss this on my mixtape:
I've done this refactoring a million times. I'll be like, I thought there would only ever be one subscription team, user plan, name, address, and it always ends up being like, "Oh, actually there's more." I almost never go the other way. What if you just paid the upfront cost of thinking "This is just always a collection"?
Donald Knuth is famous for noting that Premature Optimization is the Root of All Evil (there's some nuance to that, btw). I am very sympathetic to the appeal to simplicity — if "You Ain't Gonna Need It", then don't use it. But I think Preemptive Pluralization — projecting forward into hypothetical situations when you need N types of a thing — is exempt, even though you are literally optimizing for a future you don't currently live in.
It is a LOT easier to scale code from a cardinality of 2 to 3 than it is to refactor from a cardinality of 1 to 2. This is a fundamentally under-appreciated nonlinearity. In other words, Preemptive Pluralization can make the difference between "sure, I'll add that today" and "this is going to take us 2 months and we'll introduce merge conflicts with every other in-progress feature."
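Here is a rough sketch of that nonlinearity using the user/team example from earlier. The types and helpers (`Team`, `PluralUser`, `teamNames`, `canAccess`) are hypothetical. The assumption is that every caller goes through the collection, even while it only ever holds one team.

```typescript
// Hypothetical domain types for illustration.
interface Team {
  id: string;
  name: string;
}

// The singular version would be `{ id: string; team: Team }`, and every
// caller would read `user.team` directly. Here it is a collection from day one.
interface PluralUser {
  id: string;
  teams: Team[];
}

function teamNames(user: PluralUser): string[] {
  return user.teams.map((team) => team.name);
}

function canAccess(user: PluralUser, teamId: string): boolean {
  return user.teams.some((team) => team.id === teamId);
}

// Cardinality 1 today...
const solo: PluralUser = { id: "u1", teams: [{ id: "t1", name: "Core" }] };
// ...and cardinality 2 later, with no change to the functions above.
const multi: PluralUser = {
  id: "u2",
  teams: [...solo.teams, { id: "t2", name: "Growth" }],
};

console.log(teamNames(solo), teamNames(multi));
```

Going from one team to two here is a data change. With a `user.team` field it would be a change to every function that touches a user.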
Requirements volatility is a core problem of software engineering. As a software engineer, writing code that does what you ask of it today is the bare minimum. Your real skill comes in what happens next — what you do when requirements inevitably change, whether by new feature request or scaling issues arising from I/O or compute bounds.
It may not be enough to write code for what you foresee in the near term — those are just more requirements. Software design and architecture is all about making it easy to respond to unforeseen changes.
Hillel Wayne has proposed calling these requirement perturbations. If a small, typical feature request can throw your whole design out of whack, then you have fragile code. Clearly you want the opposite of fragile — I am tempted to call it "Antifragile" because that gets clicks — but really the best you can hope for is code that mostly doesn't fall apart due to 1-2 standard deviation changes in requirements. In other words: robust code. Robust code is optimized for change (more in a future blogpost).
The nonlinearity in how expensive it is to make a change comes from the "emergent sclerosis" of code. Code that is robust to future changes is far cheaper to write today than later, when it shows up as delayed technical debt that you must pay off before you can even start on a feature request. Fragile code is like the payday loan lender of technical debt.
Preemptive Pluralization creates Robust Code.
I'm so committed to not prematurely optimizing that I want to make a final pitch for why Preemptive Pluralization is not premature. Let's address two obvious criticisms of Preemptive Pluralization:
- Increased code complexity: Functional languages and other abstractions can help make array or matrix operations almost as easy to work with as regular operations.
- Slow performance from doing extra loops: Loops only cost significantly when N is large. By definition, if you are pluralizing prematurely, N = 1.
- Other concerns raised by readers: Perf bottlenecks from excessive joins, Code Communication
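As a sketch of how small the first two costs can be, here is the "fake pagination" idea from the Simon Willison quote above. The response shape (`PageOf`, `nextCursor`) and the functions `listProjects` and `fetchAll` are assumptions made up for this example, not any real API. The server only ever returns one page, so the client loop runs exactly once.

```typescript
// Hypothetical response shape; the field names are assumptions.
interface PageOf<T> {
  items: T[];
  nextCursor: string | null; // null means there are no more pages
}

// Today's "fake" implementation: everything fits in a single page.
// The cursor parameter is accepted for forward compatibility and ignored.
function listProjects(_cursor: string | null = null): PageOf<string> {
  return { items: ["alpha", "beta"], nextCursor: null };
}

// Clients are written against the paginated contract from the start.
function fetchAll<T>(fetchPage: (cursor: string | null) => PageOf<T>): T[] {
  const all: T[] = [];
  let cursor: string | null = null;
  do {
    const page: PageOf<T> = fetchPage(cursor);
    all.push(...page.items);
    cursor = page.nextCursor;
  } while (cursor !== null);
  return all;
}

console.log(fetchAll(listProjects));
```

When real pagination arrives, only `listProjects` changes. Callers of `fetchAll` already handle N pages, and in the meantime they pay for one extra loop iteration check.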
Ultimately I think what makes something premature or not is your definition of what you need to write. If you view "code that works today" as your job, preemptive pluralization is premature. If you view "code that doesn't blow up in my face a year from now" as your job, then it is not.
Make Robust Code a design requirement from the start.
More awkward things to pluralize:
- Single tenant open source -> Multi tenant hosted service
- Versions -> from no version to v1/v2, or going from "legacy"/"new" to "new new" (hence Stripe just uses dates)
- Number of independently shipping frontends in your company (hence module federation)
- Number of clouds in your company (you think you will avoid this... until you can't, per the Hashimoto lemma)
More from @nivertech on Twitter:
- single node→cluster
- single language→i18n
- no pagination→pagination
- single user→multi-user
- single tenant→multi-tenant
- single branding→white-labeled
- hard-coded configs→separate config from code
- hard-coded features→feature flags
- no 3rd party API→API-first
- local desktop SW→Client/Server
- Timestamps! You might as well turn your boolean into a timestamp: `deleted_at`. But also `created_by`, and more.
- UUID? Version? - don't use the UUID as a PK.
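A minimal sketch of the boolean-to-timestamp idea, assuming a hypothetical `NoteRecord` shape and helper names. The field names follow the `deleted_at` / `created_by` convention mentioned above. Everything else is made up for illustration.

```typescript
// Hypothetical record shape for illustration.
interface NoteRecord {
  id: string;
  created_at: Date;
  created_by: string;
  deleted_at: Date | null; // null means "not deleted"
}

// The boolean is still available, but derived instead of stored.
function isDeleted(record: NoteRecord): boolean {
  return record.deleted_at !== null;
}

// Returns a copy marked as deleted; the original record is left untouched.
function softDelete(record: NoteRecord, now: Date = new Date()): NoteRecord {
  return { ...record, deleted_at: now };
}
```

The timestamp answers "is it deleted?" and "when?" with one field, so a later request for an audit trail or an undo window does not need a schema change.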
- Thanks to Jon Wong for reviewing a draft of this post and contributing the boolean and internationalization examples
- Endorsements & Horror stories from Simon Willison, Ben Orenstein, Andrew Culver, Andrew Ingram, Tanner Linsley, Ryan Murphy, David Welch and Khrome!
- Daniel Buckmaster compares this idea to Sandi Metz's POODR.
- Martin Gronlund compares this to data oriented design.
- Jared Palmer and Daniel Yokomizo mention the Zero one infinity rule - though this article argues FOR preemptively favoring "infinity" over "one"
Disclosures: This is an idea I have been mulling for a while, but have not practiced at scale. Written in a couple hours after a prompt from a podcast.