The God Endpoints will continue until morale improves

swyx 2022-09-05

~~GraphCMS~~ Hygraph calls it a Federated Content Platform:

Gatsby calls it a Content Mesh:

Apollo comes right out and calls it a Supergraph:

This isn’t new; Matt Slotnick perceptively called it the hardest working graphic in software:

and the GraphQL devotees have their own data engineering parallels in Airbyte and Fivetran.

Thoughts

The value proposition is evident - IF ONLY we can get all these messy things to conform to one interface, developers will be able to query them and resolve across them faster, improving developer experience, blah blah blah. This problem increases nicely with the size of the company, nobody wants to do it, it’s essential for operational and analytical needs, making it a nice meaty problem for startups to go after. Strategy 101, or rather, Strategy Letter V.

But there’s a lot in that IF:

- Maintenance cost: Interfaces break all the time and run into edge cases/perf issues, costing maybe 1-3 hours a week of maintenance (it’s very spiky, so averages are kind of meaningless).
- Lock-in: People are locking themselves into your SDKs and APIs, and given that none of these God Endpoints have yet stood the test of time (even GitHub has not managed to make GraphQL default/easy), it is a risky proposition.
  - In this sense the data eng companies have it easier, because they integrate data, not code, and data lasts longer.
- Incentives: Why should the data silos want to let you extract their data? Why can’t THEY too create a God Endpoint for their users? Obligatory https://xkcd.com/927/ reference (bonus points if you know what that xkcd is without clicking).

Standards as God Endpoint

Each year hundreds of millions of dollars are thrown into solving this stuff, both in-house and in vendors. It is probably necessary work, it is messy work, and it is unrewarding in the small/only lasts until the next Big Bang Rewrite for the New God Endpoint flavor of the decade.

It feels inelegant though. We are brute forcing this problem by throwing endless bodies and time and money at it but this doesn’t solve it like email and terminal outputs and HTML have been “solved”.

What needs to happen is standards - that data producers and data consumers and all in-between tooling can optimize to, and that increase user trust in betting on these. In terms of recent examples, I am inspired by OpenTelemetry, which, although it did not have a rocketship outcome for Lightstep, seems to be successfully defining a telemetry interface that all producers and consumers are now accepting.

Sometimes standards are designed by committee (in JS, WinterCG is particularly interesting right now, but lacks teeth), sometimes they are decided by an extremely dominant player (JSX, S3, OCI and Postgres are examples from various domains) that has essentially “won”. Of course this raises a question - since it isn’t necessary to establish a common standard before “winning”, is it even helpful to try? My sense is that winning without a standard, compared to winning with one, is like “winning the battle but losing the war”.

Language as God Endpoint

Of course, this being the Age of LLMs, no blogpost would be complete without considering the first interface evolved by humanity: natural language.

In other words, even standards have problems - they require a learning curve, they may have design flaws, and they aren’t flexible (almost by definition - the more flexible a standard, the less useful/reliable it is). Standards optimize for machine communication, but Languages optimize for human communication.

What does this look like, potentially?

Instead of SELECT COUNT(*) FROM users INNER JOIN charges ON users.id = charges.user_id WHERE users.email = 'joe@freshpizza.com'
we might write/speak/think: how many payments has Joe from freshpizza.com made?
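
To make the SQL side of that comparison concrete, here is a minimal sketch that runs the query above against an in-memory SQLite database. The table layout, the sample rows, and the `count_payments` helper are assumptions made up for illustration; the post does not specify a schema.

```python
import sqlite3

# Hypothetical schema and sample data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE charges (id INTEGER PRIMARY KEY, user_id INTEGER, amount_cents INTEGER);
    INSERT INTO users VALUES (1, 'joe@freshpizza.com'), (2, 'ann@example.com');
    INSERT INTO charges VALUES (1, 1, 1200), (2, 1, 800), (3, 2, 500);
""")


def count_payments(conn, email):
    """Answer 'how many payments has <email> made?' with the SQL from the post."""
    row = conn.execute(
        "SELECT COUNT(*) FROM users "
        "INNER JOIN charges ON users.id = charges.user_id "
        "WHERE users.email = ?",  # parameterized instead of inlining the email
        (email,),
    ).fetchone()
    return row[0]


joe_payments = count_payments(conn, "joe@freshpizza.com")
```

A natural-language front end would still have to land on something like `count_payments` in the end; the question is only who writes the query.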

If you are implementing a system like this, please also implement partial information resolution:

Q: how many payments were made?
A: Insufficient information - how many payments by whom? over what time period?
Q: oh sorry - by 'joe@freshpizza.com'
A: ok, looking for the count of payments made by 'joe@freshpizza.com'… over what time period?
Q: oh sorry again - over the last 3 months!

That kind of thing, but for API tokens, logins, and other missing info (I covered some of this in my 2019 Adaptive Intent-based CLI State Machines talk, informed by experience from my 2016 Alexa skill). Because this is a human-in-the-loop process, you’ll want a Temporal.
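
The clarification loop in the Q/A exchange above can be sketched as simple slot filling: keep a set of required fields, and keep asking until none are missing. This is only an illustrative sketch; the slot names, prompts, and the `PaymentCountQuery` class are hypothetical, and a real system would also need the natural-language parsing and durable state that this leaves out.

```python
from dataclasses import dataclass, field

# Hypothetical required slots for a "count payments" intent, each mapped to
# the clarifying question to ask while it is still missing.
REQUIRED_SLOTS = {
    "payer": "how many payments by whom?",
    "period": "over what time period?",
}


@dataclass
class PaymentCountQuery:
    slots: dict = field(default_factory=dict)

    def missing(self):
        """Names of required slots the user has not supplied yet."""
        return [name for name in REQUIRED_SLOTS if name not in self.slots]

    def provide(self, **answers):
        """Record newly supplied slots, then report what is still needed."""
        self.slots.update(answers)
        return self.respond()

    def respond(self):
        """Ask for whatever is missing, or confirm the fully resolved query."""
        missing = self.missing()
        if missing:
            questions = " ".join(REQUIRED_SLOTS[name] for name in missing)
            return f"Insufficient information - {questions}"
        return f"Counting payments by {self.slots['payer']} over {self.slots['period']}"


query = PaymentCountQuery()
first = query.respond()
second = query.provide(payer="joe@freshpizza.com")
third = query.provide(period="the last 3 months")
```

The same shape extends to API tokens or logins: add them as required slots, and the loop asks for them the same way.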

Isn’t this just chatbots? I’ve now been in tech long enough to remember the previous time conversational commerce was hyped and went nowhere (arguably arguable), and AI assistants were going to take over the world. So yes, there’s a risk that we go through a whole bunch of ceremony just to reinvent Clippy 2022 Edition. But the volume of data on both ends, and the new use cases unlocked by better natural language understanding, perhaps make it worth another shot.

Yes, this is a 1000x more expensive query to parse, but that cost could go down over time. For now it could be more of a last-mile thing for humans, but you could imagine a distant future where LLMs are cheap enough that systems talk to other systems in this same way - forever solving the problem of API or standards breaking.

Webmentions

Roy Derks mentioned this on 2022-09-06
God endpoints, great term. I can see how much of this applies to companies, but given the pros and cons it remains a choice. A million different incoherent endpoints isn’t better. For federated GraphQL APIs you see companies having endpoints on multiple levels and optimizing each
Sam Bhagwat mentioned this on 2022-09-06
Valhalla is coming
swyx 🌴 mentioned this on 2022-09-06
hypothesis - content wants to be vertically integrated. hence wordpress domination. no nontechnical writer wants to futz with hundreds of lines of config and code and needs are more homogeneous than those other domains
Sam Bhagwat mentioned this on 2022-09-06
yes!!! funny enough the content web is actually lagging behind other eng fields here: - BigQuery, Snowflake and the data warehouse - Segment, Rudderstack and CDPs - Zapier/Mulesoft and service integrations
Sam Bhagwat mentioned this on 2022-09-06
yeah definitely segmented by site size/budget -- sites maintained by one freelancer devs or non-devs will likely stick w/ vertical integration for next 5ish yrs -- sites maintained by a team of devs + content editors are gravitating towards modular/composable/Jamstack