
Thoughts on value accrual in infra

Published: at 06:25 PM

It’s fascinating to see where value accrues in the tech infra stack. As far as I can tell, accrual patterns seem to be fluid across time and verticals.

I think we’re seeing an example of this transition of value with the “commoditization of IaaS”, i.e. the quarter-life crisis of cloud. Publicly available data is scarce here, but Corey Quinn at the Duckbill Group estimated that more than 50% of AWS revenue comes from EC2, with a gross margin percentage in the mid-50s. However, there is only so much efficiency and margin clouds can eke out of their VM/instance offerings.

In 2015, it was estimated that 14% of AWS revenue came from higher-margin (I’d wager mid-70s at least) “platform as a service” products, while in 2020, analysts estimated that the platform percentage had risen to 18%. There is an entire ecosystem of products and services (Lambda, Firecracker, Cloud Run etc.) in the PaaS layer of cloud that are driving revenue growth and margins. Sure, it’s all compute consumption in the end but it shows the willingness of users to pay for differentiated, higher-level services, even if they are more expensive. The value is moving up the infra stack.
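To make the mix-shift arithmetic concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified two-bucket model that is not in the estimates above: platform revenue earns a 75% gross margin (my mid-70s wager) and everything else earns 55% (borrowing the mid-50s EC2 figure for all non-platform revenue). The function name and default values are purely illustrative.

```python
def blended_margin(platform_share, platform_margin=0.75, other_margin=0.55):
    """Revenue-weighted gross margin for a two-bucket revenue mix.

    Both margin defaults are assumptions for illustration, not reported figures.
    """
    return platform_share * platform_margin + (1 - platform_share) * other_margin


# Platform share of revenue per the estimates quoted above.
for year, share in [(2015, 0.14), (2020, 0.18)]:
    print(f"{year}: platform share {share:.0%}, blended margin {blended_margin(share):.1%}")
```

Under these assumed margins, the four-point shift in mix lifts the blended margin by a bit under one point.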

In contrast, a recent blog post by a16z called “Who Owns the Generative AI Platform” asserted that, for generative AI, the real winners so far are the infrastructure providers at the bottom of the stack. All AI models have to run on a GPU or TPU somewhere, and by a16z’s estimate, at least 10-20% of total revenue in generative AI today goes to the cloud providers/platforms (as shown below).

I’m not surprised by this: the cost of compute is really the limiting factor for scaling these models. I’ve seen usage patterns from internal compute platforms at Google and damn, training jobs are thirsty. If idle compute exists, it will get used. Diurnal patterns and provisioning for peaks really work to Google’s advantage here, which is why efficiently utilizing available capacity is key in Borg.
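A toy illustration of why diurnal patterns plus peak provisioning leave room for training workloads. Every number here is made up (this is not Google or Borg data): serving demand follows a hypothetical cosine curve over 24 hours, capacity is sized for the daily peak, and whatever sits idle each hour is what batch jobs could soak up.

```python
import math

HOURS = range(24)
BASE, SWING = 60.0, 40.0  # hypothetical units of compute, not real data


def serving_demand(hour):
    """Toy diurnal curve: peaks at 14:00, bottoms out at 02:00."""
    return BASE + SWING * math.cos(2 * math.pi * (hour - 14) / 24)


demand = [serving_demand(h) for h in HOURS]
provisioned = max(demand)                 # capacity sized for the daily peak
idle = [provisioned - d for d in demand]  # headroom batch work could use

# Share of provisioned capacity that serving alone keeps busy over a day.
utilization = sum(demand) / (provisioned * len(demand))
print(f"provisioned: {provisioned:.0f}, serving-only utilization: {utilization:.0%}")
print(f"largest idle window: {max(idle):.0f} units at {idle.index(max(idle)):02d}:00")
```

In this sketch, serving alone keeps only a fraction of the fleet busy; filling the idle hours with training or other batch work is what pushes overall utilization up.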

In the case of generative AI at this moment in 2023, it seems like infrastructure at the bottom is winning. Will this change over time? Perhaps. As I was writing this post, I fortuitously stumbled on a tweet highlighting this chart from a 2018 post by USV called “The Myth of the Infrastructure Phase”.

The premise of the article is that “First, apps inspire infrastructure. Then that infrastructure enables new apps.” This is how AWS was born, and this is how containers and Kubernetes took off. Even with AI, Google created TPUs to accelerate machine learning for better search and ads. Now this infrastructure is being leveraged by generative AI.

With the number of AI apps emerging, I believe there is imminent opportunity for more vertically integrated, plug-and-play infra. This already exists to some degree today at the big three (I’m thinking Amazon SageMaker, Azure ML, Google Vertex AI), but we’re also seeing new contenders like MosaicML emerge (they run multi-cloud plus have their own infra).

In conclusion, though, it feels too early to say how this will all shake out and where the value ends up going. There is a lot of excitement about AI-powered apps and use cases, but as an infra person, I’m closely watching how the underlying compute platforms evolve.

