My current role at Together.ai is my first true AI gig. Over the past five months, I’ve had the opportunity to meet people across teams, startups, and incumbents in AI, and a few things are proving surprisingly consistent across the board. None of these were obvious to me going in.
- The space is chaotic. A new research paper, a new product, or a new acquisition can swiftly change the course of your roadmap or even your company. Reacting quickly, and deciding when not to react, requires more judgment than ever. A bad decision here will make itself known very quickly since the feedback loops in AI are remarkably short.
- Traditional blue-chip companies tend to be much further behind in AI adoption. If you weren’t born AI-native, it is extremely difficult to hire the right talent and, consequently, to build the right AI products. The lack of a playbook forces experimentation across nearly everything in the AI infrastructure stack right now, which means you have to be open to trying things and willing to fail when they don’t work as expected. That level of risk is too high for most teams to take when the status quo is “just fine”.
- If you’re building an AI startup, even if you aren’t training a new model, the amount of compute you need blows up the more successful you are. No two tokens are created equal, and that variability forces you to care about user behavior in a new way: prompt length, retries, streaming, tool calls, and defaults are UX decisions with a direct impact on costs (a rough back-of-the-envelope sketch follows this list). Many teams discover too late that they’ve designed a product users love and cannot afford to run. Additionally, be extremely careful about who you partner with for compute and make sure their incentives are aligned with yours. It only takes one pricing change to destroy your margins.
- I’ve written about this briefly, but evaluating models isn’t enough. There is a wide disparity in performance for the same model across inference providers, for a multitude of reasons. Aggregators like OpenRouter and Vercel AI Gateway are amazing, but they can show uncharacteristically great or uncharacteristically bad latency, throughput, or tool-calling performance depending on a variety of factors, most of which are out of their control (a minimal timing sketch also follows this list). If you are at a company seriously evaluating providers and expect real usage at scale, actually talk to sales so you can get enterprise-level support. Otherwise you’re missing diamonds in the rough on both cost and performance.
- I’m much closer to hardware now than I ever was, and honest to god, GPUs are some of the flakiest hardware I have ever seen. Their ability to fail in new and unique ways never ceases to surprise me. On the plus side, AI natives know this better than most, so the tolerance for unreliability, at least on the infrastructure side, is much higher here than anywhere else I’ve seen. But the reality is that reliability starts to matter more than raw capability faster than people expect. Yes, early excitement is driven by “can it do the thing,” but as the technology matures, the real question becomes “will it do the thing every time, predictably, under load.” I expect this to become a focal point across compute providers in 2026.
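To make the cost point concrete, here’s a rough back-of-the-envelope sketch in Python. Every number in it is invented for illustration (prices, prompt sizes, retry rate, usage); the point is only that prompt length, retries, and tool calls multiply together, and that product defaults end up dominating the bill.

```python
# Rough, back-of-the-envelope cost model per user per month.
# All numbers are made up for illustration -- plug in your own prices,
# prompt sizes, and usage patterns.

PRICE_PER_1M_INPUT_TOKENS = 0.60    # USD, hypothetical
PRICE_PER_1M_OUTPUT_TOKENS = 2.40   # USD, hypothetical

def cost_per_request(input_tokens: int, output_tokens: int, retry_rate: float = 0.0) -> float:
    """Cost of a single request, inflated by the average retry rate."""
    base = (
        input_tokens / 1e6 * PRICE_PER_1M_INPUT_TOKENS
        + output_tokens / 1e6 * PRICE_PER_1M_OUTPUT_TOKENS
    )
    return base * (1 + retry_rate)

# A chat product that stuffs history + retrieved context into every prompt,
# calls a tool on ~30% of turns, and retries ~10% of requests.
turns_per_user_per_day = 20
tool_call_fraction = 0.3

chat_turn = cost_per_request(input_tokens=6_000, output_tokens=500, retry_rate=0.1)
tool_turn = cost_per_request(input_tokens=9_000, output_tokens=800, retry_rate=0.1)

daily = turns_per_user_per_day * (
    (1 - tool_call_fraction) * chat_turn + tool_call_fraction * tool_turn
)
print(f"per user per month: ${daily * 30:.2f}")
```

With these invented numbers it comes out to a few dollars per active user per month; the interesting exercise is re-running it with your own prices and your own telemetry.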
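And on the provider-disparity point, here’s a minimal sketch of what measuring it yourself can look like, assuming providers that expose an OpenAI-compatible chat completions endpoint. The URLs, model name, and environment variables are placeholders, and a serious evaluation would also stream responses to capture time-to-first-token, exercise tool calling, and repeat runs at different times of day.

```python
# Minimal sketch: time the same prompt against the same model on two providers.
# Endpoints, model name, and keys are placeholders -- check your provider's docs
# for the exact route and response fields before trusting these.
import os
import time
import requests

PROVIDERS = {
    "provider_a": ("https://api.provider-a.example/v1/chat/completions",
                   os.environ.get("PROVIDER_A_KEY", "")),
    "provider_b": ("https://api.provider-b.example/v1/chat/completions",
                   os.environ.get("PROVIDER_B_KEY", "")),
}

MODEL = "some-open-weights-model"  # placeholder
PROMPT = "Summarize the plot of Hamlet in three sentences."

for name, (url, key) in PROVIDERS.items():
    start = time.perf_counter()
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {key}"},
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": PROMPT}],
            "max_tokens": 256,
        },
        timeout=120,
    )
    elapsed = time.perf_counter() - start
    usage = resp.json().get("usage", {})
    completion_tokens = usage.get("completion_tokens", 0)
    tokens_per_sec = completion_tokens / elapsed if elapsed > 0 else 0.0
    print(f"{name}: {elapsed:.2f}s end-to-end, ~{tokens_per_sec:.1f} output tokens/s")
```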
What’s become surprisingly clear to me overall is that inference, cost, and reliability are now shaping AI products more than model capability does.