When it comes to companies rolling their own custom chips, our core thesis is that doing this to save a few dollars on chips is breakeven at best. Instead, companies want to build their own chips when it conveys some form of strategic advantage.
The textbook example is Apple, which ties its chips to its own software to meaningfully differentiate its phones and computers. Or Google, which customizes chips for its most intense workloads, like search algorithms and video encoding. A few hundred million dollars in chip design costs is more than paid back by billions in extra sales for Apple, or billions in capital and operating expense savings for Google. It is important to point out that in both cases the company completely controls what software runs on its homegrown chips.
Editor's Note:
Guest author Jonathan Goldberg is the founder of D2D Advisory, a multi-functional consulting firm. Jonathan has developed growth strategies and alliances for companies in the mobile, networking, gaming, and software industries.
So what is in it for Amazon?
For Amazon, and more specifically for AWS, software control is beyond them. AWS runs everyone else's software, and so by definition, AWS cannot control it. They have to run almost literally every form of software in the world. Nonetheless, AWS seems to be working very hard to push their customers to run workloads on their Graviton CPUs. AWS has many ways to lock customers in, but silicon is not one of them. At least not yet.
AWS is probably not doing this to save money on the AMD and Intel x86 CPUs they are buying. Simply having two vendors gives them ample pricing leverage. To some degree, Graviton may be a hedge against the day when Intel stops being competitive in x86. (A point we may have already reached.)
That being said, we think there is a bigger reason - power. The chief constraint in data center construction today is electricity. Data centers use a lot of power, and when designing new ones, companies have to work around a power budget. Now imagine they could reduce power consumption by 20%. That means they could add more equipment in the same electricity footprint, which means more revenue. A reduction in power consumption by one part of the system means a much higher return on the overall investment. Then multiply that gain by 38 as the savings percolate through all of AWS' global data centers.
Now of course the math is a bit more complicated than that. CPUs are only part of a system, so even if Graviton is 20% more power efficient for the same performance versus an x86 chip, that does not really translate into 20% more profit from the data center, but the scale is about right. Switching to an internally designed Arm CPU can generate a sufficient increase in data center capacity to more than offset the cost of designing the chip.
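To make that caveat concrete, here is a minimal back-of-the-envelope sketch of how a CPU power saving might translate into extra capacity under a fixed power budget. The function name `capacity_gain` and the CPU shares of server power used in the loop are our own illustrative assumptions, not AWS figures. Only the 20% efficiency number comes from the discussion above.

```python
def capacity_gain(cpu_power_share: float, cpu_power_reduction: float) -> float:
    """Fractional increase in servers that fit in a fixed power budget.

    cpu_power_share: fraction of a server's power drawn by the CPU
        (hypothetical input, not a measured AWS number).
    cpu_power_reduction: fractional cut in CPU power at equal performance
        (0.20 for the 20% case discussed in the text).
    """
    if not 0.0 <= cpu_power_share <= 1.0:
        raise ValueError("cpu_power_share must be between 0 and 1")
    if not 0.0 <= cpu_power_reduction < 1.0:
        raise ValueError("cpu_power_reduction must be in [0, 1)")

    # Power per server after the switch, relative to the old server (= 1.0).
    new_server_power = 1.0 - cpu_power_share * cpu_power_reduction

    # With the same total budget, the server count scales by 1 / new power.
    return 1.0 / new_server_power - 1.0


if __name__ == "__main__":
    # Illustrative CPU shares of total server power; assumptions only.
    for share in (0.3, 0.5, 1.0):
        gain = capacity_gain(share, 0.20)
        print(f"CPU share {share:.0%}: {gain:.1%} more servers")
```

Under these assumptions, the gain shrinks as the CPU accounts for less of each server's draw. In the limiting case where the CPU is the entire draw, a 20% cut leaves room for 25% more servers, because the same budget is divided by 0.8.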
Taking this a step further, one big obstacle that prevents more companies from moving to Arm workloads is the cost of optimizing their software for a new instruction set. We have touched on this topic before: porting software can be labor intensive. AWS has a big incentive to get their customers to switch, and seems to be doing what they can to make this process easier. However, we have to wonder if this is something of a one-way street.
Once customers make the switch to Graviton, that just shifts the friction. As we said above, today AWS cannot use x86 silicon to lock their customers into their service, but once customers switch to Graviton, all that optimization friction shifts to work in AWS' favor, creating a new form of lock-in. Admittedly, the barrier today exists between Arm and x86, not among the various versions of Arm servers. But one of the beauties of working with Arm is the ability to semi-customize a chip, and so it is entirely possible that AWS may introduce proprietary-ish features in future versions of Graviton.
We think Amazon has many other good reasons to encourage the move to their Arm-based Graviton CPU, but we have to wonder if this lock-in is not lingering somewhere in the back of their brains. If true, that just gives the other hyperscalers more reasons to shift to Arm servers as well.