Uber is deepening its reliance on Amazon’s in‑house chips in a move that underscores both companies’ ambitions in the fast‑evolving AI infrastructure race. The ride‑hailing giant has signed an expanded multiyear agreement to run more of its core AI workloads on Amazon Web Services’ custom silicon, including Graviton general‑purpose processors and the latest Trainium3 AI chips.
The new deal builds on Uber’s existing cloud partnership with Amazon, but marks a decisive shift in how the company plans to power the machine learning models behind its ride‑hailing, delivery and logistics platforms.
Uber will migrate more of its real‑time services, such as matching riders and drivers, calculating arrival times and optimising delivery routes, onto AWS Graviton4, Amazon’s newest ARM‑based server chip designed for high‑performance, low‑cost computing at scale.
At the same time, Uber will begin piloting Amazon’s Trainium3 AI chips to train and run large‑scale models used for demand forecasting, surge pricing, driver incentives, fraud detection and personalised recommendations.
The shift signals Uber’s intention to reduce its dependence on traditional GPU‑based infrastructure, particularly Nvidia hardware, while also cutting the cost of running increasingly complex AI systems.
Uber executives framed the announcement as a strategic decision to keep the company’s AI capabilities ahead of rapidly growing demand across rides, deliveries and new services.
An Uber technology leader said the company is “leaning into AWS’s custom silicon to power some of our most critical real‑time services,” adding that Graviton and Trainium3 will help Uber “run larger and more sophisticated AI models, while improving price‑performance for every trip and delivery we power.”
The executive noted that Uber’s systems handle millions of trips and orders every day, making efficiency gains at the infrastructure level particularly impactful.
“Every millisecond matters when you’re matching riders to drivers or couriers to restaurants,” the person said, describing the new hardware as “a key enabler for the next generation of AI‑driven experiences on Uber.”
For Amazon, the announcement is a high‑profile validation of its years‑long bet on building its own data‑centre chips rather than relying solely on Nvidia and other third‑party semiconductor suppliers. AWS has steadily rolled out three main families of custom silicon: Graviton for general compute, Inferentia for AI inference and Trainium for AI training, pitching them as more efficient and cost‑effective alternatives for enterprise customers.
Dave Brown, vice president of Compute and Machine Learning at AWS, has argued that a broader range of chips will ultimately benefit customers as AI workloads diversify.
“The variety of chips in the AI sector is beneficial,” he said recently, emphasising that customers are looking for “better price‑performance and more control over their AI infrastructure” as model sizes and usage continue to grow.
Uber’s decision to adopt Trainium3 and expand its use of Graviton gives Amazon a marquee reference customer for its in‑house silicon, particularly in latency‑sensitive, consumer‑facing applications where failures or slowdowns can be immediately felt by users.
Graviton4 will underpin a wider slice of Uber’s real‑time backend, including systems responsible for trip assignment, route computation and marketplace balancing between riders, drivers, restaurants and couriers.
These workloads require large volumes of compute but must respond in fractions of a second; by running them on Graviton, Uber aims to handle more requests while holding down infrastructure costs.
Trainium3 is designed for the more compute‑intensive tasks of training and fine‑tuning models that power features such as demand prediction, surge pricing algorithms and estimated time of arrival (ETA) calculations.
Because Uber draws on data from billions of historical trips and orders, these models must be retrained continually to account for changing traffic patterns, consumer behaviour and economic conditions.
Amazon says its latest AI chips can deliver substantially better performance per watt than general‑purpose processors and even some existing accelerators, allowing customers to train and run large language models and other complex systems more cheaply.
Its Inferentia2‑based Inf2 instances, for example, are marketed as delivering up to four times higher throughput and up to ten times lower latency for AI inference than the first‑generation Inf1 instances, a pattern AWS is extending with Trainium3 for training workloads.
The expanded partnership also has implications beyond Uber’s own apps, feeding into the broader competition between AWS, Google Cloud and Microsoft Azure for AI business. Uber has previously worked with other major cloud providers as part of a strategy to move away from its own data centres, making it a closely watched customer in the industry.
By convincing Uber to scale up on its custom chips, Amazon is signalling that proprietary silicon can be a key differentiator in cloud deals, especially with customers running large, always‑on AI workloads.
Industry observers note that as AI spending becomes one of the biggest cost centres for tech platforms, cloud providers are racing to offer hardware that can deliver meaningful savings without sacrificing performance.
At the same time, Amazon has said it will continue to work closely with Nvidia, planning future AI servers that combine its own chips with Nvidia technologies such as NVLink Fusion connectivity.
That hybrid approach reflects how demand for AI compute remains strong enough for multiple chipmakers to grow, even as cloud companies push their own designs in search of better economics and more control.
For Uber users, the shift to Amazon’s AI chips will be largely invisible, but the company suggests it could translate into more reliable ETAs, smarter pricing and better‑matched trips over time. Faster and more efficient models could help reduce wait times during peak hours, improve route choices in congested cities and make incentives more targeted for drivers and couriers.
Uber’s technology team says the goal is to make the platform feel more responsive and personalised as the company expands into new services and geographies.
“As we scale, we need AI infrastructure that scales with us,” the Uber executive said, adding that AWS’s custom silicon is expected to “play a central role in how we deliver better experiences to riders, earners and merchants around the world.”
With the latest deal, Uber becomes one of the most prominent companies to publicly back Amazon’s AI hardware at scale, joining a growing group of enterprises experimenting with alternatives to Nvidia‑centric setups.
How well Amazon’s chips perform under Uber’s demanding, real‑time workloads will be closely watched across the tech industry as businesses weigh their next big infrastructure bets in the AI era.