GitHub's AI Surge Pushes Microsoft Onto AWS
An agentic-coding load spike outran GitHub's Azure migration, forcing Microsoft to lean on its biggest cloud rival for headroom.
There is a particular flavor of irony in Microsoft routing part of GitHub through Amazon Web Services. Microsoft owns a hyperscale cloud. It spends like one. And yet, according to Business Insider — citing two people familiar with the plans — it is adding AWS capacity to keep GitHub upright after an AI-driven surge in coding activity strained the platform faster than its own migration could absorb.
Microsoft did not confirm AWS by name. A spokesperson told the outlet that "the incredible spike in agentic development" since late 2025 had tested GitHub's infrastructure limits, and that the company is accelerating its move to Azure while exploring a multi-cloud strategy for elasticity and scale. Amazon, per the same report, declined to comment on individual clients. Treat the specific AWS arrangement as single-sourced for now — but the broader capacity crunch is corroborated by GitHub's own public reliability posts, and those are the more interesting documents.
The load curve outran the plan
When Microsoft bought GitHub for $7.5 billion in June 2018, the pitch was developer trust: keep the platform open, let people deploy to any OS, any cloud, any device, and gradually fold its infrastructure toward Azure. Business Insider reports the full Azure migration was targeted for 2027. The problem isn't the destination — it's that the traffic stopped waiting.
The numbers GitHub's own leadership has put on record are the tell. COO Kyle Daigle wrote in April that commits were on pace to hit 14 billion in 2026, up from 1 billion in 2025 — a roughly 14x jump in a single year. Commits aren't revenue and they aren't a clean proxy for useful output, but they are a direct stress signal for a system that has to store code, run checks, process pull requests, reindex search, fire automations and notify collaborators on every one.
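For a rough sense of scale, the two commit figures can be turned into average rates. This is back-of-the-envelope arithmetic on the numbers quoted above, not a GitHub metric: it assumes commits are spread evenly across the year, which real traffic is not, and the function and constant names are illustrative.

```python
# Back-of-the-envelope arithmetic on the commit figures quoted in the article.
# The per-second rate is an average only; it assumes an even spread of commits
# across the year, so real peaks would be higher than these numbers.

COMMITS_2025 = 1_000_000_000      # total cited for 2025
COMMITS_2026 = 14_000_000_000     # pace cited for 2026
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def growth_factor(previous: int, current: int) -> float:
    """Ratio of the current figure to the previous one."""
    return current / previous


def average_per_second(total: int) -> float:
    """Average events per second if the total were spread evenly over a year."""
    return total / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"growth: {growth_factor(COMMITS_2025, COMMITS_2026):.0f}x")
    print(f"2025 average: {average_per_second(COMMITS_2025):.0f} commits/s")
    print(f"2026 average: {average_per_second(COMMITS_2026):.0f} commits/s")
```

On those assumptions the average moves from about 32 commits per second to about 444, and each of those commits can fan out into checks, indexing and notifications.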
GitHub CTO Vlad Fedorov spelled out the engineering response in an April availability update: a plan kicked off in October 2025 to grow capacity 10x, and by February 2026 the target had been revised to 30x. Fedorov tied the escalation to agentic development workflows that accelerated sharply in the back half of December 2025, which lines up with the broader shift from human-paced commits to agents that open PRs, run CI and iterate in loops a person never could.
Migrating a moving target
The May availability report shows the migration is genuinely underway, not aspirational. By that point:
- 40% of monolith traffic was being served from Azure, up from 8% in February
- Git traffic was at 30%
- Repository replication had reached 99%
- Effective capacity had more than doubled in four months
That's real movement. It also explains why a second cloud suddenly looks attractive: GitHub is trying to shard, replicate and harden a mature collaboration platform and re-point its traffic, all while a new class of machine-generated load slams into old shared systems. You cannot freeze a platform of GitHub's size to renovate it, so every change lands on a system already running hot.
The failure modes are exactly what you'd expect under those conditions. The same May report disclosed nine incidents that degraded service, including a May 4 disruption caused by a schema migration on a heavily used database table that cascaded into pull requests, issues, Actions, webhooks and Git operations. Anyone who has watched an online DDL change lock a hot table at the wrong moment knows that story. The difference is the blast radius: at GitHub's scale, one bad migration is a global incident.
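The report as described here does not say what the migration changed or which database was involved, so the following is only a generic sketch of one common way to limit this kind of risk: add a nullable column first, then backfill it in small committed batches so that no single transaction holds the table for long. SQLite is a stand-in chosen so the example runs anywhere (it locks the whole database for writes, so it only illustrates the transaction sizing), and the table `pull_requests`, the column `state_v2` and the function names are hypothetical.

```python
import sqlite3
import time


def backfill_in_batches(conn, batch_size=1000, pause_s=0.0):
    """Copy `state` into the new `state_v2` column a batch at a time.

    Each batch is committed on its own, so locks are held only briefly.
    Returns the number of rows updated. Relies on `state` being NOT NULL,
    otherwise rows would never leave the `state_v2 IS NULL` set.
    """
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE pull_requests SET state_v2 = state "
            "WHERE id IN (SELECT id FROM pull_requests "
            "             WHERE state_v2 IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
        time.sleep(pause_s)  # optional breathing room for other writers


def demo(rows=2500, batch_size=1000):
    """Build a toy table, run the two-step change, report (updated, remaining)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pull_requests (id INTEGER PRIMARY KEY, state TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO pull_requests (state) VALUES (?)", [("open",)] * rows
    )
    conn.commit()

    # Step 1: add the column as nullable, with no default to compute per row.
    conn.execute("ALTER TABLE pull_requests ADD COLUMN state_v2 TEXT")
    conn.commit()

    # Step 2: fill it in gradually instead of in one table-wide statement.
    updated = backfill_in_batches(conn, batch_size)
    remaining = conn.execute(
        "SELECT COUNT(*) FROM pull_requests WHERE state_v2 IS NULL"
    ).fetchone()[0]
    conn.close()
    return updated, remaining


if __name__ == "__main__":
    print(demo())
```

The trade-off is time: a batched backfill takes longer end to end than a single statement, in exchange for never asking a hot table to sit still for the whole change.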
When reliability becomes a product threat
The strategic problem isn't downtime in the abstract — it's what the downtime interrupts. GitHub sells the workflow: review the PR, merge the code, run the Action, search the issue, ship the release. When those stall, developers don't perceive a capacity-planning problem somewhere in a data center. They perceive GitHub as the thing standing between them and their work.
The public face of that frustration in April was Mitchell Hashimoto, HashiCorp co-founder and creator of the Ghostty terminal emulator, who said he would move Ghostty off GitHub after 18 years because, as The Register reported, the platform was "no longer a place for serious work" when it locked him out for hours at a time. His complaint was operational, not ideological — and that's precisely why it stings. Hashimoto helped build Vagrant and Terraform; he is the high-signal maintainer whose habits everyone else copies. Competitors don't need to out-network GitHub. They only need to be credible enough to catch the workflows GitHub is fumbling.
And the competitive context has shifted under GitHub's feet. Business Insider notes rising pressure from AI coding tools such as Cursor and Anthropic's Claude Code, and reports that an internal Microsoft meeting late last year weighed overhauling GitHub to compete with them. That reframes every outage. GitHub is no longer just where code lives — it's meant to be Microsoft's control plane for AI-assisted software development. A platform incident in that role is simultaneously a Copilot problem, an Azure problem and a Microsoft developer-strategy problem.
The broader signal
Strip away the vendor awkwardness and the takeaway is blunt: even a company that operates a top-tier cloud can hit a capacity wall when AI-native workloads change the shape of demand overnight. Agentic development doesn't just add users — it changes the load profile, multiplying automated writes against systems designed for human cadence. Microsoft's apparent willingness to pay a rival rather than risk GitHub downtime is a clean statement of priorities: the operational cost of an outage now outweighs the optics of the logo on the invoice. That's a lesson worth filing away for anyone whose own roadmap assumes their infrastructure will scale linearly into the agent era. It might. The companies with the most headroom are discovering it doesn't always.
Emeka has spent over a decade tracking threat actors, vulnerability disclosures, and the evolving landscape of application security, bringing a sharp continent-spanning perspective to his reporting. He's known for translating dense CVE advisories into clear, actionable context that developers and security teams alike actually read.
Discussion 5
i guess you could say microsoft's azure migration was a bit of a code red, had to call in the aws cavalry to keep github from going dark, talk about a plot twist
i'm not surprised they're having scaling issues, all this ai hype is just a bunch of extra load on already complex systems, just use postgres and focus on writing good sql instead of trying to reinvent the wheel with fancy new tech
wonder how they're handling backfills
@data_eng_dee, backfills are just a fancy term for what we used to call 'catching up with the queue' - i've seen this movie before, and it usually involves a mix of batch processing and some clever caching to get things back on track
@greybeard_unix, that's a great point about backfills, but i'm more curious about the implications of this surge on github's consensus protocols - how are they handling the increased load on their distributed systems, and what tradeoffs are they making to ensure consistency in the face of such rapid growth?