Amazon’s cloud unit is stepping deeper into AI-assisted operations with a new tool aimed at helping companies recover from outages faster. The software, called DevOps Agent, taps into signals from services like Datadog and Dynatrace to work out what went wrong and why. AWS opened the preview on Tuesday, with pricing to be announced later.
In a blog post announcing the launch, Amazon said: “Today, we’re announcing the public preview of AWS DevOps Agent, a frontier agent that helps you respond to incidents, identify root causes, and prevent future issues through systematic analysis of past incidents and operational patterns.”
Amazon Launches AI DevOps Agent to Compete With Microsoft and Google in SRE Automation
Swami Sivasubramanian, vice president of agentic AI at AWS, told CNBC that the goal is simple: give engineering teams a head start during live incidents. Site reliability engineers spend long shifts keeping services up and responding quickly when something breaks. Startups such as Resolve and Traversal have been going after the same problem with their own AI agents, and Microsoft introduced its SRE Agent on Azure earlier this year.
AWS’ version pushes the idea further by coordinating multiple agents at once. Instead of waiting for a human on-call engineer to untangle logs and dashboards, DevOps Agent runs a set of hypotheses in parallel and assigns work to different agents that investigate each one. “By the time the on-call ops team member dials in, they have an incident report with preliminary investigation of what could be the likely outcome, and then suggest what could be the remediation as well,” Sivasubramanian said.
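The fan-out pattern described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not AWS's implementation: the `Finding` class, the three `check_*` functions, the `SIGNALS` snapshot, and the scores are all hypothetical, and `asyncio` runs the checks concurrently rather than on separate agents.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Finding:
    hypothesis: str
    score: float   # illustrative 0-to-1 confidence, not a real AWS metric
    evidence: str


# Hypothetical incident snapshot; a real system would query monitoring tools.
SIGNALS = {"recent_deploy": True, "error_rate": 0.12,
           "db_latency_ms": 40, "cpu_util": 0.35}


async def check_bad_deploy(signals: dict) -> Finding:
    await asyncio.sleep(0)  # stand-in for a network call to a deploy log
    hit = signals["recent_deploy"] and signals["error_rate"] > 0.05
    return Finding("bad deploy", 0.9 if hit else 0.1,
                   f"error_rate={signals['error_rate']}")


async def check_db_slowdown(signals: dict) -> Finding:
    await asyncio.sleep(0)  # stand-in for a metrics query
    hit = signals["db_latency_ms"] > 200
    return Finding("database slowdown", 0.8 if hit else 0.1,
                   f"db_latency_ms={signals['db_latency_ms']}")


async def check_cpu_saturation(signals: dict) -> Finding:
    await asyncio.sleep(0)  # stand-in for a metrics query
    hit = signals["cpu_util"] > 0.9
    return Finding("CPU saturation", 0.7 if hit else 0.1,
                   f"cpu_util={signals['cpu_util']}")


async def investigate(signals: dict) -> list[Finding]:
    # Start every hypothesis check at once, then rank the results by score.
    findings = await asyncio.gather(
        check_bad_deploy(signals),
        check_db_slowdown(signals),
        check_cpu_saturation(signals),
    )
    return sorted(findings, key=lambda f: f.score, reverse=True)


report = asyncio.run(investigate(SIGNALS))
for finding in report:
    print(f"{finding.score:.1f}  {finding.hypothesis}: {finding.evidence}")
```

With the sample snapshot, the "bad deploy" hypothesis ranks first, which is the kind of preliminary report an on-call engineer would see on joining the incident.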
Commonwealth Bank of Australia has already tested the system. AWS said the software pinpointed the root cause of an issue in under 15 minutes, a task that would have taken a seasoned engineer hours.
The tool draws on Amazon’s in-house models as well as models from external providers, giving AWS another entry point in its push to sell more software on top of its core infrastructure business. Amazon sparked the cloud era in the mid-2000s by renting servers and storage, and its rivals followed suit.
“AWS DevOps Agent is your always-on, autonomous on-call engineer. When issues arise, it automatically correlates data across your operational toolchain, from metrics and logs to recent code deployments in GitHub or GitLab. It identifies probable root causes and recommends targeted mitigations, helping reduce mean time to resolution. The agent also manages incident coordination, using Slack channels for stakeholder updates and maintaining detailed investigation timelines,” Amazon stated.
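One simple form of the correlation Amazon describes is matching the start of an anomaly against recent deployments. The sketch below is only an assumption about how such a check could look: the service names, timestamps, the 30-minute window, and the `suspect_deployments` helper are hypothetical and do not come from AWS.

```python
from datetime import datetime, timedelta

# Hypothetical deployment log: (service name, time the deploy finished).
deployments = [
    ("checkout-service", datetime(2025, 1, 1, 9, 0)),
    ("search-service", datetime(2025, 1, 1, 11, 50)),
    ("auth-service", datetime(2025, 1, 1, 12, 30)),
]

# Hypothetical moment at which error rates began to climb.
anomaly_start = datetime(2025, 1, 1, 12, 0)


def suspect_deployments(deployments, anomaly_start, window=timedelta(minutes=30)):
    """Return deployments that finished within `window` before the anomaly,
    most recent first."""
    hits = [
        (name, finished)
        for name, finished in deployments
        if timedelta(0) <= anomaly_start - finished <= window
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)


suspects = suspect_deployments(deployments, anomaly_start)
for name, finished in suspects:
    print(f"{name} deployed at {finished:%H:%M}")
```

Only the deploy that landed shortly before the anomaly is flagged. A deploy from hours earlier and one that finished after the anomaly began are both excluded.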
Since the launch of ChatGPT in 2022, vendors have focused on showing how AI can take on more of the repetitive work developers do every day. Over the summer, AWS rolled out Kiro, a “vibe coding” tool that writes and modifies source code from text prompts. Google introduced Antigravity for individual developers in November, and Microsoft continues to grow GitHub Copilot through subscription plans.
With DevOps Agent, AWS is bringing that AI push into one of the highest-pressure parts of software operations: the minutes after a service breaks. The company is betting that speed, clarity, and fewer hours lost in incident triage will be enough to draw teams into its new agent-driven workflow.



