
Buildkite Outage History

Uptime record, past incidents, and downtime history for Buildkite.

68.1% uptime over 91 days
99.9% ✗ 99.5% ✗ 99% ✗ 95% ✗

90-Day Trend

[Chart: uptime trend, Feb 25 to May 25]

Monthly Uptime

Month | Uptime | Days Tracked | Days with Issues
May 2026 | 64% | 25 | 9
April 2026 | 83.3% | 30 | 5
March 2026 | 58.1% | 31 | 13
February 2026 | 60% | 5 | 2

Uptime is calculated from daily worst-status snapshots. A day with any non-operational status counts as a day with issues.
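The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual implementation: the `MonthRecord` class and the helper names are hypothetical, the day counts are copied from the Monthly Uptime table, and rounding to one decimal place is an assumption that happens to reproduce the published figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthRecord:
    """One row of the Monthly Uptime table (hypothetical structure)."""
    month: str
    days_tracked: int
    days_with_issues: int

    @property
    def uptime_pct(self) -> float:
        # A day counts as "up" only if its worst status was operational.
        clean_days = self.days_tracked - self.days_with_issues
        return 100.0 * clean_days / self.days_tracked


# Figures copied from the Monthly Uptime table.
MONTHS = [
    MonthRecord("May 2026", 25, 9),
    MonthRecord("April 2026", 30, 5),
    MonthRecord("March 2026", 31, 13),
    MonthRecord("February 2026", 5, 2),
]


def overall_uptime_pct(records):
    """Uptime across all tracked days, weighted by days per month."""
    tracked = sum(r.days_tracked for r in records)
    issues = sum(r.days_with_issues for r in records)
    return 100.0 * (tracked - issues) / tracked


def targets_met(uptime_pct, targets=(99.9, 99.5, 99.0, 95.0)):
    """Map each uptime target to whether the measured uptime reaches it."""
    return {target: uptime_pct >= target for target in targets}


if __name__ == "__main__":
    for record in MONTHS:
        print(f"{record.month}: {record.uptime_pct:.1f}%")
    overall = overall_uptime_pct(MONTHS)
    print(f"Overall: {overall:.1f}% over {sum(r.days_tracked for r in MONTHS)} days")
    print(targets_met(overall))
```

Because the metric is day-based, a brief minor incident lowers the figure as much as a day-long major outage, so it reads lower than a minute-weighted availability number would.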

Daily Status (Last 91 Days)

[Chart: daily status, Feb 24 to today. Legend: Operational, Degraded, Partial Outage, Major Outage, Maintenance, No Data]

Incident History

May 2026
Delayed notifications
major

Started: May 20, 4:40 PM

monitoring
We are seeing recovery across affected customers and continue to monitor.
May 20, 5:26 PM
identified
We have identified the issue, applied mitigations, and are monitoring recovery. We have determined that only a subset of customers is affected by the notification latency.
May 20, 5:06 PM
investigating
We are investigating delays to notifications across all customers.
May 20, 4:40 PM
Delayed Test Engine ingestion processing
minor

Started: May 15, 6:51 AM

monitoring
Ingestion of Test Engine execution data from an internal queue to a data store stalled. It has been resumed and is working through the backlog. Visibility of test executions from the past hour will be delayed for approximately a further hour. This has been a recurring issue; an architectural change is coming soon to eliminate this failure mode.
May 15, 6:51 AM
Error rates increasing

Started: May 13, 3:14 PM

investigating
We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
May 13, 3:14 PM
Delayed Test Engine ingestion processing
minor

Started: May 12, 12:59 PM

monitoring
We are currently experiencing delayed processing of Test Engine data. We have identified and applied a fix for the issue, but we expect delays to continue while we clear the ingestion backlog.
May 12, 12:59 PM
Delayed Test Engine ingestion processing
minor

Started: May 8, 9:10 PM

investigating
We are currently experiencing delayed processing of Test Engine data. We have identified and applied a fix for the issue, but we expect delays to continue while we clear the ingestion backlog. At the current processing rate we expect the backlog to be cleared by approximately Sat 09 May 2026 00:00 UTC.
May 8, 9:10 PM
Delayed Test Engine ingestion processing
minor

Started: May 8, 12:20 PM

monitoring
We are currently experiencing delayed processing of Test Engine data. We have identified and applied a fix for the issue, but we expect processing delays to continue while we clear the ingestion backlog. At the current processing rate we expect the backlog to be cleared by approximately Fri 08 May 2026 13:30 UTC.
May 8, 12:20 PM
AWS us-east-1 single availability zone outage
minor

Started: May 8, 1:12 AM

monitoring
Despite the ongoing AWS incident, our own services are now stable. We are continuing to monitor our services closely, and are ready for further action should the need arise. We are also watching AWS services closely as they recover.
May 8, 8:07 AM
investigating
We are continuing to move infrastructure resources out of the affected AWS Availability Zone. Brief latency and error blips may continue while these manual failovers occur. (Apologies if you receive duplicated notifications for this update.)
May 8, 7:22 AM
investigating
We are continuing to move infrastructure resources out of the affected AWS Availability Zone. Brief latency and error blips may continue while these manual failovers occur.
May 8, 7:04 AM
investigating
We are continuing to move infrastructure resources out of the affected AWS Availability Zone. Brief latency and error blips will unfortunately continue while these manual failovers occur.
May 8, 5:45 AM
investigating
We are actively moving resources out of us-east-1c. Similar brief latency and error blips will be visible to customers while these manual failovers occur.
May 8, 5:10 AM
investigating
We have provisioned additional capacity in unaffected availability zones so that they are able to support the additional load. Automatic failovers continue to occur where necessary. Some latency and transient errors will be visible to customers.
May 8, 4:08 AM
investigating
We are continuing to actively monitor the impacts of this availability zone outage for Buildkite customers. Some transient errors are visible due to availability zone failover events.
May 8, 2:35 AM
investigating
A small subset of our customers are experiencing delayed notifications. We are actively provisioning additional capacity for these customers. Availability zone automatic failovers are occurring in response to the outage, and this is causing some brief error blips for some customers.
May 8, 1:50 AM
investigating
We're aware that AWS is reporting availability zone failures in us-east-1. We are monitoring the situation but so far there is no customer impact.
May 8, 1:12 AM
Delays in job dispatch, webhook processing, and outbound webhooks
major

Started: May 7, 10:45 PM

monitoring
We are now monitoring the incident. Most customers have recovered, and the remainder are showing signs of recovery.
May 7, 11:50 PM
identified
We've identified the issue and are working on applying mitigations. At this time we can confirm that inbound webhooks, outbound webhooks, and notifications are delayed.
May 7, 11:13 PM
investigating
We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
May 7, 10:45 PM
Jobs not starting on hosted agents and agent-stack-k8s
major

Started: May 7, 8:48 AM

identified
We're currently seeing recovery at a 50% rate. We'll provide the next update soon.
May 7, 9:24 AM
investigating
We've identified an issue with the job acquiring endpoint. We're rolling back now. We'll provide the next update in ~20 minutes.
May 7, 9:09 AM
investigating
We've spotted that something has gone wrong. We're currently investigating the issue with new builds not starting.
May 7, 8:48 AM
Test Engine: Delayed processing of test result ingestion
minor

Started: May 6, 3:57 AM

monitoring
We've identified the issue, and the system is currently processing the backlog of test executions.
May 6, 4:21 AM
investigating
A process writing test results to our Test Engine data store stalled. We've restarted the process and are seeing it catch up. We expect to be fully caught up on the backlog within the next couple of hours.
May 6, 3:57 AM
Delayed notifications
minor

Started: May 4, 5:02 PM

monitoring
Applied remediations have resolved the previous notification delays affecting a subset of our customers. We're continuing to monitor the affected services for stability.
May 4, 6:54 PM
identified
We've identified the source of the notification delays affecting a subset of our customers. Our engineers are applying remediations to reduce these delays.
May 4, 6:06 PM
investigating
We are investigating delays with build and job notifications for a subset of customers.
May 4, 5:02 PM
Increased latency and error rates

Started: May 4, 6:02 AM

investigating
We're observing increased latency and error rates in the Agent API for a subset of our customers. We're currently investigating and will provide status updates as they become available.
May 4, 6:02 AM
April 2026
Increased latency and error rates
minor

Started: Apr 29, 5:43 PM

monitoring
We have identified and fixed the issue with the underlying database for a subset of customers. We are now monitoring the issue.
Apr 29, 6:18 PM
investigating
We're observing increased latency and error rates for a subset of our customers. We're currently investigating and will provide status updates as they become available.
Apr 29, 5:43 PM
Increased dispatch latency and error rates
minor

Started: Apr 28, 6:00 PM

monitoring
We have mitigated the issue causing increased Hosted Agents dispatch latency and intermittent timeout errors for a subset of customers. We identified abnormal workload activity that was placing elevated load on a supporting service, and have now blocked that activity and applied additional protections. Service metrics have returned to normal, and we are continuing to monitor closely.
Apr 28, 6:45 PM
identified
The issue has been identified and a fix is being implemented.
Apr 28, 6:26 PM
investigating
We're observing increased error rates and dispatch latency for a subset of our customers. We're currently investigating and will provide status updates as they become available.
Apr 28, 6:00 PM
Auth failures with remote MCP server
minor

Started: Apr 22, 9:19 PM

monitoring
We have rolled back a change on the remote MCP server that was contributing to authentication failures.
Apr 22, 10:44 PM
investigating
We are continuing to investigate errors when authenticating to the remote MCP server.
Apr 22, 10:07 PM
investigating
We are currently investigating reports of authentication failures with the remote MCP server.
Apr 22, 9:19 PM
Delayed processing of test execution
minor

Started: Apr 22, 2:32 AM

monitoring
We noticed a lag in data processing, but our systems are operational and currently working through the backlog. We expect to be fully caught up within the next couple of hours.
Apr 22, 2:32 AM
Degraded performance and increased error rates
major

Started: Apr 8, 10:26 PM

monitoring
We have identified and fixed the issue. We are monitoring and seeing signs of improvement.
Apr 8, 10:40 PM
investigating
We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
Apr 8, 10:26 PM
March 2026
Hosted Agents jobs immediately cancelled
minor

Started: Mar 31, 7:51 AM

identified
We have identified the issue and are rolling out a fix.
Mar 31, 8:15 AM
investigating
We have received reports from customers that they are unable to start builds on Hosted Agents. Their builds are immediately cancelled. We are investigating.
Mar 31, 7:51 AM
504 errors viewing builds
minor

Started: Mar 27, 7:02 AM

monitoring
The deploy to revert this change is complete and builds are loading normally. We will continue to monitor for any other issues.
Mar 27, 8:08 AM
identified
We've identified a change which we think is the cause of this issue, and we're in the process of reverting it.
Mar 27, 7:18 AM
investigating
We're seeing an increase in 504 errors when viewing pipeline builds. We're investigating this now.
Mar 27, 7:02 AM
Increased Delays with Hosted Agents
minor

Started: Mar 25, 2:26 PM

monitoring
The networking issue has been resolved, dispatch of Hosted Agents has returned to normal levels, and there are no further issues with Git cloning. We are monitoring the situation.
Mar 25, 2:59 PM
identified
The issue has been identified as networking-related and is affecting Git Mirror cloning.
Mar 25, 2:30 PM
investigating
We are currently investigating this issue.
Mar 25, 2:26 PM