Fly.io Outage History

Uptime record, past incidents, and downtime history for Fly.io.

50.5% uptime over 91 days
Uptime targets met: 99.9% ✗, 99.5% ✗, 99% ✗, 95% ✗

90-Day Trend

[Chart: Feb 25 to May 25]

Monthly Uptime

Month           Uptime   Days Tracked   Days with Issues
May 2026        44%      25             14
April 2026      56.7%    30             13
March 2026      51.6%    31             15
February 2026   40%      5              3

Uptime is calculated from daily worst-status snapshots. A day with any non-operational status counts as a day with issues.
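
The calculation described above can be sketched in a few lines. This is a minimal illustration, not the tracker's actual code: it assumes uptime is simply the share of tracked days with no issues, and the names `MONTHS`, `THRESHOLDS`, `uptime_percent`, and `summarize` are hypothetical. The day counts are copied from the Monthly Uptime table, and the thresholds are the four targets listed under the headline figure.

```python
# Monthly figures copied from the Monthly Uptime table:
# (days tracked, days with issues).
MONTHS = {
    "May 2026": (25, 14),
    "April 2026": (30, 13),
    "March 2026": (31, 15),
    "February 2026": (5, 3),
}

# Uptime targets listed under the headline uptime figure.
THRESHOLDS = (99.9, 99.5, 99.0, 95.0)


def uptime_percent(days_tracked: int, days_with_issues: int) -> float:
    """Percentage of tracked days whose worst status was operational."""
    if days_tracked <= 0:
        raise ValueError("days_tracked must be positive")
    return 100.0 * (days_tracked - days_with_issues) / days_tracked


def summarize(months):
    """Return per-month uptime, total days tracked, and overall uptime."""
    per_month = {
        name: round(uptime_percent(tracked, issues), 1)
        for name, (tracked, issues) in months.items()
    }
    total_tracked = sum(tracked for tracked, _ in months.values())
    total_issues = sum(issues for _, issues in months.values())
    overall = uptime_percent(total_tracked, total_issues)
    return per_month, total_tracked, overall


per_month, total_tracked, overall = summarize(MONTHS)
# A target counts as met only if overall uptime reaches it.
met = {target: overall >= target for target in THRESHOLDS}

print(per_month)
print(total_tracked, round(overall, 1), met)
```

Under that assumption the sketch reproduces the published numbers: the four monthly percentages, 91 tracked days, and 50.5% overall, with none of the four targets met. Because a single degraded snapshot marks the whole day as a day with issues, this metric counts affected days rather than minutes of downtime.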

Daily Status (Last 91 Days)

[Chart: Feb 24 to today. Legend: Operational, Degraded, Partial Outage, Major Outage, Maintenance, No Data]

Incident History

May 2026
Networking performance degraded in BOM and SJC
minor

Started: May 23, 1:00 AM

monitoring
Network performance has been restored and we're continuing to monitor.
May 23, 2:14 AM
investigating
We're currently looking into this issue.
May 23, 1:01 AM
IPv6 outage for some machines in SIN
minor

Started: May 22, 1:32 PM

monitoring
A fix has been implemented and we are monitoring the results.
May 22, 1:57 PM
investigating
We are continuing to investigate this issue.
May 22, 1:34 PM
investigating
We are currently investigating this issue.
May 22, 1:32 PM
Network issues in SIN region
major

Started: May 21, 6:30 PM

monitoring
Our upstream provider has implemented a fix and all apps and Managed Postgres clusters are now reachable. For the time being, ingress traffic is being re-routed to other regions, so users around the Singapore area may experience higher latency.
May 21, 6:45 PM
investigating
We are investigating network issues in the Singapore region. Apps may experience higher latency or be unreachable at this time. Some Managed Postgres clusters may be unreachable.
May 21, 6:30 PM
IPv6 outage for some machines in ORD

Started: May 21, 12:02 PM

investigating
We're working with our upstream providers to investigate an IPv6 networking failure in ORD. Impacted apps may wish to temporarily provision additional capacity in nearby regions.
May 21, 12:02 PM
Networking issues in SIN
minor

Started: May 20, 8:06 AM

identified
All MPGs in SIN are reachable again. We are seeing high latency to the affected provider. This may affect a subset of machines hosted in SIN.
May 20, 8:59 AM
identified
One of our upstreams is experiencing high packet loss and latency. We are actively working with them.
May 20, 8:23 AM
investigating
Machines may see high packet loss. Some MPGs are unable to connect to their config store and may be unreachable right now.
May 20, 8:06 AM
Networking issues with egress IP addresses in SYD
minor

Started: May 20, 5:05 AM

monitoring
A fix has been implemented and we are monitoring the results.
May 20, 7:11 AM
identified
We've identified the issue and are working on a fix.
May 20, 5:57 AM
investigating
We are currently investigating an issue affecting networking for new machines whose apps have assigned egress IP addresses in our SYD region.
May 20, 5:05 AM
Issues with the Fly.io dashboard
major

Started: May 19, 11:19 PM

monitoring
A fix has been implemented and we are monitoring the results.
May 20, 12:26 AM
identified
We are continuing to work on a fix for this issue.
May 19, 11:50 PM
identified
The issue has been identified and a fix is being implemented.
May 19, 11:34 PM
investigating
We're currently investigating an issue where the Fly.io dashboard is failing to load in some cases.
May 19, 11:19 PM
Logs issues in IAD
major

Started: May 19, 7:16 PM

investigating
We are investigating an issue with logs and metrics in the IAD region. New logs and metrics from machines in the IAD region may be missing, but past logs and metrics are still accessible. Apps continue to run.
May 19, 7:16 PM
Proxy issues in SIN region
major

Started: May 19, 11:15 AM

monitoring
A fix has been implemented and we are seeing proxy performance in SIN return to normal. All Managed Postgres clusters in the region are reachable. We are continuing to monitor to ensure stable recovery.
May 19, 11:22 AM
investigating
We are investigating issues with fly-proxy on a subset of hosts in the Singapore region. Apps are still running, but requests to/from some apps may fail, and some Managed Postgres clusters may be inaccessible.
May 19, 11:15 AM
Some Managed Postgres clusters in FRA are unreachable
major

Started: May 16, 12:45 PM

monitoring
All affected clusters have recovered.
May 16, 2:03 PM
identified
Some of the unreachable clusters are showing recovery. We are still fixing the root cause.
May 16, 1:40 PM
identified
The issue has been identified and a fix is being implemented.
May 16, 1:22 PM
investigating
We are continuing to investigate this issue.
May 16, 12:45 PM
investigating
We are currently investigating this issue.
May 16, 12:45 PM
fly ssh console returns error 500

Started: May 15, 4:29 PM

monitoring
We've deployed a fix and are monitoring as error rates normalize. `fly ssh console` should be working now.
May 15, 4:40 PM
identified
A problem with our vault used to issue temporary certificates for SSH sessions is causing calls to `fly ssh console` and `fly console` to return error 500. Our team has identified the cause and is deploying a fix.
May 15, 4:29 PM
Log search unavailable
major

Started: May 11, 8:02 PM

monitoring
A fix has been implemented and we are monitoring the results. Logs should again be available through Log search in Grafana.
May 11, 8:32 PM
investigating
Log search in Grafana is currently unavailable. You may see `failed to make http request: 503` errors when accessing logs from fly-metrics.net at this time. App logs are still available using the `fly logs` command and in the Fly.io dashboard.
May 11, 8:02 PM
Upstash Redis Outage
major

Started: May 11, 2:36 PM

monitoring
A fix has been implemented and we are seeing Upstash Redis connectivity return to normal across all regions. We are continuing to monitor to ensure stable recovery.
May 11, 6:26 PM
investigating
We are continuing to work with Upstash on this issue. We have received reports of partial recovery for some users; however, we are still seeing elevated levels of degraded or failing connections to Upstash Redis databases at this time.
May 11, 5:31 PM
investigating
We are continuing to work with Upstash on this issue.
May 11, 3:04 PM
investigating
We are working with Upstash to investigate issues with their Fly-hosted Redis service. Users may see degraded or failing connections to their Upstash Redis databases at this time.
May 11, 2:36 PM
Certificate Issuance failing due to LetsEncrypt Outage
critical

Started: May 8, 6:45 PM

monitoring
We are seeing recovery and certificates are now issuing normally. We are continuing to monitor to ensure full recovery.
May 8, 9:02 PM
identified
Due to a service outage at LetsEncrypt, creating new certificates with `fly certs add` is failing. Existing certificates and `*.fly.dev` preview certificates are not impacted. For additional details, please see LetsEncrypt's status page: https://letsencrypt.status.io/pages/incident/55957a99e800baa4470002da/69fe2d6698ca07050eb4b1b3
May 8, 8:25 PM
Connectivity issues in SJC
major

Started: May 7, 5:49 PM

monitoring
A fix has been implemented and we are monitoring the results.
May 7, 6:18 PM
investigating
Some hosts in SJC are currently experiencing an upstream network issue. Apps running on these hosts may be temporarily unavailable.
May 7, 5:49 PM
Intermittent machines issues in BOM
minor

Started: May 7, 10:16 AM

identified
Creation and updating of machines in BOM are affected. Some metrics and logs for resources in BOM may be delayed.
May 7, 10:16 AM
Elevated error rates on List Machines endpoint
minor

Started: May 6, 12:06 AM

investigating
We are currently investigating this issue.
May 6, 12:06 AM
Errors Setting and Updating Secrets on Apps
major

Started: May 5, 1:18 PM

investigating
Creation of new apps or changing secrets on existing apps fails
May 5, 1:22 PM
investigating
We are continuing to investigate this issue.
May 5, 1:19 PM
investigating
We are currently investigating this issue.
May 5, 1:18 PM
Log search unavailable
minor

Started: May 4, 6:57 PM

monitoring
We have a mitigation in place and are monitoring results.
May 4, 8:00 PM
investigating
Log search in Grafana is currently unavailable. You may see `failed to make http request: 502` errors when accessing logs from fly-metrics.net at this time. App logs continue to be available using the `fly logs` command and in the Fly.io dashboard.
May 4, 6:57 PM
April 2026
flyctl deploy creating new app instances
minor

Started: Apr 28, 11:50 PM

monitoring
A fix has been implemented and we are monitoring the results.
Apr 29, 12:31 AM
identified
The issue has been identified and a fix is being implemented.
Apr 29, 12:07 AM
investigating
We're investigating an issue where fly deploy is creating new Fly machine instances rather than updating existing ones, leaving apps in a mixed state. As a workaround, please try removing the "processes = [ "app" ]" line from your fly.toml configuration file and redeploying. Another workaround is to downgrade flyctl to 0.4.40, which should resolve the issue in the meantime.
Apr 28, 11:50 PM
Slow machines operations in IAD region
minor

Started: Apr 24, 10:45 PM

monitoring
Network packet loss has returned to normal levels. We are monitoring the Machines API for stability.
Apr 24, 11:19 PM
investigating
We are continuing to investigate this issue.
Apr 24, 11:18 PM
investigating
We are deploying a partial mitigation while we continue investigating.
Apr 24, 10:58 PM
investigating
We are currently investigating the issue. Only a portion of machines within the region are impacted.
Apr 24, 10:45 PM
Errors when adding or editing GitHub integrations for deployments
major

Started: Apr 23, 3:05 PM

monitoring
A fix has been implemented and we are monitoring the results.
Apr 23, 3:39 PM
identified
We are continuing to work on a fix for this issue.
Apr 23, 3:22 PM
identified
The issue has been identified and a fix is being implemented.
Apr 23, 3:22 PM
investigating
We're investigating reports of "500" errors when trying to add a new GitHub integration or edit an existing GitHub integration in Fly.io/dashboard. This only affects "Launch an app from GitHub" and changing settings for an app set up this way. Existing integrations continue to work normally. It does not affect deploys done with `flyctl` or existing, running apps.
Apr 23, 3:05 PM
Errors (5xx, timeouts) in Fly.io dashboard
major

Started: Apr 23, 11:17 AM

monitoring
A fix has been implemented and we are monitoring the results.
Apr 23, 11:45 AM
identified
The issue has been identified and a fix is being implemented.
Apr 23, 11:35 AM
investigating
We are investigating issues with the web dashboard.
Apr 23, 11:17 AM
Increased latency in SIN
minor

Started: Apr 20, 2:29 PM

identified
We are currently working on resolving increased latencies in our Singapore region.
Apr 20, 3:29 PM
TLS certificate issues
major

Started: Apr 17, 1:06 PM

monitoring
A fix has been implemented and we are monitoring the results.
Apr 17, 3:34 PM
investigating
We are investigating an issue with the Vault server that stores TLS certificates. Provisioning new TLS certificates may fail, and connecting to domains whose existing certificate has not yet been cached may fail.
Apr 17, 1:06 PM
Network issues in SYD

Started: Apr 15, 11:08 AM

monitoring
We've identified the issue and applied a fix. All services should be working as normal.
Apr 15, 11:40 AM
investigating
We're currently investigating some networking issues in SYD. This is affecting a number of our central services.
Apr 15, 11:08 AM
Heightened latency in ORD

Started: Apr 12, 6:50 PM

monitoring
A fix has been implemented and we are monitoring the results.
Apr 12, 7:26 PM
investigating
We are currently investigating heightened network latency in ORD.
Apr 12, 6:50 PM
Managed Postgres control plane instability in NRT (Tokyo)
minor

Started: Apr 10, 6:42 PM

monitoring
A fix has been implemented and we are seeing MPG performance in NRT normalize. We are continuing to monitor to ensure a stable recovery.
Apr 10, 8:32 PM
identified
The issue has been identified and a fix is being implemented. Users with clusters in NRT may continue to see instability at this time.
Apr 10, 8:13 PM
investigating
We are investigating instability in the MPG control plane in the NRT (Tokyo, Japan) region causing unexpected cluster failovers. Clusters return to health shortly after, but some users with clusters in NRT may see dropped connections or degraded performance at this time.
Apr 10, 6:42 PM
Unavailable hosts in ORD region
major

Started: Apr 9, 7:29 PM

investigating
Some hosts in our Chicago (ORD) region are currently inaccessible. We are working with our provider to resolve this issue. A small number of Managed Postgres clusters may also be inaccessible at this time. To see if you are affected, please visit the personalized status page: https://fly.io/status
Apr 9, 7:29 PM
Managed Postgres Control Plane Issues in SYD
major

Started: Apr 9, 3:50 AM

identified
We are seeing an improvement in control plane performance in the SYD region. Some clusters in the region currently are showing degraded standby nodes and we are working to bring those back to full health.
Apr 9, 4:12 AM
investigating
We are investigating elevated control plane issues for Managed Postgres clusters in SYD. The majority of clusters appear to be running fine, but new creates, backup restores, and upgrades may show errors or take longer than usual to complete. Some clusters will have seen a failover event from primary to standby.
Apr 9, 3:50 AM
Metrics currently experiencing issues
major

Started: Apr 8, 8:34 AM

monitoring
We are continuing to monitor for any further issues.
Apr 8, 11:02 AM
monitoring
We have implemented a fix. We're monitoring the cluster for further issues.
Apr 8, 11:00 AM
investigating
We are currently investigating an issue with our metrics cluster.
Apr 8, 8:34 AM
GraphQL API / Dashboard Issues
critical

Started: Apr 7, 3:08 PM

monitoring
A fix has been implemented and we are monitoring the results.
Apr 7, 3:39 PM
identified
We have restored GraphQL and dashboard availability, but some actions (e.g. app state updates) may still be delayed.
Apr 7, 3:17 PM
investigating
We are investigating issues with our GraphQL API and web dashboard.
Apr 7, 3:08 PM
March 2026
Low Capacity in SIN and AMS regions

Started: Mar 29, 3:00 PM

monitoring
We've freed up additional room in the SIN and AMS regions and are monitoring capacity.
Mar 29, 3:35 PM
monitoring
We've freed up additional room in the SIN and AMS regions and are monitoring capacity.
Mar 29, 3:33 PM
identified
We are currently investigating capacity issues in SIN and AMS regions that are affecting:
- Machine Create and Start events
- Deployments, due to affected, degraded Remote Builders
- Sprite startup from cold state
Mar 29, 3:19 PM
identified
This may also affect:
- Remote builders in AMS and SIN regions, which could currently be experiencing degraded performance or failures.
- Sprites starting from a cold state, which may experience failures in starting.
Mar 29, 3:13 PM
identified
We are currently investigating elevated errors when creating and starting machines in the SIN and AMS regions. Choosing other regions to create or deploy may help in the meantime.
Mar 29, 3:00 PM
Low capacity in IAD
minor

Started: Mar 27, 6:08 PM

monitoring
With the additional capacity we've brought online, machine start failure rates in IAD have now recovered. We'll continue to monitor IAD capacity.
Mar 27, 9:09 PM
identified
We've brought some additional capacity online in IAD and are seeing improvements, and we're continuing to work on adding more and freeing up additional room.
Mar 27, 7:21 PM
investigating
We're continuing to evaluate our options for increasing short-term capacity in the IAD region.
Mar 27, 6:47 PM
investigating
We're currently investigating capacity issues in IAD that are preventing machine starts (machine creates are currently unaffected). This may result in deploys failing to complete (even for apps outside of the IAD region). As a workaround, using legacy Fly builders explicitly located in another region (i.e., `FLY_REMOTE_BUILDER_REGION=lhr fly deploy --depot=false --recreate-builder`) may help in the meantime.
Mar 27, 6:08 PM
Machine Creates Failing in ORD Region
major

Started: Mar 26, 3:21 PM

monitoring
We've implemented a fix and have seen error rates for machine creates in ORD drop off. We're continuing to monitor the results.
Mar 26, 5:28 PM
identified
We've identified the cause of this increased failure rate and a fix is in progress. We are seeing most creates in ORD succeed at this time, though failure rate is still above baseline.
Mar 26, 4:50 PM
investigating
We are continuing to investigate this issue. We are seeing 408 errors decreasing in ORD, though still above baseline.
Mar 26, 4:08 PM
investigating
We are currently investigating elevated errors creating machines in the ORD (Chicago, Illinois) region. Users may see `failed to launch VM: request returned non-2xx status: 408` errors when creating, updating, or scaling machines in ORD. Existing, already running machines in the ORD region continue to run as normal.
Mar 26, 3:21 PM
Network issues in FRA region
critical

Started: Mar 26, 12:37 PM

identified
Some Managed Postgres clusters in the FRA region are still unreachable; we are investigating this issue.
Mar 26, 1:16 PM
monitoring
Apps and Managed Postgres clusters in FRA region should be back online at this time. We are monitoring for any further issues.
Mar 26, 1:14 PM
investigating
We are investigating network issues in FRA region. Apps and/or Managed Postgres clusters in the region may be inaccessible at this time.
Mar 26, 12:37 PM
Backend errors when trying to use Grafana to view logs

Started: Mar 23, 3:18 PM

monitoring
We've deployed a fix and are monitoring the results. Logs are now visible in Grafana.
Mar 23, 3:55 PM
identified
Using the Logs panel in Grafana at https://fly-metrics.net/ will show a 502 error from the backend and won't show any logs. You can use `fly logs` or the live log viewer directly on https://fly.io/dashboard to view streaming logs for the time being.
Mar 23, 3:41 PM
investigating
Using the Logs panel in Grafana at https://fly-metrics.net/ will show a 502 error from the backend and won't show any logs. You can use `fly logs` or the live log viewer directly on https://fly.io/dashboard to view streaming logs for the time being.
Mar 23, 3:18 PM
Machines failing to start in DFW
minor

Started: Mar 20, 7:26 AM

monitoring
Machine start success rates in DFW have improved but we are continuing to monitor and make further adjustments. We will provide updates as the situation progresses.
Mar 21, 8:26 AM
monitoring
In addition to freeing up existing capacity, the team has provisioned new capacity in DFW and we are monitoring the results.
Mar 20, 12:45 PM
monitoring
We freed up some capacity on our workers to allow for successful Machine starts.
Mar 20, 8:08 AM
investigating
The Machines start failure rate is elevated in DFW.
Mar 20, 7:26 AM
Metrics currently experiencing issues
critical

Started: Mar 19, 6:28 AM

monitoring
We have implemented a fix. Approximately 1h of metrics from 06:07 UTC were lost. We're monitoring the cluster for further issues.
Mar 19, 7:12 AM
investigating
We are currently investigating an issue with our metrics cluster.
Mar 19, 6:28 AM
IPv6 networking issues in SJC region
major

Started: Mar 18, 4:12 PM

monitoring
A fix has been implemented and we are monitoring the results.
Mar 18, 4:31 PM
investigating
We are investigating intermittent network issues in SJC region impacting outbound public IPv6 access from Machines. Connecting to IPv6 internet resources from apps hosted in SJC region may be slow or fail at this time. IPv4 access, as well as 6PN private networking, are unaffected.
Mar 18, 4:12 PM
Fly ssh console command failing
minor

Started: Mar 18, 2:12 PM

identified
We have identified an issue causing new `fly ssh console` connections to fail with 500 errors. A fix is in progress.
Mar 18, 2:12 PM
Connection Issues in SJC
minor

Started: Mar 18, 2:07 PM

monitoring
Between 13:55 and 14:03 UTC machines and MPG clusters hosted in the SJC region saw elevated connection errors. Users may have seen errors connecting to or from most machines in the region, as well as with deployments or updates to machines in the region. Networking has returned to normal in the region, and we are continuing to monitor closely to ensure stable recovery.
Mar 18, 2:07 PM
Machines failing to start in DFW
major

Started: Mar 18, 9:58 AM

monitoring
A fix has been implemented and we are monitoring the results.
Mar 18, 12:40 PM
identified
The team is currently rolling out additional capacity in DFW which should help ease Machine start failures across the region.
Mar 18, 11:44 AM
investigating
We are investigating reports of machines failing to start in the DFW (Dallas) region with "insufficient memory" errors. This may cause deployment failures for applications running in DFW. Our team is actively working to restore full capacity in the region. If you are affected, deploying to an alternate region may serve as a temporary workaround. We will provide updates as the situation progresses.
Mar 18, 9:58 AM
Elevated 502 errors when starting Sprites in LAX and ORD
minor

Started: Mar 16, 9:59 PM

investigating
We're currently investigating an elevated number of 502 errors when attempting to start Sprites in LAX and ORD.
Mar 16, 9:59 PM
Sprite Operations: 401 errors for certain organizations

Started: Mar 14, 1:33 PM

monitoring
A fix has been implemented and we are monitoring the results.
Mar 14, 1:45 PM
investigating
Organizations with numerical prefixes might experience failing sprite operations (such as creating a sprite, listing sprites, etc.) due to 401 errors.
Mar 14, 1:44 PM
monitoring
Root cause has been identified and a fix has been applied
Mar 14, 1:42 PM
investigating
Organizations with numerical prefixes might experience failing sprite operations (such as creating a sprite, listing sprites, etc.) due to 401 errors.
Mar 14, 1:41 PM
monitoring
Organizations with numerical prefixes might experience failing sprite operations (such as creating a sprite, listing sprites, etc.) due to 401 errors.
Mar 14, 1:33 PM
Sprites Operations: 401 errors for certain organizations

Started: Mar 14, 12:30 PM

monitoring
Organizations with names prefixed with numerical digits may experience 401 errors. Affected operations include Sprite creation, listing, and similar actions. A fix has been implemented and we are monitoring the results.
Mar 14, 1:55 PM
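
Each entry in the incident history has the same shape: a title, an optional severity, a "Started" timestamp, and a list of status updates with their own timestamps. The following is a rough sketch of how the time between an incident's start and its latest posted update could be computed from those fields. It rests on several assumptions: the page omits the year and time zone, so the year 2026 is taken from the month headings and timestamps are treated as naive local times, and the names `ASSUMED_YEAR`, `parse_stamp`, and `elapsed` are hypothetical. The sample values are copied from the last incident listed above.

```python
from datetime import datetime, timedelta

# Assumption: timestamps on the page carry no year; 2026 comes from
# the month headings ("May 2026", "April 2026", "March 2026").
ASSUMED_YEAR = 2026


def parse_stamp(text: str, year: int = ASSUMED_YEAR) -> datetime:
    """Parse a stamp such as 'Mar 14, 12:30 PM' into a naive datetime."""
    return datetime.strptime(f"{year} {text}", "%Y %b %d, %I:%M %p")


def elapsed(started: str, latest_update: str) -> timedelta:
    """Time between the 'Started' stamp and the latest posted update."""
    return parse_stamp(latest_update) - parse_stamp(started)


# Values copied from the final incident in the history.
incident = {
    "title": "Sprites Operations: 401 errors for certain organizations",
    "started": "Mar 14, 12:30 PM",
    "updates": [("monitoring", "Mar 14, 1:55 PM")],
}

status, stamp = incident["updates"][0]
gap = elapsed(incident["started"], stamp)
print(f"{incident['title']}: {gap} until latest update ({status})")
```

The result is the time to the latest update, not a confirmed time to resolution: most incidents in this history end in a "monitoring", "identified", or "investigating" state rather than an explicit "resolved" update.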