DigitalOcean Outage History

Uptime record, past incidents, and downtime history for DigitalOcean.


90-Day Trend

Monthly Uptime

Month	Uptime	Days Tracked	Days with Issues
July 2026	33.3%	9	6
June 2026	73.3%	30	8
May 2026	64.5%	31	11
April 2026	52.4%	21	10

Uptime is calculated from daily worst-status snapshots. A day with any non-operational status counts as a day with issues.
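The monthly figures above are consistent with uptime = (days tracked - days with issues) / days tracked. A minimal sketch of that calculation follows. The function name, the input shape (ISO date mapped to that day's worst status), and the choice to treat every status other than "Operational" as an issue are assumptions made for illustration; days with no data are assumed to be absent from the input.

```python
from collections import defaultdict

# Assumed label for a healthy day, taken from the chart legend on this page.
OPERATIONAL = "Operational"


def monthly_uptime(daily_worst_status):
    """Return {"YYYY-MM": (uptime_pct, days_tracked, days_with_issues)}.

    daily_worst_status maps an ISO date string (YYYY-MM-DD) to the worst
    status recorded on that day (hypothetical input format).
    """
    tracked = defaultdict(int)
    issues = defaultdict(int)
    for day, status in daily_worst_status.items():
        month = day[:7]  # "YYYY-MM"
        tracked[month] += 1
        if status != OPERATIONAL:
            issues[month] += 1
    return {
        month: (
            round(100 * (tracked[month] - issues[month]) / tracked[month], 1),
            tracked[month],
            issues[month],
        )
        for month in tracked
    }


# Made-up sample: 9 tracked days, 6 of them with a non-operational status.
sample = {f"2026-07-{d:02d}": "Operational" for d in range(1, 4)}
sample.update({f"2026-07-{d:02d}": "Degraded" for d in range(4, 10)})
result = monthly_uptime(sample)
print(result)
```

With 9 tracked days and 6 days with issues, this yields 33.3%, matching the July 2026 row.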

Daily Status (Last 91 Days)

[Chart: daily status from Apr 10 to today. Legend: Operational, Degraded, Partial Outage, Major Outage, Maintenance, No Data]

Incident History

July 2026

Limited access to Deepseek V4 Pro model

major

Started: Jul 4, 10:13 PM

monitoring

Our Engineering team has implemented a fix for the issue with the Deepseek V4 Pro model in Serverless Inference and Agent Platform. The model should now be operational, and users should no longer receive error 429 messages when attempting to use it. We are currently monitoring the situation to ensure the fix is successful and the model is functioning as expected. We will post an update if any further issues arise. If you continue to experience problems, please open a ticket with our Support t...

Jul 4, 11:18 PM

investigating

Our Engineering team is investigating reports of an incident affecting the Deepseek V4 Pro model in Serverless Inference and Agent Platform. Users may experience errors when attempting to use this model, specifically receiving error 429 messages. We apologize for the inconvenience and are working to resolve the issue as soon as possible. We will provide an update once we have more information.

Jul 4, 10:13 PM

Droplet Actions in NYC3

Started: Jul 4, 2:42 PM

monitoring

Our Engineering team has implemented a fix to address the issue affecting Droplets in the NYC3 region. The issue, which began at 12:38 UTC, caused disruptions to customers attempting to perform actions on their Droplets in the NYC3 region. During this time, customers may also have experienced errors when trying to create Droplets in this region. We are actively monitoring the situation to ensure the fix remains effective and will provide another update once the issue has been fully resolved.

Jul 4, 2:42 PM

Managed Database Clusters

minor

Started: Jul 2, 10:04 PM

monitoring

Our engineering team has implemented a fix to resolve the issue with Managed Database Clusters and is monitoring the situation. We will post an update as soon as the issue is fully resolved.

Jul 3, 6:08 AM

investigating

Our engineering team is still investigating the issue affecting Managed Database Clusters. Users may still encounter delays when creating, scaling, forking, or restoring Managed Database Clusters. We are working to resolve this as soon as possible. We apologize for the inconvenience. We will post an update here once we have more information.

Jul 3, 2:28 AM

investigating

Our engineering team is investigating an issue affecting Managed Database Clusters. Currently, users may encounter delays when creating/scaling/forking and restoring Standard MySQL, Standard PostgreSQL, OpenSearch, Kafka, and Valkey clusters through the Cloud Control Panel or API. We apologize for the inconvenience and will provide further updates as soon as more information is available.

Jul 2, 10:04 PM

DNS Lookup Failures for Managed Database

minor

Started: Jul 1, 11:35 AM

investigating

We are continuing to investigate this issue.

Jul 1, 12:15 PM

investigating

As of 06:39 UTC, our Engineering team is investigating reports of intermittent DNS lookup failures for Managed Database hostnames, primarily affecting connections from Managed Kubernetes. At this point, customers in the FRA1 region may experience intermittent connectivity issues. We apologize for the inconvenience and will share an update once we have more information.

Jul 1, 11:35 AM

Droplet Resize

Started: Jul 1, 11:13 AM

monitoring

Our Engineering team has identified the root cause of the issue affecting Droplet resizes using the "Downscale Anytime" option and has implemented a fix. Users should now be able to resize their Droplets using the "Downscale Anytime" option without experiencing any issues or errors. We are actively monitoring the situation to ensure the fix remains effective and will provide another update once the issue has been fully resolved.

Jul 1, 12:43 PM

investigating

Our Engineering team is investigating an issue affecting Droplet resizes using the "Downscale Anytime" option. At this time users may find this option unavailable or unresponsive in the Cloud Control Panel. As a workaround, resizes can be performed via the API while we work to resolve this issue. We apologize for the inconvenience and will share an update once we have more information.

Jul 1, 11:13 AM
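The API workaround mentioned in the update above can be sketched as follows. This is an illustration and not DigitalOcean's documented procedure for this incident: it builds a resize request against the public API v2 Droplet actions endpoint without sending it. The Droplet ID, size slug, and token are placeholders, and the assumption that "Downscale Anytime" corresponds to a resize that leaves the disk unchanged ("disk": false) is ours.

```python
import json
import urllib.request

API_BASE = "https://api.digitalocean.com/v2"


def build_resize_request(droplet_id, size_slug, token, resize_disk=False):
    """Build (but do not send) a Droplet resize action request.

    resize_disk=False is assumed here to match a CPU/RAM-only resize.
    """
    body = json.dumps(
        {"type": "resize", "size": size_slug, "disk": resize_disk}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/droplets/{droplet_id}/actions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
req = build_resize_request(123456, "s-2vcpu-4gb", "YOUR_API_TOKEN")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it possible to inspect the payload before running it against a real account.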

Agent Timeouts While Retrieving Data from Knowledge Bases

minor

Started: Jul 1, 6:25 AM

identified

Our Engineering team has identified the issue causing Agents to experience timeouts while retrieving data from Knowledge Bases. Please be assured that our Engineering team is actively working on a fix and is treating this issue with high priority. We sincerely apologise for any inconvenience this may have caused and appreciate your patience and understanding. If you have any further questions, please create a support ticket so that we can investigate your specific case further.

Jul 1, 8:39 AM

investigating

Our Engineering team is currently investigating an issue where Agents are experiencing timeouts while retrieving data from Knowledge Bases. As a result, affected Agents may fail to retrieve data and return the following error: "Failed to retrieve data from Knowledge base(s) - timeout" Please be assured that we are treating this as a high-priority issue and are actively working to mitigate it. We sincerely apologise for any inconvenience this may have caused and appreciate your patience and...

Jul 1, 6:25 AM

June 2026

Anthropic Inference Model Availability

Started: Jun 27, 4:43 AM

monitoring

Our Engineering team has mitigated an issue with Anthropic models. Previously, users may have encountered 400 errors when attempting to use any Anthropic model. Although the root cause of the issue is still being addressed by Anthropic, users should now be able to access and use Anthropic models again. We will continue to monitor the situation and provide updates if necessary. If you continue to experience problems, please open a ticket with our Support team. We apologize for any inconvenien...

Jun 27, 4:43 AM

Monitoring Graphs in the Cloud Control Panel

minor

Started: Jun 19, 2:00 PM

monitoring

Our engineering team has implemented the necessary fixes to address the issue affecting the visibility of monitoring graphs within the Cloud Panel. Users should now be able to view monitoring graphs for their services, including Droplets with DO Agent installed, Load Balancers, Databases, etc. We are currently monitoring the situation to ensure that the service has returned to normal operation and remains stable. We appreciate your patience and will provide an update once the issue is fully c...

Jun 19, 2:50 PM

investigating

Our Engineering team is currently investigating an issue affecting the visibility of monitoring graphs within the Cloud Panel. During this period, users may notice missing or unavailable monitoring graphs for services such as Droplets (with DO Agent installed), Load Balancers, Databases, etc. We apologize for the inconvenience. We will provide an update once we have more information.

Jun 19, 2:00 PM

Intermittent 500 Errors on Serverless/GenAI Inference API

minor

Started: Jun 16, 10:00 PM

monitoring

Our engineering teams have successfully begun implementing mitigation steps to resolve the connectivity issues affecting the inference API. We will provide another update once the API has fully recovered and error rates return to normal.

Jun 16, 11:29 PM

investigating

We are actively investigating an issue causing elevated HTTP 500 error rates for customers utilizing our Serverless/GenAI Inference API. Customer Impact: Customers making calls to the inference API—specifically targeting /v1/* endpoints—will experience intermittent HTTP 500 errors and failed requests.

Jun 16, 10:52 PM
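For intermittent failures like the HTTP 500 and 429 responses described in several incidents on this page, a common client-side mitigation is to retry with exponential backoff and jitter. The sketch below is a generic pattern and not guidance from DigitalOcean. The `send` callable, the set of retryable status codes, and the delay parameters are all assumptions chosen for illustration.

```python
import random
import time

# Status codes treated as transient in this sketch (an assumption).
RETRYABLE = {429, 500, 502, 503, 504}


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last failure
        # Exponential backoff capped at max_delay, with full jitter.
        delay = min(max_delay, base_delay * 2 ** attempt)
        sleep(random.uniform(0, delay))
    return status, body


# Simulated responses: two transient failures, then success.
responses = iter([(500, "error"), (429, "rate limit"), (200, "ok")])
waits = []  # record the delays instead of actually sleeping
final = call_with_backoff(lambda: next(responses), sleep=waits.append)
print(final, len(waits))
```

Retries do not help while a provider-side outage persists, so the attempt cap matters: it keeps a client from adding load during an incident.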

DeepSeek V4 Pro is returning HTTP 429 "Rate limit exceeded"

major

Started: Jun 11, 2:24 PM

identified

The issue has been identified and a fix is being implemented.

Jun 11, 2:31 PM

investigating

We are continuing to investigate this issue.

Jun 11, 2:30 PM

investigating

We are currently facing an issue with Gradient AI: DeepSeek V4 Pro is returning HTTP 429 "Rate limit exceeded".

Jun 11, 2:24 PM

DNS API Service

minor

Started: Jun 4, 10:12 AM

monitoring

Our Engineering team has implemented a fix to resolve the issue impacting our DNS API service. Users should now be able to perform domain and DNS record management operations successfully through the Control Panel and API. Additionally, services affected by this issue, including Let's Encrypt certificate provisioning, MongoDB cluster creation, App Platform deployments, and DigitalOcean Kubernetes (DOKS) cluster create and delete operations, should now be functioning as expected. We are monit...

Jun 4, 11:39 AM

identified

Our Engineering team has identified the cause of the issue impacting our DNS API service and is actively working on a fix. During this time, users may experience errors when attempting to create, update, or delete domains and DNS records through the Control Panel and API. As a result, services that depend on DNS API operations, including Let's Encrypt certificate provisioning, MongoDB cluster creation, App Platform deployments, and DigitalOcean Kubernetes (DOKS) cluster create and delete oper...

Jun 4, 11:32 AM

investigating

Our Engineering team continues to investigate an issue impacting our DNS API service. During this time, users may experience issues performing domain and DNS record management operations from the Control Panel and API, including creating, updating, or deleting DNS records. As a result, services that depend on DNS API operations, including Let's Encrypt certificate provisioning, MongoDB cluster creation, App Platform deployments, and DigitalOcean Kubernetes (DOKS) cluster create and delete ope...

Jun 4, 10:59 AM

investigating

Our Engineering team is investigating an issue impacting our DNS API service. During this time, users may experience issues performing domain and DNS record management operations from the Control Panel and API, including creating, updating, or deleting DNS records. As a result, services that rely on DNS API operations, such as Let's Encrypt certificate provisioning and MongoDB cluster creation, may also be impacted. We apologize for the inconvenience and will share an update once we have mor...

Jun 4, 10:12 AM

May 2026

App Platform Deployments

minor

Started: May 29, 12:14 AM

monitoring

Our Engineering team has implemented a fix to resolve the issue with build failures on App Platform. Users should see their builds deploy successfully now. We are closely monitoring the situation, and will post an update once we've confirmed this is fully resolved.

May 29, 1:58 AM

investigating

Our Engineering team is currently investigating an issue with build failures on App Platform in multiple regions. Users may experience errors when attempting to build their applications, resulting in failed deployments. Our Engineering team is working to fix the issue and will share an update once we have more details. We apologize for the inconvenience this issue may be causing and appreciate your patience as we work to resolve it.

May 29, 12:14 AM

Cloud Firewall

minor

Started: May 20, 12:08 PM

monitoring

Our engineering team has implemented a fix for the issue that affected Cloud Firewall rules. Users should now be able to update their firewalls successfully. We are actively monitoring the situation to ensure overall stability. We appreciate your patience and will provide an update once the issue is fully confirmed as resolved.

May 20, 12:43 PM

identified

Our engineering team has identified an issue impacting the Cloud Firewall product. During this time, users may notice HTTP 500 errors when updating firewalls. We apologize for the inconvenience and will share an update once more information is available.

May 20, 12:08 PM

Cloud Firewall

minor

Started: May 20, 5:35 AM

monitoring

Our Engineering team has implemented a fix to resolve the issue causing incorrect Cloud Firewall rules to display on the cloud panel. At this time, the user interface is functioning as expected, and we are actively monitoring the situation to ensure continued stability. We will provide a final update once we have verified that the issue is fully resolved.

May 20, 7:15 AM

investigating

We are currently investigating an issue with incorrect Cloud Firewall rules appearing on the cloud panel. Our engineering team is aware of the situation and actively working to identify the root cause. Since this is a user interface issue, users should not experience any problems with other services or networking. We apologize for the inconvenience and appreciate your patience. We will continue to provide updates as we learn more.

May 20, 5:35 AM

Block Storage Volume Performance

minor

Started: May 20, 1:05 AM

monitoring

Our Engineering team has implemented mitigation measures for the infrastructure issue affecting Block Storage Volumes in the NYC3 region. The team is monitoring the situation, and we will share another update once the issue is fully resolved.

May 20, 2:19 AM

investigating

Our Engineering team is currently investigating an ongoing infrastructure issue in the NYC3 region affecting Block Storage Volumes at the storage layer. Users may experience degraded write performance and intermittent impact to services dependent on the affected storage infrastructure. We apologize for the inconvenience and will continue to provide updates as more information becomes available.

May 20, 1:05 AM

Anthropic reported outage that's impacting access to their Serverless Inference models

major

Started: May 15, 1:21 AM

identified

According to Anthropic's status page (https://status.claude.com/incidents/8z7l5zcy0v3b), they have identified the root cause and are actively working on a fix. At this time, users may continue to experience errors when attempting to use Sonnet 4.6 and Opus 4.7 models. We will provide another update once Anthropic has implemented a fix and service is restored.

May 15, 1:23 AM

monitoring

As of the current time, our Engineering team is aware of an ongoing incident with our provider, Anthropic, that is impacting Serverless Inference. The outage is affecting all users attempting to use Sonnet 4.6 and Opus 4.7 models. According to Anthropic's status page (https://status.claude.com/incidents/8z7l5zcy0v3b), they are currently experiencing an outage that is causing this disruption. We apologize for the inconvenience and will provide updates as more information becomes available.

May 15, 1:21 AM

DNS service, Certificates and Managed MongoDB

minor

Started: May 14, 1:57 PM

monitoring

Our Engineering team has implemented a fix for the issue affecting our DNS service and is now seeing DNS record updates successfully propagate to the edge. As a result, Let's Encrypt certificate issuance and Managed MongoDB provisioning (including scaling operations) should resume. We are currently monitoring the systems as they process the backlog of delayed requests. Affected MongoDB clusters and certificate requests should complete their provisioning automatically. We will continue to m...

May 14, 4:42 PM

investigating

Our engineering team continues to investigate an issue affecting our DNS service. At this time, DNS resolution remains functional; however, new changes to DNS records are currently not being reflected at the edge. This issue is also impacting related services. Specifically, customers may be unable to create new Let's Encrypt certificates. Regarding Managed MongoDB, customers will be able to submit requests to create new clusters or scale existing ones, but the completion of the provisioning ...

May 14, 3:52 PM

investigating

Our Engineering team is investigating an issue affecting our DNS service. At this time, DNS resolution remains functional but any changes to DNS records are not being reflected at the edge. Additionally, customers may be unable to create new Let's Encrypt certificates at this time. Our engineering team is actively working to identify the root cause and restore full functionality. We apologize for any inconvenience, and we'll share more information as it becomes available.

May 14, 1:57 PM

GradientAI: Agent Platform Playground Interaction Errors

major

Started: May 12, 10:24 AM

identified

A fix has been deployed for the issue affecting the GradientAI Agent Platform Playground, and the service is now operating normally. We are monitoring to ensure stability.

May 12, 11:14 AM

identified

We have identified the cause of the issue affecting the GradientAI Agent Platform Playground. Our engineering team is implementing a fix, and we will provide another update as soon as it is available.

May 12, 10:50 AM

investigating

We are currently investigating an issue affecting the GradientAI Agent Platform Playground. Users may see a “Something went wrong” error for all agent interactions in the Playground. Agent functionality through API endpoints remains unaffected. We are actively working to identify the cause and will provide an update as soon as more information is available.

May 12, 10:24 AM

Control Panel Errors - Unable to Enable 2FA and Google/GitHub

minor

Started: May 9, 11:15 AM

monitoring

Our Engineering team has implemented necessary changes to address the issue affecting the ability to enable Two-Factor Authentication (2FA) and Google/GitHub authentication through the Control Panel. Our team is currently monitoring the situation to ensure stability. We appreciate your patience and will provide an update once the issue is fully confirmed as resolved.

May 9, 1:51 PM

investigating

Our Engineering team is investigating an issue affecting the ability to enable Two-Factor Authentication (2FA) and Google/GitHub authentication through the Control Panel. During this time, users may encounter errors while enabling these authentication methods and could also experience issues accessing teams with secure sign-in enabled. We apologize for the inconvenience and will provide an update as soon as more information becomes available.

May 9, 11:15 AM

Let's Encrypt Outage Affecting Certificate Issuance and Managed Databases Operations

minor

Started: May 8, 8:46 PM

identified

Our Engineering team is aware of an upstream outage with Let's Encrypt (see https://letsencrypt.status.io/) which impacts the following services:
- Inability to create new Let's Encrypt certificates for Spaces, Load Balancers, and App Platform Custom Domains
- Stuck or delayed creates/forks/restores on Mongo, PG, and MySQL databases.
Please note that operations related to Managed Databases and App Platform Custom Domains will automatically retry and should complete successfully once the u...

May 8, 8:46 PM

Multiple Services in NYC2

minor

Started: May 8, 6:01 PM

monitoring

We are continuing to monitor for any further issues.

May 8, 7:53 PM

monitoring

Our Engineering team has identified the issue and implemented a fix to resolve the issues with multiple services, and is monitoring the situation. We will post an update as soon as the issue is fully resolved.

May 8, 7:21 PM

investigating

We are currently investigating an issue affecting multiple services in our NYC2 region. Our engineering team is aware of the situation and is working to identify the root cause and restore full connectivity as quickly as possible. Users with resources in the NYC2 region may experience issues with Droplet connectivity, API requests, or other services. We will provide additional updates as more information becomes available. We apologize for any inconvenience this may cause.

May 8, 6:01 PM

April 2026

Elevated 5xx “context canceled” errors impacting serverless inference

minor

Started: Apr 28, 1:45 PM

monitoring

Service for Serverless Inference has been restored. We’ve implemented tighter rate limits to help prevent recurrence and are closely monitoring system performance. Some users may still experience intermittent latency as we complete final stabilization efforts. Our team remains actively engaged to ensure full recovery. We appreciate your patience and will provide further updates as needed.

Apr 28, 7:00 PM

identified

We have identified an issue affecting our service and are currently working to implement a fix. Our team is actively investigating and taking the necessary steps to restore normal operations as quickly as possible. We appreciate your patience and will provide updates as soon as more information becomes available.

Apr 28, 3:59 PM

investigating

Serverless inference customers are experiencing elevated 5xx errors, including “context canceled” responses. This may result in intermittent request failures. Our team is actively investigating and will provide updates as more information becomes available.

Apr 28, 1:45 PM

Serverless Inference - Intermittent Rate Limiting Affecting Some Customers Using Anthropic Models

major

Started: Apr 27, 10:38 AM

monitoring

We identified the cause of intermittent HTTP 429 responses affecting some customers using Anthropic models on DigitalOcean Serverless Inference and applied a mitigation. Service has recovered, and we are monitoring stability.

Apr 27, 11:07 AM

investigating

We are investigating an issue affecting some customers using DigitalOcean Serverless Inference with Anthropic models. Over the last two hours, impacted customers may have experienced intermittent request failures, including HTTP 429 responses, on some Anthropic model requests. Our engineering team is actively investigating the issue. We apologize for the inconvenience and will share another update as soon as more information is available.

Apr 27, 10:38 AM

Intermittent errors impacting some Serverless Inference models in ATL1

minor

Started: Apr 23, 10:26 PM

investigating

As of 21:53 UTC, our Engineering team is investigating reports of increased internal errors for models Llama 3.3 70B, GPT OSS 120B, GPT OSS 20B, Qwen3 32B and Deepseek R1 70B hosted in the ATL1 region, impacting Serverless Inference. At this point, users with models hosted in ATL1 may experience intermittent errors when using Serverless Inference. We apologize for the inconvenience and will share an update once we have more information.

Apr 23, 10:26 PM

App Platform Deployments

minor

Started: Apr 23, 8:08 AM

monitoring

Our Engineering team has implemented a fix to address the issue affecting App Platform deployments and Kubernetes (DOKS) nodes. We are actively monitoring the situation to ensure overall stability. Users may already notice improvements while deploying apps and DOKS nodes. We appreciate your patience throughout the process and will provide a further update once the issue is fully confirmed to be resolved.

Apr 23, 9:52 AM

investigating

Our Engineering team is currently investigating reports of build failures on App Platform. During this time, some users may encounter errors while building their applications, which may result in failed deployments. In addition, we are observing an issue where Kubernetes (DOKS) nodes are being marked as unhealthy by load balancers, which may impact traffic routing for affected services. Our Engineering team is actively working to resolve these issues and will share an update as soon as more i...

Apr 23, 8:31 AM

investigating

Our Engineering team is currently investigating reports of build failures on App Platform. Users may experience errors when attempting to build their applications, resulting in failed deployments. Our Engineering team is working to fix the issue and will share an update once we have more information. We apologize for the inconvenience this issue may be causing and appreciate your patience as we work to resolve it.

Apr 23, 8:08 AM

Cloud UI for Managed Kubernetes

minor

Started: Apr 23, 7:01 AM

investigating

Our Engineering team is currently investigating an issue impacting the Managed Kubernetes UI across all regions. During this time, users with a Member role may experience the Kubernetes UI page not loading in the cloud console. As a workaround, the DigitalOcean API and doctl (CLI) continue to function normally, and you can use them to manage your Kubernetes resources in the meantime. We apologize for the inconvenience and will share more information as soon as it becomes available.

Apr 23, 7:52 AM

investigating

Our Engineering team is currently investigating an issue impacting the Create Managed Kubernetes UI across all regions. During this time, users with a Member role may experience the Kubernetes UI page not loading in the cloud console. As a workaround, the DigitalOcean API and doctl (CLI) continue to function normally, and you can use them to manage your Kubernetes resources in the meantime. We apologize for the inconvenience and will share more information as soon as it becomes available.

Apr 23, 7:14 AM

investigating

Our Engineering team is currently investigating an issue impacting Managed Kubernetes UI across all regions. During this time, some users may experience the Kubernetes UI page not loading in DigitalOcean Onboarding. We apologize for the inconvenience and will share more information as soon as it's available.

Apr 23, 7:01 AM

Cloud Control Panel and API

minor

Started: Apr 22, 11:06 AM

monitoring

Our Engineering team has implemented a fix for the issue affecting the DigitalOcean API and the Cloud Control Panel. We are now actively monitoring the system to ensure full stability and will provide a final update once the issue is completely resolved. We sincerely appreciate your patience and understanding as we work to restore normal service.

Apr 22, 11:25 AM

investigating

Our Engineering team is currently investigating an issue affecting the Cloud Control Panel and API. During this time, API requests to create, destroy, or trigger events on droplets may not succeed. Additionally, the Droplets page in the Cloud Control Panel may not load properly, and users could experience issues while reviewing the droplet listing. We are actively looking into the root cause and will provide updates as soon as more information becomes available. We apologize for the inconve...

Apr 22, 11:06 AM

DNS Resolution for .co TLD

minor

Started: Apr 17, 9:22 PM

monitoring

Our Engineering team has implemented a fix to resolve the issue that was affecting DNS resolution of the .co top-level domain (TLD). At this time, users who are using DigitalOcean DNS resolvers in their resources should no longer experience issues related to DNS resolution. Our Engineers are currently monitoring the situation. We will post an update as soon as the issue is fully resolved.

Apr 17, 10:53 PM

identified

Our Engineering team is aware of a widespread, external issue affecting .co top-level domain (TLD). While this incident originates outside of DigitalOcean's infrastructure, you may experience errors when querying a .co domain, regardless of the DNS resolver being used. Our Engineers are actively deploying temporary backend mitigations to help minimize the impact on our customers. We will continue to monitor the situation closely and post updates as more information becomes available.

Apr 17, 9:22 PM

Serverless Inference

minor

Started: Apr 15, 1:07 AM

monitoring

Our Engineering team has implemented a fix for the issue causing elevated error rates due to service instability. We are currently monitoring the situation to ensure stability and confirm that error rates, including HTTP 500 responses, have returned to normal levels. We will provide a further update once we confirm the issue is fully resolved.

Apr 15, 2:24 AM

investigating

Our Engineering team is investigating an issue in which service instability is causing elevated error rates and terminating open connections, resulting in some HTTP 500 responses. Some requests may fail while we work to resolve it. We apologize for the inconvenience and will share an update once we have more information.

Apr 15, 1:07 AM

Managed Database Resizes

minor

Started: Apr 14, 12:53 PM

monitoring

Our Engineering team has taken action to mitigate the issue with resize operations for Managed Databases and implemented a fix. We are monitoring the situation and will post an update as soon as we confirm that the issue is fully resolved.

Apr 14, 2:30 PM

investigating

Our engineering team is investigating an issue impacting resize operations for Managed Databases. During this time, users may experience errors when attempting to resize Managed Databases via the Cloud Control Panel and API in all regions. We apologize for the inconvenience and will share an update once we have more information.

Apr 14, 12:53 PM

Droplet Availability in All Regions

Started: Apr 10, 8:32 PM

monitoring

Our Engineering team has identified an issue with Droplet creates in all regions. A root cause has been found, a fix has been put in place, and we are currently monitoring the situation to ensure full resolution. Users should be able to create new Droplets at this time. We will continue to monitor and will post an update as soon as it is fully resolved. We apologize for the inconvenience.

Apr 10, 8:32 PM

Serverless Inference - High error rates for open source models (Qwen 3 32B)

minor

Started: Apr 7, 12:49 PM

identified

We are currently investigating reports of elevated latency affecting requests to this model when using Serverless Inference and Agents. Earlier observations indicated increased error rates for the open-source Qwen 3 32B model. The Ray dashboard also showed multiple workers in a pending state, suggesting capacity constraints. Our analysis determined that the model was experiencing higher-than-expected request volume without sufficient resources to scale accordingly. To address this, the node...

Apr 7, 12:55 PM

investigating

Serverless inference for alibaba-qwen3-32b (Qwen 3 32B) in tor1 is experiencing high error rates starting at 10:46 UTC.

Apr 7, 12:49 PM

Serverless Inference Issue

minor

Started: Apr 6, 12:28 PM

monitoring

A fix has been implemented and we are monitoring the results.

Apr 6, 3:15 PM

investigating

Our Engineering team is investigating an issue with Serverless inference. At this time, users may experience high error rates for open source models (llama 3.3 70b). We apologize for the inconvenience and will share an update once we have more information.

Apr 6, 12:28 PM

March 2026

App platform seeing delays in deployments across FRA1 region

minor

Started: Mar 20, 10:32 AM

monitoring

Our Engineering team has deployed a fix to resolve the issue impacting new App Platform deployments using Dedicated Egress IP in FRA1 region. We are actively monitoring the situation to ensure stability and will provide an update once the incident has been fully resolved. Thank you for your patience and we apologize for the inconvenience.

Mar 20, 11:14 AM

investigating

Our engineers are currently investigating an issue impacting new App Platform deployments using Dedicated Egress IP in the FRA1 region. During this time, some users may experience delays when creating new App Platform apps or deploying existing apps. Existing apps are not affected and should continue to function normally. We apologize for any inconvenience, and we'll share more information as it becomes available.

Mar 20, 10:32 AM

Gradient AI Platform agents and services Accessibility

major

Started: Mar 20, 8:50 AM

monitoring

A fix has been implemented and services have been restored. We are continuing to monitor the system to ensure stability. We will provide further updates if needed.

Mar 20, 2:04 PM

identified

We've identified the issue and are actively working to restore the affected services. We're making steady progress and closely monitoring the situation. Further updates will be shared as they become available.

Mar 20, 11:05 AM

identified

We’ve identified the issue and are currently working on restoring the services. We’ll continue to provide updates as progress is made.

Mar 20, 9:51 AM

investigating

We are currently investigating an issue affecting the accessibility of agents and services on the Gradient AI Platform. Users may experience failures or unresponsiveness when attempting to use these features. Our engineering team is actively working to identify the root cause and restore full functionality. We apologize for the inconvenience and will share an update once we have more information.

Mar 20, 8:50 AM

Gradient AI model availability

minor

Started: Mar 17, 3:00 PM

investigating

Our Engineering team is investigating reports of Gradient AI model availability issues impacting multiple models. Users may experience issues with model availability, including Llama3.1-8b and Qwen3-32b, as well as embedding models such as GTE Large (v1.5), All-MiniLM-L6-v2, Multi-QA-mpnet-base-dot-v1, and Qwen3 Embedding 0.6B. Additionally, Guardrails are not available, affecting associated agents, and users attempting to run inference on the Llama3.3-70b model will see degraded performan...

Mar 17, 3:00 PM

Degraded performance with BYOK Anthropic models

minor

Started: Mar 15, 2:55 AM

investigating

Our Engineering team is investigating an issue related to all Gradient AI agents and serverless inference that require BYOK Anthropic models. Impacted users may experience degraded performance. We will provide an update as soon as possible.

Mar 15, 2:55 AM

Delay in App Platform Deployments

minor

Started: Mar 13, 10:01 PM

monitoring

After working with our upstream provider, our Engineering team has implemented a fix to resolve the issue that was causing delays in the deployment of new apps, and they are currently monitoring the situation. During this time, users should no longer experience issues with creating new apps and all the stalled creation events should provision completely. We will post an update as soon as the issue is fully resolved.

Mar 13, 11:39 PM

identified

Our Engineering team is starting to see delays once again with new App Platform deployments. During this time, users may still experience delays with deploying new apps. We're working with our upstream provider to resolve the issue. We again apologize for the inconvenience. We will post further updates once we have more information.

Mar 13, 10:01 PM

monitoring

Starting at 20:40 UTC, users may have seen delays with deploying new apps on App Platform. At this time, our Engineering team is seeing signs of recovery, and users should be able to deploy new apps without issue. We're currently monitoring the situation to ensure full recovery. We apologize for the inconvenience. We'll post an update once the issue has been confirmed to be resolved.

Mar 13, 9:30 PM

Newly Created Managed Kubernetes Nodes

minor

Started: Mar 13, 11:26 AM

monitoring

Our Engineering team has implemented a fix to address the issue causing DNS timeouts for newly provisioned Managed Kubernetes nodes. Further investigation has confirmed that this issue primarily affected customers utilizing a NAT Gateway within their VPC and running a VPC-native cluster. We are actively monitoring the situation to ensure overall stability. We appreciate your patience and will provide a further update once the issue is fully confirmed to be resolved.

Mar 13, 1:55 PM

identified

Our Engineering team is investigating an issue impacting newly provisioned Managed Kubernetes nodes. During this time, only customers who run a NAT Gateway in their VPC and a VPC-native cluster are affected and may experience DNS timeouts. We apologize for the inconvenience and will share an update once we have more information.

Mar 13, 12:32 PM

investigating

Our Engineering team is investigating an issue impacting newly provisioned Managed Kubernetes nodes. During this time, new nodes may experience DNS timeouts, which could temporarily affect cluster services. We apologize for the inconvenience and will share an update once we have more information.

Mar 13, 11:26 AM
