
Observe Outage History

Uptime record, past incidents, and downtime history for Observe.

76.9% uptime over 91 days
Uptime thresholds met: 99.9% ✗, 99.5% ✗, 99% ✗, 95% ✗

90-Day Trend

[Chart: uptime trend from Feb 25 to May 25]

Monthly Uptime

Month            Uptime    Days Tracked    Days with Issues
May 2026         28%       25              18
April 2026       100%      30              0
March 2026       90.3%     31              3
February 2026    100%      5               0

Uptime is calculated from daily worst-status snapshots. A day with any non-operational status counts as a day with issues.
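The calculation described above can be sketched in a few lines. This is a minimal illustration, not the page's actual implementation: it assumes uptime is simply the share of tracked days without issues, an interpretation that happens to reproduce the published figures. The names `MonthSummary`, `overall_uptime_pct`, and `targets_met` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthSummary:
    month: str
    days_tracked: int
    days_with_issues: int

    @property
    def uptime_pct(self) -> float:
        # Assumption: uptime = tracked days with no issues / tracked days.
        clean_days = self.days_tracked - self.days_with_issues
        return 100.0 * clean_days / self.days_tracked


def overall_uptime_pct(months: list[MonthSummary]) -> float:
    # Pool the day counts across months rather than averaging percentages,
    # so short months (e.g. 5 tracked days) are weighted correctly.
    tracked = sum(m.days_tracked for m in months)
    issues = sum(m.days_with_issues for m in months)
    return 100.0 * (tracked - issues) / tracked


# Figures copied from the Monthly Uptime table.
months = [
    MonthSummary("May 2026", 25, 18),
    MonthSummary("April 2026", 30, 0),
    MonthSummary("March 2026", 31, 3),
    MonthSummary("February 2026", 5, 0),
]

overall = overall_uptime_pct(months)

# Compare the overall figure against the thresholds shown on the page.
targets_met = {t: overall >= t for t in (99.9, 99.5, 99.0, 95.0)}

for m in months:
    print(f"{m.month}: {m.uptime_pct:.1f}%")
print(f"Overall: {overall:.1f}% over {sum(m.days_tracked for m in months)} days")
print(targets_met)
```

Under this assumption, the 21 days with issues out of 91 tracked days give the 76.9% headline figure, and May's 18 issue days out of 25 give 28%. Because a day with any non-operational status counts in full, a short incident lowers the figure as much as a day-long outage would.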

Daily Status (Last 91 Days)

[Chart: daily status from Feb 24 to today. Legend: Operational, Degraded, Partial Outage, Major Outage, Maintenance, No Data]

Incident History

May 2026
2026-05-13 - Monitor Evaluation Delays - US West (Oregon)
major

Started: May 14, 3:30 AM

monitoring
A fix has been implemented and we are monitoring the results.
May 14, 6:12 AM
investigating
We are continuing to investigate the issue.
May 14, 5:04 AM
investigating
We are currently investigating an issue causing monitor evaluation delays in US West (Oregon). Monitor alerts may not fire normally. No data loss has occurred.
May 14, 3:30 AM
2026-05-12 Monitor emails not sending

Started: May 13, 1:40 AM

monitoring
A fix has been implemented and we are monitoring the results.
May 13, 3:45 PM
identified
The issue with outbound email delivery has recurred. Our team has re-engaged and is actively working on a resolution.
May 13, 9:36 AM
monitoring
A fix has been implemented and we are monitoring to confirm full recovery. Email delivery has been restored for new notifications going forward. Email alerts and scheduled reports triggered during this incident will not be delivered. We will provide a final update once we have confirmed the service is fully restored.
May 13, 5:33 AM
identified
We have identified the root cause of monitor email delivery failures affecting all Observe production regions and are working on a resolution.
May 13, 3:12 AM
investigating
We are investigating an issue affecting outbound email delivery from the Observe platform in all production regions. Monitor alert notifications, scheduled report deliveries, and other product emails are not being sent. Our team is actively investigating the root cause and working to restore normal performance as quickly as possible. We'll provide updates as we learn more.
May 13, 1:40 AM
2026-05-12 — Security Notification — Email Incident (All Regions)
major

Started: May 12, 5:27 PM

monitoring
Our investigation has concluded. The root cause was identified as a free trial tenant abusing a test email endpoint. The offending tenant has been disabled and we are deploying additional controls to prevent this type of misuse in the future. No customer data was compromised and the impact was limited to the unauthorized emails sent from this single vector. We apologize for the inconvenience and appreciate your patience.
May 12, 10:20 PM
monitoring
A fix has been implemented and we are monitoring the results.
May 12, 6:08 PM
investigating
We recently identified a security issue involving an unauthorized email sent from our domain. We are aware of this matter and are treating it with the highest priority. Our investigation is ongoing, and at this time we believe the scope of this incident was limited to a single vector, which has since been disabled. We have no indication of broader impact. We will provide an update as our investigation progresses. In the meantime, if you received any suspicious emails purporting to be from us,...
May 12, 5:27 PM
2026-05-08 AI SRE Query Failures (US-WEST)
major

Started: May 8, 2:46 PM

monitoring
A fix has been implemented and we are monitoring the results.
May 8, 3:20 PM
investigating
We are continuing to investigate this issue.
May 8, 3:07 PM
investigating
Some customers may experience failed or unresponsive AI SRE queries. Affected users may see errors or timeouts when submitting queries. The team is actively investigating.
May 8, 2:46 PM
March 2026
2026-03-25 High Rate of Ingest Errors (US West)

Started: Mar 26, 1:25 AM

monitoring
A fix has been implemented and we are monitoring the results.
Mar 26, 1:28 AM
investigating
We’ve identified an issue causing elevated errors for ingest endpoints in the US West region. As a result, data ingest will be affected. Our team is actively investigating the root cause and working to restore normal performance as quickly as possible. We’ll provide updates as we learn more. Thank you for your patience!
Mar 26, 1:25 AM
2026-03-20 Intermittent ingest errors and query instability
critical

Started: Mar 21, 6:21 AM

monitoring
A fix has been applied and the system is recovering.
Mar 21, 6:33 AM
identified
The issue has been identified by Snowflake engineering teams and a mitigation is being applied.
Mar 21, 6:23 AM
investigating
We’ve identified an issue causing data ingest to have higher than normal errors and queries to be unreliable in the following regions: US-WEST-2. As a result, some users may experience higher than normal ingest errors and slow queries. We do not anticipate any data loss. This incident is related to an ongoing Snowflake incident - https://status.snowflake.com/. Our team is actively investigating the root cause and working to restore normal performance as quickly as possible. We’ll provi...
Mar 21, 6:21 AM
Unable to access Observe Tenant

Started: Mar 19, 12:20 AM

monitoring
We are aware that some users may be unable to log into Observe. The issue has been mitigated and we're monitoring.
Mar 19, 12:20 AM
Ingest performance degradation in Prod EU cluster
minor

Started: Mar 16, 6:55 PM

monitoring
A large increase in tracing data caused ingestion lag on the prod-eu-1 cluster, affecting multiple customers. Encoder replicas were scaled up and subsequently rolled to resolve unassigned partition workers. Memory pressure was identified as a contributing factor. Ingestion lag peaked at up to 45 minutes for some customers and has since recovered to near-normal levels. The team is continuing to monitor.
Mar 16, 6:55 PM
Service Degradation
minor

Started: Mar 16, 6:44 PM

investigating
A large increase in tracing data caused ingestion lag on the prod-eu-1 cluster. Encoder replicas were scaled up, but some partition workers were not reading data due to unassigned partitions. Encoders were rolled, and partition lag is now decreasing. The team is continuing to monitor recovery.
Mar 16, 6:44 PM
Performance Degradation in GCP
minor

Started: Mar 12, 2:51 PM

monitoring
The issue with elevated warehouse resume times in GCP has been mitigated. Query queueing times have returned to normal levels and resume times have stabilized. A case was raised with the infrastructure provider and warehouse resource management was adjusted to reduce customer impact during the incident. We are continuing to monitor to confirm the issue is fully resolved.
Mar 12, 5:15 PM
monitoring
We are continuing to monitor for any further issues.
Mar 12, 2:52 PM
monitoring
Our internal team identified an issue with warehouse resume times in Prod GCP, causing p99 queueing times of up to 2 minutes (including user queries). The issue is believed to be addressed and we are monitoring on our side.
Mar 12, 2:51 PM