Observe Outage History
Uptime record, past incidents, and downtime history for Observe.
90-Day Trend
Monthly Uptime
| Month | Uptime | Days Tracked | Days with Issues |
|---|---|---|---|
| May 2026 | 28% | 25 | 18 |
| April 2026 | 100% | 30 | 0 |
| March 2026 | 90.3% | 31 | 3 |
| February 2026 | 100% | 5 | 0 |
Uptime is calculated from daily worst-status snapshots. A day with any non-operational status counts as a day with issues.
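The calculation described above can be sketched as follows. This is a minimal illustration, not the page's actual implementation: the function name `monthly_uptime`, the status labels, and the choice to exclude "no data" days from the tracked count are all assumptions. It reproduces the table's figures when given matching day counts.

```python
# Hypothetical status labels; the real page may use different identifiers.
OPERATIONAL = "operational"
NO_DATA = "no_data"  # assumption: days without data are not counted as tracked


def monthly_uptime(daily_worst_status):
    """Return (uptime_percent, days_tracked, days_with_issues) for one month.

    daily_worst_status holds one string per day: the worst status seen that day.
    """
    tracked = [s for s in daily_worst_status if s != NO_DATA]
    days_tracked = len(tracked)
    # Any tracked day whose worst status is not operational counts as a day with issues.
    days_with_issues = sum(1 for s in tracked if s != OPERATIONAL)
    if days_tracked == 0:
        return None, 0, 0
    uptime = 100.0 * (days_tracked - days_with_issues) / days_tracked
    return round(uptime, 1), days_tracked, days_with_issues


# Day counts taken from the table: 25 tracked days in May, 18 of them with issues.
may = [OPERATIONAL] * 7 + ["degraded"] * 18
print(monthly_uptime(may))
```

Because each day contributes only its worst status, a brief incident and a day-long outage weigh the same in this metric, which is why a month with many short incidents can show a low percentage.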
Daily Status (Last 91 Days)
Chart: daily status from Feb 24 to today. Legend: Operational, Degraded, Partial Outage, Major Outage, Maintenance, No Data.
Incident History
May 2026
2026-05-13 - Monitor Evaluation Delays - US West (Oregon)
Started: May 14, 3:30 AM
monitoring
A fix has been implemented and we are monitoring the results.
May 14, 6:12 AM
investigating
We are continuing to investigate the issue.
May 14, 5:04 AM
investigating
We are currently investigating an issue causing monitor evaluation delays in US West (Oregon). Monitor alerts may not fire normally. No data loss has occurred.
May 14, 3:30 AM
2026-05-12 Monitor emails not sending
Started: May 13, 1:40 AM
monitoring
A fix has been implemented and we are monitoring the results.
May 13, 3:45 PM
identified
The issue with outbound email delivery has recurred. Our team has re-engaged and is actively working on a resolution.
May 13, 9:36 AM
monitoring
A fix has been implemented and we are monitoring to confirm full recovery.
Email delivery has been restored for new notifications going forward. Email alerts and scheduled reports triggered during this incident will not be delivered.
We will provide a final update once we have confirmed the service is fully restored.
May 13, 5:33 AM
identified
We have identified the root cause of monitor email delivery failures affecting all Observe production regions and are working on a resolution.
May 13, 3:12 AM
investigating
We are investigating an issue affecting outbound email delivery from the Observe platform in all production regions. Monitor alert notifications, scheduled report deliveries, and other product emails are not being sent. Our team is actively investigating the root cause and working to restore normal performance as quickly as possible. We'll provide updates as we learn more.
May 13, 1:40 AM
2026-05-12 — Security Notification — Email Incident (All Regions)
Started: May 12, 5:27 PM
monitoring
Our investigation has concluded. The root cause was identified as a free trial tenant abusing a test email endpoint.
The offending tenant has been disabled and we are deploying additional controls to prevent this type of misuse in the future. No customer data was compromised and the impact was limited to the unauthorized emails sent from this single vector.
We apologize for the inconvenience and appreciate your patience.
May 12, 10:20 PM
monitoring
A fix has been implemented and we are monitoring the results.
May 12, 6:08 PM
investigating
We recently identified a security issue involving an unauthorized email sent from our domain. We are aware of this matter and are treating it with the highest priority. Our investigation is ongoing, and at this time we believe the scope of this incident was limited to a single vector, which has since been disabled. We have no indication of broader impact. We will provide an update as our investigation progresses. In the meantime, if you received any suspicious emails purporting to be from us,...
May 12, 5:27 PM
2026-05-08 AI SRE Query Failures (US-WEST)
Started: May 8, 2:46 PM
monitoring
A fix has been implemented and we are monitoring the results.
May 8, 3:20 PM
investigating
We are continuing to investigate this issue.
May 8, 3:07 PM
investigating
Some customers may experience failed or unresponsive AI SRE queries. Affected users may see errors or timeouts when submitting queries. The team is actively investigating.
May 8, 2:46 PM
March 2026
2026-03-25 High Rate of Ingest Errors (US West)
Started: Mar 26, 1:25 AM
monitoring
A fix has been implemented and we are monitoring the results.
Mar 26, 1:28 AM
investigating
We’ve identified an issue causing elevated errors for ingest endpoints in the US West region. As a result, data ingest will be affected. Our team is actively investigating the root cause and working to restore normal performance as quickly as possible. We’ll provide updates as we learn more. Thank you for your patience!
Mar 26, 1:25 AM
2026-03-20 Intermittent ingest errors and query instability
Started: Mar 21, 6:21 AM
monitoring
A fix has been applied and the system is recovering.
Mar 21, 6:33 AM
identified
The issue has been identified by Snowflake engineering teams and a mitigation is being applied.
Mar 21, 6:23 AM
investigating
We’ve identified an issue causing data ingest to have higher than normal errors and queries to be unreliable in the following regions: US-WEST-2. As a result, some users may experience higher than normal ingest errors and slow queries. We do not anticipate any data loss. This incident is related to an ongoing Snowflake incident - https://status.snowflake.com/. Our team is actively investigating the root cause and working to restore normal performance as quickly as possible. We’ll provi...
Mar 21, 6:21 AM
Unable to access Observe Tenant
Started: Mar 19, 12:20 AM
monitoring
We are aware that some users may be unable to log into Observe. The issue has been mitigated and we're monitoring.
Mar 19, 12:20 AM
Ingest performance degradation in Prod EU cluster
Started: Mar 16, 6:55 PM
monitoring
A large increase in tracing data caused ingestion lag on the prod-eu-1 cluster, affecting multiple customers. Encoder replicas were scaled up and subsequently rolled to resolve unassigned partition workers. Memory pressure was identified as a contributing factor. Ingestion lag peaked at up to 45 minutes for some customers and has since recovered to near-normal levels. The team is continuing to monitor.
Mar 16, 6:55 PM
Service Degradation
Started: Mar 16, 6:44 PM
investigating
A large increase in tracing data caused ingestion lag on the prod-eu-1 cluster. Encoder replicas were scaled up, but some partition workers were not reading data due to unassigned partitions. Encoders were rolled, and partition lag is now decreasing. The team is continuing to monitor recovery.
Mar 16, 6:44 PM
Performance Degradation in GCP
Started: Mar 12, 2:51 PM
monitoring
The issue with elevated warehouse resume times in GCP has been mitigated. Query queueing times have returned to normal levels and resume times have stabilized. A case was raised with the infrastructure provider and warehouse resource management was adjusted to reduce customer impact during the incident. We are continuing to monitor to confirm the issue is fully resolved.
Mar 12, 5:15 PM
monitoring
We are continuing to monitor for any further issues.
Mar 12, 2:52 PM
monitoring
Our internal team identified an issue with warehouse resume times in Prod GCP, causing p99 queueing times of up to 2m (including user queries). The issue is believed to be addressed and we are monitoring on our side.
Mar 12, 2:51 PM