Uptime & Synthetic Checks

Multi-region uptime monitoring with browser-step transactions and AI triage in Slack.

Drop a Go agent on any host for quorum-based alerts. Run real Chrome transactions for the flows that matter. Type `@servicealert investigate` when something breaks and get ranked root-cause hypotheses with a code pointer back to your repo.


9 monitor types

30 s minimum check interval

Alert channels

$249 flat / month

Monitor types

Nine probe shapes, one dashboard

Cover every dependency from a single workspace. No per-monitor pricing, no per-channel surcharges.

HTTP / HTTPS

Status codes, response time, keyword match, custom headers, mTLS + OAuth2 client-credentials.

Ping (ICMP)

Reachability + RTT for hosts that don’t expose HTTP. Hairpin-NAT guard prevents agent self-checks.

TCP

Port-level connectivity for databases, mail relays, SSH, anything plain-TCP that needs a heartbeat.

DNS

A / AAAA / MX / NS / TXT / CNAME records with multi-resolver propagation diff and DNSSEC validation.
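
One way to picture a multi-resolver propagation diff: ask several resolvers for the same record and flag any resolver whose answer set differs from the most common one. The sketch below works on answers that have already been collected; the function name `propagation_diff`, the resolver labels, and the sample addresses are illustrative assumptions, not ServiceAlert's implementation.

```python
from collections import Counter


def propagation_diff(answers: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return the resolvers whose answer set differs from the most common one.

    `answers` maps a resolver label to the set of record values it returned.
    The result maps each disagreeing resolver to its own answer set.
    """
    if not answers:
        return {}
    # Freeze each set so it can be counted; the most common set is the consensus.
    counts = Counter(frozenset(values) for values in answers.values())
    consensus, _ = counts.most_common(1)[0]
    return {
        resolver: set(values)
        for resolver, values in answers.items()
        if frozenset(values) != consensus
    }


# Hypothetical A-record answers gathered from three resolvers.
observed = {
    "1.1.1.1": {"203.0.113.10"},
    "8.8.8.8": {"203.0.113.10"},
    "9.9.9.9": {"198.51.100.7"},  # assumed to still serve the old record
}
print(propagation_diff(observed))
```

An empty result means every resolver agrees, which matches the "resolver diff: 0" shorthand used in the triage reply further down the page.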

SSL / TLS

Expiry windows, chain validation, cipher posture, OCSP/CRL revocation. Renewal alerts at 30/14/7/3 days.

Domain expiry

WHOIS-backed renewal warnings 90 / 60 / 30 / 14 / 7 days out, before the registrar drops your domain.
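
The renewal windows for certificates and domains follow the same pattern: compare the days remaining against a list of thresholds. A minimal sketch, assuming the thresholds quoted above and a hypothetical helper named `crossed_threshold`; a real alerter would also remember which thresholds it has already notified on, which is omitted here.

```python
from datetime import date
from typing import Optional

TLS_THRESHOLDS = (30, 14, 7, 3)          # days, from the SSL / TLS monitor
DOMAIN_THRESHOLDS = (90, 60, 30, 14, 7)  # days, from the domain expiry monitor


def crossed_threshold(
    expires_on: date, today: date, thresholds: tuple[int, ...]
) -> Optional[int]:
    """Return the tightest threshold (in days) that has been reached, or None."""
    days_left = (expires_on - today).days
    reached = [t for t in thresholds if days_left <= t]
    return min(reached) if reached else None


today = date(2024, 1, 1)
# A certificate with 10 days left has passed the 30- and 14-day marks.
print(crossed_threshold(date(2024, 1, 11), today, TLS_THRESHOLDS))
# A domain with exactly 60 days left has just reached the 60-day mark.
print(crossed_threshold(date(2024, 3, 1), today, DOMAIN_THRESHOLDS))
```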

Heartbeat / cron

Passive check — your job pings us; we alert if a beat is missed. Includes a unique signed token per monitor.
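
A passive heartbeat check needs two things: a way to tell that a ping really belongs to a given monitor, and a rule for deciding that a beat is late. The sketch below assumes an HMAC-SHA256 token and a fixed grace period; the page only says the token is "signed", so the scheme, the function names, and the grace value are all assumptions for illustration.

```python
import hashlib
import hmac
from datetime import datetime, timedelta


def sign_monitor_id(monitor_id: str, secret: bytes) -> str:
    """Derive a per-monitor token from a server-side secret (assumed scheme)."""
    return hmac.new(secret, monitor_id.encode(), hashlib.sha256).hexdigest()


def verify_token(monitor_id: str, token: str, secret: bytes) -> bool:
    """Check a presented token in constant time."""
    return hmac.compare_digest(sign_monitor_id(monitor_id, secret), token)


def beat_missed(
    last_beat: datetime, now: datetime, interval: timedelta, grace: timedelta
) -> bool:
    """A beat counts as missed once interval + grace has elapsed without a ping."""
    return now - last_beat > interval + grace


secret = b"example-secret"          # hypothetical server-side key
token = sign_monitor_id("nightly-backup", secret)
print(verify_token("nightly-backup", token, secret))
```

The grace period exists so that a cron job that runs a few seconds long does not trigger an alert.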

JSON Query

Fetch any JSON endpoint and assert on a JSONPath result, for example `$.healthy == true`, with no custom script.
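
To make the idea concrete, here is a minimal sketch of a JSON query assertion. It supports only a small JSONPath subset (`$`, `.key`, and `[index]`), and the `json_query` helper and the sample payload are hypothetical; a production check would use a full JSONPath implementation.

```python
import json
import re

# Matches either ".key" or "[index]" at the current position in the path.
_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def json_query(document, path: str):
    """Resolve a small JSONPath subset: `$`, `.key`, and `[index]`."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    value = document
    pos = 1
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {pos}")
        key, index = match.groups()
        value = value[key] if key is not None else value[int(index)]
        pos = match.end()
    return value


# Hypothetical health endpoint response body.
body = json.loads('{"healthy": true, "deps": [{"name": "db", "ok": true}]}')
check_passed = json_query(body, "$.healthy") is True
print(check_passed)
```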

Browser-step (Playwright)

Multi-step Chrome sessions. Login → cart → checkout flows with screenshots at every step.

Multi-region

Probes from anywhere — including your own infrastructure

ServiceAlert ships its own probe network and lets you add yours. A single Go binary registers with your account and joins the monitor pool as another check-running region.

Quorum-based alerting cuts single-region flakes

Per-monitor quorum: require any, majority, or all configured regions to agree before paging. A carrier blip in ap-south-1 no longer wakes you up at 3am when every other region sees the target as healthy.

Drop the agent on any host — AWS region, on-prem datacenter, customer VPC, branch office
Region pills on every monitor show last-check latency from each source
systemd unit with proper Linux capabilities — ICMP and TCP without root
Hairpin-NAT guard skips checks where the target resolves to the agent itself
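
The hairpin-NAT guard in the last bullet boils down to one comparison: does any address the target resolves to belong to the agent? A rough sketch under that assumption; `is_self_target` is a hypothetical name, and a real agent would obtain `resolved` from a DNS lookup and `agent_addresses` from its own interfaces and public IP.

```python
import ipaddress


def is_self_target(resolved: list[str], agent_addresses: list[str]) -> bool:
    """True if any resolved target address is the agent itself or loopback."""
    local = {ipaddress.ip_address(a) for a in agent_addresses}
    for raw in resolved:
        addr = ipaddress.ip_address(raw)
        if addr.is_loopback or addr in local:
            return True
    return False


# Hypothetical agent with one private and one public address.
agent = ["10.0.0.5", "203.0.113.20"]
print(is_self_target(["203.0.113.20"], agent))  # target is the agent's public IP
print(is_self_target(["198.51.100.9"], agent))  # target is some other host
```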

api.acme.com · majority quorum

us-east (Azure) · 112 ms · 2s ago

eu-west (yours) · 38 ms · 3s ago

ap-south (yours) · timeout · 4s ago

sa-east (Azure) · 241 ms · 2s ago
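
The quorum rule behind the api.acme.com panel above can be sketched in a few lines. The three mode names come from the text (any, majority, all); the function name `should_page` and the use of `None` to represent a timeout are assumptions made for this example.

```python
def should_page(failures: int, total: int, mode: str) -> bool:
    """Decide whether enough regions report a failure to page someone."""
    if total <= 0:
        return False
    if mode == "any":
        return failures >= 1
    if mode == "majority":
        return failures > total / 2
    if mode == "all":
        return failures == total
    raise ValueError(f"unknown quorum mode: {mode}")


# Latest results per region, in ms; None stands for a timeout.
panel = {"us-east": 112, "eu-west": 38, "ap-south": None, "sa-east": 241}
failures = sum(1 for latency in panel.values() if latency is None)

print(should_page(failures, len(panel), "majority"))  # False: 1 of 4 failing
print(should_page(failures, len(panel), "any"))       # True: a single failure is enough
```

With a majority quorum, the single ap-south timeout does not page anyone, because the other three regions still see the target as healthy.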

Browser-step transactions

Real Chrome for the flows that matter

Lightweight HTTP transactions cover most journeys, but some flows need real JavaScript, real cookies, and real screenshots. ServiceAlert runs both, and falls back automatically.

Login → cart → checkout, recorded

The browser worker is a long-running daemon driving system Chrome via a Unix socket, so memory stays bounded and step latencies are consistent across runs. Each step can assert on text, status code, or response body, and screenshots are attached to every failure.

Multi-step user flows that plain HTTP checks can’t fake
Per-step screenshots attached to failures — 3am pages come with evidence
Authenticated cookies, custom headers, OAuth flows that touch real backends
Real-browser checks with screenshots and per-step assertions

Acme checkout flow · passed

✓ Open /login · 412 ms

✓ Fill email + password · 87 ms

✓ Click “Sign in” · 1.8 s

✓ Add to cart · 340 ms

✓ Assert “1 item in cart” · 12 ms

✓ Click “Checkout” · 920 ms

✓ Assert URL == /success · 8 ms
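
A transaction like the one above is an ordered list of steps, each timed and each able to fail the run. The sketch below shows that control flow only: the steps are plain Python callables standing in for browser actions, and `run_transaction` and `StepResult` are hypothetical names. In a real worker each callable would drive Chrome and a failure would trigger a screenshot.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    name: str
    passed: bool
    elapsed_ms: float
    error: str = ""


def run_transaction(steps: list[tuple[str, Callable[[], None]]]) -> list[StepResult]:
    """Run steps in order, timing each one and stopping at the first failure."""
    results = []
    for name, action in steps:
        started = time.perf_counter()
        try:
            action()
            passed, error = True, ""
        except Exception as exc:  # any exception fails the step
            passed, error = False, str(exc)
        elapsed_ms = (time.perf_counter() - started) * 1000
        results.append(StepResult(name, passed, elapsed_ms, error))
        if not passed:
            break  # later steps depend on this one, so skip them
    return results


# Stand-in actions; a real check would click and type in a browser.
cart: list[str] = []


def add_to_cart() -> None:
    cart.append("sku-1")


def assert_one_item() -> None:
    if len(cart) != 1:
        raise AssertionError("expected 1 item in cart")


def failing_checkout() -> None:
    raise RuntimeError("payment gateway timeout")


results = run_transaction([
    ("Add to cart", add_to_cart),
    ("Assert 1 item in cart", assert_one_item),
    ("Click Checkout", failing_checkout),
    ("Assert URL == /success", lambda: None),  # never reached
])
for r in results:
    print("✓" if r.passed else "✗", r.name, r.error)
```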

AI triage

Slack-native investigation in seconds

When a monitor goes down, your team has questions. ServiceAlert answers them.

Type `@servicealert investigate`

The agent posts a threaded reply with ranked root-cause hypotheses, the actual log lines that triggered the call, the most-recent deploy that touched the affected service, and a one-click code-pointer link to the suspect file in your GitHub repo.

Hypotheses ranked by confidence, calibrated weekly against past outcomes
Reactions act as actions: thumbs-up promotes, X silences a flapper, siren declares an incident
Auto-silence detector kills flapping monitors before they wake your team up
Forensic capture on every incident: screenshot, MTR trace, response body
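
A flapping monitor is one that changes state too often in a short period. One simple way to detect that is to count up/down transitions inside a sliding window, as sketched below. The 10-minute window, the threshold of 4 transitions, and the name `is_flapping` are assumptions for illustration; the page does not state how the auto-silence detector is tuned.

```python
from datetime import datetime, timedelta


def is_flapping(
    transitions: list[datetime],
    now: datetime,
    window: timedelta = timedelta(minutes=10),
    max_transitions: int = 4,
) -> bool:
    """True if the monitor changed state at least `max_transitions` times in the window."""
    recent = [t for t in transitions if now - window <= t <= now]
    return len(recent) >= max_transitions


now = datetime(2024, 1, 1, 3, 0)
# Four up/down state changes in the last ten minutes.
noisy = [now - timedelta(minutes=m) for m in (9, 7, 5, 2)]
# One old change and one recent change.
quiet = [now - timedelta(minutes=60), now - timedelta(minutes=5)]

print(is_flapping(noisy, now))
print(is_flapping(quiet, now))
```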

@you  @servicealert investigate

@servicealert  Investigating “Checkout API” (down 4m)

  Hypotheses
   1. High  Stripe webhook timeout (was healthy 6m ago)
   2. Med   Recent deploy at 14:02 touched billing/
   3. Low   DNS propagation lag (resolver diff: 0)

  Logs (3)  ·  Deploys (1)  ·  Code
  → api/billing/stripe.php:142 in commit 3a4f9b1

  React 👍 to promote, ❌ to silence, 🚨 to declare

Ship better uptime monitoring this afternoon.

Free tier covers 10 monitors. Business at $249/mo flat (unlimited users) replaces Pingdom + PagerDuty + the Slack-AI bolt-on you were going to buy.
