CodeLint.Dev Dev Tools

Webhooks Explained: The Complete Guide to Security, Retries, and Debugging

Webhooks power the invisible plumbing of the modern web: Stripe telling your server a payment succeeded, GitHub triggering your CI on every push, Shopify announcing a new order. The concept is trivially simple — an HTTP POST when something happens — yet production webhook handlers are a minefield of subtle bugs: unverified payloads, duplicate deliveries charging customers twice, out-of-order events corrupting state, and 30-second handlers silently dropping events. This guide covers how webhooks work, how to verify signatures properly, and the patterns (idempotency, fast-ack, replay protection) that separate toy handlers from production ones.

Webhooks vs Polling: Why Push Won

There are two ways to learn that something happened in another system. Polling: you ask repeatedly — "any new orders?" every 30 seconds, all day, mostly hearing "no". Webhooks: you register a URL once, and the other system POSTs to it the moment an event occurs.

The trade-off is stark. Polling every 30 seconds means up to 2,880 requests per day per resource — nearly all wasted — with up to 30 seconds of latency on every event. A webhook is one request per actual event, delivered within seconds. For any event-driven integration (payments, CI, messaging, order fulfillment), push wins on latency, cost, and rate-limit budget simultaneously.

A typical webhook delivery looks like this:

POST /webhooks/stripe HTTP/1.1
Host: api.yourapp.com
Content-Type: application/json
Stripe-Signature: t=1720080000,v1=5257a869e7ecebeda32affa62cdca3fa51cad7e77a0e56ff536d0ce8e108d8bd

{
  "id": "evt_1PXk2j2eZvKYlo2C",
  "type": "payment_intent.succeeded",
  "data": { "object": { "id": "pi_3PXk...", "amount": 4999, "currency": "usd" } }
}

But webhooks flip the client/server relationship, and that flip creates every problem in the rest of this guide: your endpoint is now a public URL that anyone on the internet can POST to, receiving events you did not request, possibly duplicated, possibly out of order, from a sender who will give up on you if you respond too slowly.

Security Rule #1: Verify Signatures — Correctly

An unverified webhook endpoint is an open door: anyone who discovers the URL can forge a "payment succeeded" event and get free product. Every serious provider signs its deliveries, almost universally with HMAC-SHA256: the provider computes a keyed hash of the raw request body using a shared secret and sends it in a header (Stripe-Signature, X-Hub-Signature-256 for GitHub, X-Shopify-Hmac-Sha256…). Your handler recomputes the HMAC and compares.

A correct Node.js verification (GitHub-style):

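import crypto from 'node:crypto';

// rawBody: the exact request bytes (Buffer or string), captured before JSON parsing.
// signatureHeader: value of X-Hub-Signature-256, e.g. "sha256=<hex digest>".
function verifySignature(rawBody, signatureHeader, secret) {
  if (typeof signatureHeader !== 'string') return false; // header missing
  const expected = Buffer.from(
    'sha256=' + crypto.createHmac('sha256', secret).update(rawBody).digest('hex')
  );
  const received = Buffer.from(signatureHeader);
  // timingSafeEqual throws on unequal byte lengths, so compare lengths first;
  // never compare the signatures themselves with ===
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}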
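
Verification only works if the handler still holds the exact bytes the provider signed. The sketch below assumes a plain Node.js readable stream as the request (http.IncomingMessage is one); the helper names readRawBody, hmacHex, and macsEqual are illustrative and not part of any framework. It collects the raw body before any parsing and shows that the same JSON, once re-serialized, no longer matches the signature.

```javascript
const crypto = require('node:crypto');
const { Readable } = require('node:stream');

// Collect the request body as raw bytes, before any JSON parsing.
// Works on any readable stream, including http.IncomingMessage.
async function readRawBody(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Hex-encoded HMAC-SHA256 of the given bytes.
function hmacHex(secret, bytes) {
  return crypto.createHmac('sha256', secret).update(bytes).digest('hex');
}

// Constant-time comparison of two hex digests.
function macsEqual(aHex, bHex) {
  const a = Buffer.from(aHex, 'hex');
  const b = Buffer.from(bHex, 'hex');
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

async function demo() {
  const secret = 'whsec_example'; // hypothetical secret for the demo
  // The sender signed these exact bytes, including the extra spaces.
  const sent = Buffer.from('{"id": "evt_1",  "amount": 4999}');
  const signature = hmacHex(secret, sent);

  // Simulate a request arriving in two chunks.
  const request = Readable.from([sent.subarray(0, 10), sent.subarray(10)]);
  const raw = await readRawBody(request);

  // Parsing and re-serializing drops the whitespace, changing the bytes.
  const reserialized = JSON.stringify(JSON.parse(raw.toString('utf8')));

  return {
    rawMatches: macsEqual(hmacHex(secret, raw), signature),
    reserializedMatches: macsEqual(hmacHex(secret, reserialized), signature),
  };
}
```

Frameworks that parse JSON automatically usually offer a hook or option for keeping the raw bytes; check the framework's documentation rather than assuming one.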

The four mistakes that appear in almost every broken implementation:

  • Verifying a re-serialized body. HMAC is computed over the exact raw bytes. If your framework parses JSON and you verify against JSON.stringify(req.body), key reordering or whitespace differences break verification randomly. Capture the raw body before any middleware parses it.
  • Using == or === to compare signatures. String comparison short-circuits at the first differing character, leaking timing information that lets attackers reconstruct a valid signature byte by byte. Always use a constant-time comparison (crypto.timingSafeEqual, hmac.compare_digest in Python).
  • Ignoring the timestamp. Providers like Stripe include a timestamp in the signed payload precisely so you can reject old deliveries (commonly older than 5 minutes). Without that check, a captured request can be replayed later — a valid signature is not proof of freshness.
  • Confusing HTTPS with authentication. TLS encrypts the transport; it says nothing about who sent the request. An HTTPS endpoint with no signature check is still an open door.
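
The timestamp check can be sketched for a header in the t=...,v1=... format shown earlier. This assumes the Stripe-style scheme in which the HMAC is computed over the timestamp, a dot, and the raw body; the names verifyTimestamped and TOLERANCE_SECONDS are illustrative. In production the provider's official library is usually the safer choice.

```javascript
const crypto = require('node:crypto');

const TOLERANCE_SECONDS = 300; // assumption: reject anything more than 5 minutes off

// Parse "t=1720080000,v1=abc..." into { t, v1: [...] }.
function parseHeader(header) {
  const parts = { t: null, v1: [] };
  for (const item of String(header || '').split(',')) {
    const idx = item.indexOf('=');
    if (idx === -1) continue;
    const key = item.slice(0, idx).trim();
    const value = item.slice(idx + 1).trim();
    if (key === 't') parts.t = value;
    if (key === 'v1') parts.v1.push(value);
  }
  return parts;
}

function verifyTimestamped(rawBody, header, secret,
                           nowSeconds = Math.floor(Date.now() / 1000)) {
  const { t, v1 } = parseHeader(header);
  if (!t || !/^\d+$/.test(t) || v1.length === 0) return false;

  // Freshness: a valid signature on an old delivery is still a replay.
  // Math.abs also rejects timestamps far in the future (a stricter choice).
  if (Math.abs(nowSeconds - Number(t)) > TOLERANCE_SECONDS) return false;

  // The signed payload is "<timestamp>.<raw body>".
  const expected = crypto.createHmac('sha256', secret)
    .update(`${t}.`)
    .update(rawBody)
    .digest();

  // Accept if any v1 signature matches, compared in constant time.
  return v1.some((sig) => {
    const candidate = Buffer.from(sig, 'hex');
    return candidate.length === expected.length &&
      crypto.timingSafeEqual(candidate, expected);
  });
}
```

Because the timestamp is part of the signed payload, an attacker cannot simply rewrite t to make a captured request look fresh.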

Rotate webhook secrets periodically, store them like any other credential (in a secret manager or environment variable, not in code), and if the provider lets you choose, subscribe only to the event types you actually handle.

Delivery Semantics: Retries, Duplicates, and Ordering

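Most webhook providers promise at-least-once delivery, never exactly-once. When your endpoint times out or returns 5xx, the provider typically retries with exponential backoff: Stripe retries for up to 3 days, and Shopify and others follow their own schedules. Not every provider retries on its own, though: GitHub does not automatically redeliver failed deliveries, so those have to be redelivered manually or through its API. Three consequences follow, and each demands a pattern: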

1. You will receive duplicates → be idempotent

If your handler finishes the work but responds too late, the provider records a timeout and retries, so the same event arrives twice. If the handler ships an order per event, someone gets two packages. The fix: every event carries a unique ID, so record processed IDs and skip repeats:

-- atomic idempotency guard
INSERT INTO processed_events (event_id) VALUES ($1)
ON CONFLICT (event_id) DO NOTHING;
-- if no row was inserted, this event was already handled: return 200 and stop
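
In handler code, the guard above might be wrapped as follows. This is a sketch under assumptions: db is any object with a query(text, params) method that resolves to { rowCount }, which is the shape node-postgres clients return, and processOnce and handle are hypothetical names.

```javascript
const INSERT_SQL =
  'INSERT INTO processed_events (event_id) VALUES ($1) ' +
  'ON CONFLICT (event_id) DO NOTHING';
const DELETE_SQL = 'DELETE FROM processed_events WHERE event_id = $1';

// Runs handle(event) at most once per event ID.
// Returns 'processed' or 'duplicate'; rethrows if handle fails.
async function processOnce(db, event, handle) {
  const claimed = await db.query(INSERT_SQL, [event.id]);
  if (claimed.rowCount === 0) {
    return 'duplicate'; // no row inserted: this ID was already recorded
  }
  try {
    await handle(event);
    return 'processed';
  } catch (err) {
    // Release the claim so a later retry can process the event.
    await db.query(DELETE_SQL, [event.id]);
    throw err;
  }
}
```

Releasing the claim on failure does not survive a crash between the insert and the side effect. Where the side effect lives in the same database, running both in one transaction is the sturdier design.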

2. Events arrive out of order → trust the source, not the sequence

Retries and parallel delivery mean "subscription.updated" can arrive before "subscription.created". Never build state by folding events in arrival order. The robust pattern: treat the webhook as a notification that something changed, then fetch the current state from the provider's API (or use the full object embedded in the event, comparing timestamps or version numbers before overwriting newer local state).
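
The version comparison can be sketched with an in-memory Map standing in for the local database. The updatedAt field is a hypothetical stand-in for whatever timestamp or version number the provider puts on the object, and applyIfNewer is an illustrative name.

```javascript
// Overwrite local state only when the incoming snapshot is strictly newer.
// Returns true if the store changed, false if the snapshot was ignored.
function applyIfNewer(store, incoming) {
  const current = store.get(incoming.id);
  if (current && current.updatedAt >= incoming.updatedAt) {
    return false; // stale or repeated snapshot: keep what we have
  }
  store.set(incoming.id, incoming);
  return true;
}

// "updated" (version 200) arrives before "created" (version 100).
const subscriptions = new Map();
applyIfNewer(subscriptions, { id: 'sub_1', status: 'active', updatedAt: 200 });
applyIfNewer(subscriptions, { id: 'sub_1', status: 'incomplete', updatedAt: 100 });
// subscriptions.get('sub_1').status is still 'active'
```

A real store would need the read and write to happen atomically, for example as a single conditional UPDATE, so that two concurrent deliveries cannot interleave.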

3. Slow handlers get dropped → acknowledge fast, process async

Providers typically time out within 5 to 30 seconds (GitHub expects a response within 10 seconds, Shopify within 5), and endpoints that fail consistently may be disabled. The production pattern is fast-ack: verify the signature, persist the raw event to a queue or table, and return 200 immediately; then process from the queue with your own retry policy. Your response time stays in the milliseconds regardless of how heavy the real work is, and a bug in processing no longer causes redelivery storms.
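
A minimal sketch of the split between the request path and the worker path follows. The in-memory array is only a stand-in for a durable queue or table, the GitHub-style sha256= header format is assumed, and receive and drain are hypothetical names.

```javascript
const crypto = require('node:crypto');

const queue = []; // stand-in for a durable queue or table

function signatureIsValid(rawBody, header, secret) {
  const expected = Buffer.from(
    'sha256=' + crypto.createHmac('sha256', secret).update(rawBody).digest('hex')
  );
  const received = Buffer.from(String(header || ''));
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}

// Request path: verify, persist, acknowledge. No business logic here.
// Returns the HTTP status code to send back.
function receive(rawBody, signatureHeader, secret) {
  if (!signatureIsValid(rawBody, signatureHeader, secret)) return 401;
  queue.push(rawBody.toString('utf8')); // store the raw event untouched
  return 200;
}

// Worker path: runs outside the request, with its own error handling.
// Returns the events that failed so they can be retried or dead-lettered.
async function drain(processEvent) {
  const failed = [];
  while (queue.length > 0) {
    const raw = queue.shift();
    try {
      await processEvent(JSON.parse(raw));
    } catch (err) {
      failed.push({ raw, error: String(err) });
    }
  }
  return failed;
}
```

In an HTTP server, the request callback would read the raw body, call receive, and write the returned status, while drain runs on a timer or in a separate worker process.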

Debugging Webhooks Without Losing Your Mind

Webhooks are miserable to debug precisely because the client is someone else's server calling an endpoint that must be publicly reachable. The standard toolkit:

  • Capture before you code. Point the provider at a request-capture endpoint first and look at real deliveries — exact headers, exact body, exact content type — before writing a line of handler code. Payloads routinely differ from documentation.
  • Tunnel to localhost. Tools like ngrok, Cloudflare Tunnel, or localtunnel give your dev machine a public HTTPS URL so providers can reach your local handler while you set breakpoints.
  • Replay from the provider dashboard. Stripe and GitHub both let you view delivery history (request, response, and status) and redeliver an event with one click, and many other providers offer similar delivery logs. This is the single most useful debugging surface; check it before adding logging.
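  • Use provider CLIs. The Stripe CLI (stripe listen, stripe trigger) forwards live events to localhost and fires synthetic test events on demand, with no tunnel needed. For GitHub, the gh CLI's webhook extension (gh webhook forward) forwards events to a local server, and past deliveries can be redelivered through the REST API.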
  • Send test requests yourself. To test signature verification and error paths, send crafted POSTs with valid and deliberately invalid HMAC signatures at your endpoint and confirm it accepts the former and rejects the latter — a webhook tester that supports custom headers and HMAC signing does this in seconds.
  • Log the failures, not just the successes. Persist every rejected delivery (bad signature, unknown event type, processing error) with the raw body. When a provider changes payload format — it happens — the rejects log is how you find out before customers do.
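
Crafting test requests by hand takes only a few lines. The sketch below assumes a GitHub-style X-Hub-Signature-256 header; buildSignedRequest and the localhost URL in the usage comment are hypothetical.

```javascript
const crypto = require('node:crypto');

// Build fetch() options for a signed webhook POST.
// With { tamper: true } the signature is well-formed but wrong,
// which exercises the handler's rejection path.
function buildSignedRequest(payload, secret, { tamper = false } = {}) {
  const body = JSON.stringify(payload);
  let signature = crypto.createHmac('sha256', secret).update(body).digest('hex');
  if (tamper) {
    // Change the first hex digit so length and format stay valid.
    signature = (signature[0] === '0' ? '1' : '0') + signature.slice(1);
  }
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Hub-Signature-256': `sha256=${signature}`,
    },
    body,
  };
}

// Usage against a local handler (placeholder URL, Node 18+ for global fetch):
//   const res = await fetch('http://localhost:3000/webhooks/github',
//                           buildSignedRequest({ action: 'ping' }, secret));
// A correct handler answers 2xx here and 4xx when { tamper: true } is passed.
```

Sending the body string exactly as it was signed matters: serializing the payload a second time on the way out could change the bytes and invalidate the signature.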

The Production Hardening Checklist

Everything above, condensed into the list worth pinning next to your handler code:

  • ✅ Verify HMAC signatures on the raw body with a timing-safe comparison; reject anything unsigned or stale (timestamp older than ~5 minutes).
  • ✅ Return 2xx fast (< 1s): verify, enqueue, acknowledge. Do the real work asynchronously with your own retries and a dead-letter queue.
  • ✅ Be idempotent: dedupe on the event ID atomically; design every side effect to be safe to attempt twice.
  • ✅ Tolerate disorder: never assume arrival order; fetch current state or compare object versions before overwriting.
  • ✅ Validate before trusting: check the event type is one you expect, parse defensively, and treat payload fields as untrusted input (they are).
  • ✅ Return the right codes: 2xx for handled (including duplicates and events you deliberately ignore — do not make the provider retry those), 4xx for permanently invalid, 5xx only when a retry might genuinely succeed.
  • ✅ Monitor the endpoint: alert on signature-failure spikes (attack or secret rotation gone wrong), on delivery-failure rates from the provider dashboard, and on queue depth.
  • ✅ Plan for outages: after downtime longer than the provider's retry window, reconcile by listing recent events/objects from the provider's API — webhooks are a latency optimization, polling is the backstop.
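
The status-code rule from the checklist can be kept in one small function so every code path answers consistently. The outcome labels below are hypothetical names for this sketch, and the specific 4xx and 5xx codes are one reasonable choice among several.

```javascript
// Map a handler outcome to the HTTP status sent back to the provider.
function statusFor(outcome) {
  switch (outcome) {
    case 'processed':
    case 'duplicate':          // already handled: do not trigger a retry
    case 'ignored_event_type': // deliberately ignored: do not trigger a retry
      return 200;
    case 'bad_signature':
      return 401; // permanently invalid
    case 'malformed_payload':
      return 400; // permanently invalid
    case 'transient_failure':
      return 503; // a retry might genuinely succeed
    default:
      return 500; // unknown outcome: fail loudly and allow a retry
  }
}
```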

Handlers built this way are boring — they shrug off duplicates, replay attacks, provider outages, and payload changes. In webhook engineering, boring is the whole goal.

Frequently Asked Questions

What is a webhook in simple terms?
A webhook is a way for one system to notify another the moment something happens, by sending an HTTP POST request to a URL you registered. Instead of your server repeatedly asking "anything new?" (polling), the provider pushes the event to you: Stripe POSTs when a payment succeeds, GitHub POSTs when code is pushed. It is often described as a reverse API call — the server calls you.
What is the difference between a webhook and an API?
Direction and initiative. With an API, your code initiates a request when it wants data and gets a synchronous response. With a webhook, the other system initiates the request when an event occurs, and your code passively receives it. They are complements: most integrations use webhooks to learn that something changed, then call the API to fetch authoritative current state or take follow-up actions.
How do I secure a webhook endpoint?
Four layers: (1) verify the provider's HMAC signature on the exact raw request body using a timing-safe comparison — this is the non-negotiable core; (2) reject stale deliveries by checking the signed timestamp to prevent replay attacks; (3) serve HTTPS only, and treat the payload as untrusted input; (4) keep the webhook secret out of source code and rotate it periodically. IP allowlisting is an optional extra layer where the provider publishes stable ranges, but signatures are the real authentication.
Why do I receive the same webhook event twice?
Because providers guarantee at-least-once delivery, not exactly-once. If your endpoint times out or returns an error — even after your code actually finished the work — the provider retries, producing a duplicate. This is normal and unavoidable, so handlers must be idempotent: record each event's unique ID atomically (e.g. an INSERT with ON CONFLICT DO NOTHING) and skip events you have already processed, returning 200 for the duplicate.
How do I test webhooks on localhost?
Three approaches: (1) tunneling tools like ngrok or Cloudflare Tunnel expose your local server on a public HTTPS URL that providers can reach; (2) provider CLIs — notably the Stripe CLI's listen and trigger commands — forward live events to localhost and fire synthetic test events without any tunnel; (3) a webhook testing tool that sends crafted POST requests with custom headers and HMAC signatures lets you verify your signature validation and error handling without involving the provider at all. Use provider dashboards to replay real past deliveries against your endpoint.
What HTTP status code should my webhook handler return?
Return 2xx (usually 200) quickly — within a few seconds — for anything you accepted, including duplicates and event types you deliberately ignore; otherwise the provider keeps retrying. Return 4xx for requests that are permanently invalid (bad signature, malformed payload) since retrying cannot fix them. Reserve 5xx for genuine transient failures where you want a retry. Never do heavy processing before responding: acknowledge fast, queue the work, and process asynchronously.
