Published on September 19, 2025 by Orkid Labs
TL;DR
This week we shipped measurable improvements and cleaned up the experience end‑to‑end. According to Keen analytics, the detector now processes candidates in the 50–168ms band (conservatively ≤200ms at the 97.5th percentile). The API path is predictably fast: fresh autocannon runs show a ~24ms median within our EU datacenter and a ~151ms median on the US→EU path from San Antonio to our DE server. We also gave the live feed its own page, scoped the dashboard under the ORKID product, and made our telemetry both honest and useful by adding sampling, per‑pair TTL dedupe, and explicit labels when a simulated mode is shown.
What we shipped this week
We focused on three themes: performance truth, operational clarity, and product coherence.
First, performance truth. We updated site copy and metadata to reflect the numbers we actually observe in production. Detector latency is tracked in Keen as a first‑class metric and consistently lands between 50 and 168ms, with a conservative ≤200ms at the 97.5th percentile. For the API, we ran autocannon with a 20‑second, 20‑connection profile against the /health endpoint. From the EU datacenter, we see ~24ms p50 and ~59ms p97.5 (p99 ~98ms). On the US→EU path (San Antonio → DE), we observe ~151ms p50 and ~305ms p97.5 (p99 ~375ms). Throughput remains more than adequate for the control plane: EU p50 ~724 req/s (p97.5 ~931 req/s), US→EU p50 ~126 req/s (p97.5 ~137 req/s).
Second, operational clarity. We tightened our telemetry discipline so numbers are trustworthy at a glance. High‑volume streams (like det_seen) can be noisy; we sample these to reduce spurious spikes while preserving all accepted opportunities. We also added a per‑pair TTL dedupe window to prevent event storms from skewing charts and dashboards. Where the live stream is interrupted, the UI clearly labels a simulated mode so there’s no ambiguity for readers or reviewers.
Third, product coherence. The live Polygon feed now lives at a dedicated route: /products/orkid/feeds/polygon. Product overview pages no longer embed a feed, which keeps them focused on explaining the system. We also de‑duplicated navigation and moved the “Dashboard” entry under the ORKID product namespace at /products/orkid/dashboard. This aligns what you click with how teams actually evaluate: product pages for understanding, feed pages for observing, and dashboards for measuring.
Why latency (and clarity) matters for MEV
MEV is a timing game. Our physics‑informed detector (FMD) is designed to surface trades that can actually be realized, not just theoretical spreads on paper. That imposes two requirements.
The detector has to be fast enough to matter. By measuring the full detector processing time in Keen (not a micro‑benchmark), we get a realistic view of cadence and overhead. Recent runs show a median in the 50–168ms band with a conservative ≤200ms cap at the 97.5th percentile. That gives us a budget for venue quotation, route selection, and safety gates while still landing inside practical inclusion windows.
The delivery path has to be not only fast, but predictable. Within the EU datacenter, API calls round‑trip in ~24ms at the median, which is effectively instantaneous for operators viewing the feed or dashboards. Across the Atlantic, we measure a ~151ms median from San Antonio to our DE server. These are measured numbers, not aspirational claims, and they are stable across multiple runs.
We also re‑aligned the UX to match how teams actually evaluate a system like this. Overview pages explain what the engine does and why; dedicated feed pages show the real‑time stream; dashboards aggregate the behavior into actionable metrics. And if the live stream is not available, the UI will clearly mark a simulated mode so nobody mistakes an example for a live fill.
Reliability and observability
We enforce a private‑first routing posture for MEV‑sensitive flows using policy flags (Flashbots Protect preferred), and we simulate before sending. We do not dual‑broadcast. Where policy allows, we can fall back, but this is explicit and recorded.
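The routing posture described above can be sketched as a small decision function. This is an illustration only: the flag names (`prefer_private`, `allow_standard_fallback`), the `Decision` record, and the reason strings are hypothetical and do not come from the production policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Route(Enum):
    PRIVATE = "private"
    STANDARD = "standard"


@dataclass(frozen=True)
class RoutingPolicy:
    # Hypothetical flags; the real policy flag names are not given in this post.
    prefer_private: bool = True
    allow_standard_fallback: bool = False


@dataclass(frozen=True)
class Decision:
    route: Optional[Route]   # None means "do not send"
    fallback_used: bool      # recorded so any fallback is explicit
    reason: str


def choose_route(policy: RoutingPolicy, simulation_ok: bool,
                 private_available: bool) -> Decision:
    # Simulate before sending: a failed simulation never reaches any route.
    if not simulation_ok:
        return Decision(None, False, "simulation_failed")
    if not policy.prefer_private:
        return Decision(Route.STANDARD, False, "standard_by_policy")
    if private_available:
        return Decision(Route.PRIVATE, False, "private_preferred")
    # Private route unavailable: fall back only when policy explicitly allows it.
    if policy.allow_standard_fallback:
        return Decision(Route.STANDARD, True, "private_unavailable_fallback")
    return Decision(None, False, "private_unavailable_no_fallback")
```

Because the function returns at most one route, a dual broadcast cannot be expressed, and the `fallback_used` field gives a log or event record something concrete to store.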
On the telemetry side, we added:
- Sampling controls for noisy Keen streams (e.g., det_seen) so rate spikes don’t drown signal.
- Per‑pair TTL dedupe to avoid event storms and accidental double‑counts.
- Pulse snapshots that include latency percentiles (including p95/p99 today, and p50/p97.5 next), current mempool rate, and the private/standard route mix so you can see our posture and health at a glance.
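To make the sampling and dedupe controls concrete, here is a minimal sketch. It assumes events are dicts with `id`, `pair_key`, and an `accepted` flag, that accepted opportunities bypass both filters, and that sampling is keyed on a hash of the event id; all of these names and choices are assumptions for illustration, not the production pipeline.

```python
import time
import zlib
from typing import Callable, Dict, Optional


class PairDeduper:
    """Suppress repeat events for the same pair inside a TTL window."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._last_emitted: Dict[str, float] = {}

    def is_duplicate(self, pair_key: str) -> bool:
        now = self.clock()
        last = self._last_emitted.get(pair_key)
        if last is not None and now - last < self.ttl:
            return True  # still inside the window opened by the last emitted event
        self._last_emitted[pair_key] = now
        return False


def keep_sample(event_id: str, rate: float) -> bool:
    """Deterministic sampling: keep roughly `rate` of events, keyed by a hash of the id."""
    bucket = zlib.crc32(event_id.encode("utf-8")) % 10_000
    return bucket < int(rate * 10_000)


def prepare(event: dict, deduper: PairDeduper, sample_rate: float) -> Optional[dict]:
    """Return the event to emit (tagged with its sample rate), or None to drop it."""
    if event.get("accepted"):
        # Assumption: accepted opportunities are ground truth and are always kept.
        return {**event, "sample_rate": 1.0}
    if deduper.is_duplicate(event["pair_key"]):
        return None
    if keep_sample(event["id"], sample_rate):
        return {**event, "sample_rate": sample_rate}
    return None
```

Tagging each emitted event with its `sample_rate` lets a chart scale sampled counts back up by `1 / sample_rate`. A long-running process would also need to prune old entries from `_last_emitted`, which this sketch omits.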
We prefer percentiles to averages. p50 represents typical user experience. p97.5 gives a stable view of the near tail without being dominated by single outliers, which makes it more useful for investor‑facing materials. We still track p99 on status/ops dashboards for incident analysis.
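A small worked example shows why the near tail is steadier than p99 on short windows. The sketch below uses the nearest-rank definition of a percentile, which is one common convention and not necessarily the one Keen or autocannon applies; the sample data is made up.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up window: 49 requests at 20 ms plus one 5000 ms outlier.
window = [20.0] * 49 + [5000.0]
p50 = percentile(window, 50)
p975 = percentile(window, 97.5)
p99 = percentile(window, 99)
mean = sum(window) / len(window)
```

With 50 samples, the single outlier sets p99 to 5000 ms and pulls the mean to 119.6 ms, while p50 and p97.5 both stay at 20 ms.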
Telemetry deep dive (Keen)
Under the hood, we invested in a cleaner, more analyzable telemetry model that makes our charts honest and our decisions faster.
- Canonical event time: every record carries a `keen.timestamp` so queries roll up by when things happened, not when they arrived. This fixes skew during bursts and network hiccups.
- Unified schema: detectors and routes emit the same core fields: `detector`, `route` (private|standard), `pair_key`, `size_in`, `gross_usdc`, `net_usdc`, `spread_bps`, `gas_used_sim`, `max_fee_per_gas`, and mempool context like `tipP90Gwei`, `mempoolRate`, and (where available) `sandwichIntensity`.
- Sampling and dedupe: high‑volume streams (e.g., `det_seen`) are sampled with a clear rate tag; accepted opportunities are never sampled. A per‑pair TTL dedupe window suppresses duplicate events from brief flurries without losing ground truth.
- Saved queries and caches: we snapshot common dashboards (latency percentiles, opportunities/min, route mix) with short TTL caches to make investor‑facing charts load instantly while keeping data fresh.
- Private‑route audit metadata: for protect‑first paths we attach route metadata (private vs standard, relay response where available, inclusion latency) without logging secrets or headers. Structured logs are redacted by default.
- Null‑safe semantics: unknowns are `null`, never fabricated zeros, so downstream analysis can distinguish “missing” from “zero”.
These changes reduce noise, make week‑over‑week comparisons reliable, and shorten the time from “observation” to “decision.”
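As a sketch of what a unified, null‑safe record might look like, the dataclass below uses the field names listed above. The field types, the `to_payload` and `mean_known` helpers, and the example values are assumptions for illustration; the timestamp is placed under a `keen` key so the event carries its own occurrence time.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass
class DetectorEvent:
    detector: str
    route: str                              # "private" or "standard"
    pair_key: str
    # Unknown values stay None (JSON null); they are never replaced with 0.
    size_in: Optional[float] = None
    gross_usdc: Optional[float] = None
    net_usdc: Optional[float] = None
    spread_bps: Optional[float] = None
    gas_used_sim: Optional[int] = None
    max_fee_per_gas: Optional[int] = None
    tipP90Gwei: Optional[float] = None
    mempoolRate: Optional[float] = None
    sandwichIntensity: Optional[float] = None


def to_payload(event: DetectorEvent, occurred_at: datetime) -> dict:
    """Build the event body; `occurred_at` must be timezone-aware."""
    body = asdict(event)
    body["keen"] = {"timestamp": occurred_at.astimezone(timezone.utc).isoformat()}
    return body


def mean_known(values: Iterable[Optional[float]]) -> Optional[float]:
    """Average only the known values; return None when nothing is known."""
    known = [v for v in values if v is not None]
    return sum(known) / len(known) if known else None


event = DetectorEvent(detector="FMD", route="private", pair_key="WETH/USDC", net_usdc=1.5)
payload = to_payload(event, datetime(2025, 9, 19, 12, 0, tzinfo=timezone.utc))
encoded = json.dumps(payload)
```

Serializing `None` as JSON `null` keeps "missing" distinct from "zero", and `mean_known` shows the matching read side: an aggregate skips unknowns instead of averaging in fabricated zeros.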
Product UX cleanup
We separated concerns to reduce confusion and make diligence easier. The live feed is at /products/orkid/feeds/polygon and is the canonical place to observe the stream. The dashboard lives under the ORKID product at /products/orkid/dashboard, and the “Resources → Dashboard” item points here. Product overview pages do not embed the feed anymore; instead, they link to it. This keeps overviews focused on explaining the system and reduces cognitive load for readers.
We also tightened guardrails around what we display. In the UI, tiny position sizes are guarded so ROI readouts remain meaningful. We avoid double‑subtracting gas and keep an optional positive‑only gating mode available via environment flags for investor‑facing feeds. Contact forms now require corporate email domains, which is a small but important trust signal for early design‑partner conversations.
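The display guardrails can be illustrated with three small helpers. Everything here is hypothetical: the `MIN_SIZE_USDC` threshold, the assumption that position size is expressed in USDC, and the function names are chosen for the example and are not taken from the UI code.

```python
from typing import Optional

MIN_SIZE_USDC = 10.0  # hypothetical threshold below which ROI is not shown


def net_profit_usdc(gross_usdc: float, gas_cost_usdc: float) -> float:
    """Subtract gas exactly once; callers pass the gross figure, never an already-net one."""
    return gross_usdc - gas_cost_usdc


def roi_pct(net_usdc: float, size_in_usdc: float,
            min_size: float = MIN_SIZE_USDC) -> Optional[float]:
    """Return ROI in percent, or None when the position is too small to be meaningful."""
    if size_in_usdc < min_size:
        return None
    return 100.0 * net_usdc / size_in_usdc


def show_in_feed(net_usdc: float, positive_only: bool) -> bool:
    """Positive-only gating: when enabled, hide entries that do not net a profit."""
    return net_usdc > 0 or not positive_only
```

Returning `None` for tiny sizes lets the UI render a placeholder instead of an inflated percentage, and keeping the gas subtraction in a single function is one simple way to avoid applying it twice. The `positive_only` argument stands in for whatever environment flag controls the gating mode.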
Methodology (how we measured)
API latency and throughput were measured with autocannon using the following profile: -d 20 -c 20 https://api.cadencesystem.com/health. We ran this from two vantage points: (1) our EU datacenter host, and (2) a MacBook in San Antonio, TX hitting the same DE endpoint. Autocannon samples request and byte counts once per second, so the Req/Sec percentiles are taken over those per‑second samples, while the latency percentiles (p50, p97.5, p99) are calculated over individual requests across the 20‑second window. The /health endpoint is intentionally lightweight and representative of control‑plane latency.
Representative results:
EU DC → api.cadencesystem.com/health
Latency p50 ~24 ms | p97.5 ~59 ms | p99 ~98 ms | avg ~27.7 ms (stdev ~21.6 ms)
Req/Sec p50 ~724 | p97.5 ~931
US→EU (San Antonio → DE) → /health
Latency p50 ~151 ms | p97.5 ~305 ms | p99 ~375 ms | avg ~162.1 ms (stdev ~37.8 ms)
Req/Sec p50 ~126 | p97.5 ~137
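A quick consistency check on these figures: with a fixed number of connections and one request in flight per connection (assumed here, as no pipelining was specified), Little's law says throughput should be close to connections divided by average latency. The helper name below is ours for illustration.

```python
def expected_req_per_sec(connections: int, avg_latency_ms: float) -> float:
    """Little's law for a closed-loop load test: throughput = concurrency / latency."""
    return connections / (avg_latency_ms / 1000.0)


eu_expected = expected_req_per_sec(20, 27.7)    # EU DC run, avg ~27.7 ms
us_expected = expected_req_per_sec(20, 162.1)   # US→EU run, avg ~162.1 ms
```

This gives roughly 722 req/s for the EU run and roughly 123 req/s for the US→EU run, in line with the measured medians of ~724 and ~126. The throughput gap between the two vantage points is therefore explained by round‑trip latency at fixed concurrency, not by server capacity.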
Detector processing times come from Keen analytics emitted directly by the detectors during live runs. We report the median band (50–168ms typical) and cap the headline at ≤200ms for the p97.5. This reflects full processing including route selection and safety checks, not a contrived micro‑benchmark.
What’s next
- Expose p50/p97.5 alongside p99 in the public status UI and in the `/pulse` payload so the status page shows a truthful “typical” and “near‑tail” view.
- Publish periodic performance snapshots with identical autocannon conditions to make week‑over‑week comparisons straightforward.
- Continue optimizing detector hot paths, quotation strategies across venues, and WS/provider configuration.
Call to action
If you value correctness and scientific rigor, we’re opening a few design‑partner slots. We’ll provision a private dashboard, tailor venue coverage, and deliver weekly action‑ready reports.
→ Get in touch: /contact
Written by Orkid Labs