Case Study

FGP: Making AI Agent Tools 19x Faster

MCP stdio spawns a new process for every tool call. FGP replaces that with persistent daemons over UNIX sockets, cutting latency from seconds to milliseconds.

FGP (Fast Gateway Protocol)

The Problem: Death by Cold Start

AI agents make sequential tool calls. Each call should feel instant. Instead, MCP stdio adds ~2.3 seconds of cold-start overhead per call.

For a simple 4-step workflow (navigate → snapshot → click → snapshot), that's 11+ seconds of waiting. The agent feels sluggish. Users lose trust.

The math is brutal:

Agent Workflow             Tool Calls   MCP Overhead   Time Lost
Check email                     2           4.6s          4.6s
Browse + fill form              5          11.5s         11.4s
Full productivity check        10          23s           22.9s
Complex agent task             20          46s           45.8s

The problem isn't the tools themselves—it's the protocol overhead.
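The table is just linear scaling of the per-call cold start. A quick sanity check (the ~2.3s constant is taken from the benchmarks in this write-up; the helper name is illustrative):

```python
# Sanity check on the table above: MCP stdio's ~2.3s cold start
# scales linearly with the number of tool calls in a workflow.
COLD_START_S = 2.3  # approximate per-call overhead from the benchmarks

def mcp_overhead_s(tool_calls: int) -> float:
    """Seconds lost to process cold starts across a workflow."""
    return tool_calls * COLD_START_S

for calls in (2, 5, 10, 20):
    print(f"{calls:>2} calls -> {mcp_overhead_s(calls):.1f}s lost to cold starts")
```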

The Insight: Keep It Warm

MCP stdio spawns a fresh process for each tool invocation. This gives you isolation but kills latency.

The insight: most agent tools don't need per-call isolation. A browser session should stay warm. A Gmail connection should stay authenticated. A database connection should stay pooled.

FGP flips the model:

MCP stdio (cold):    Agent → spawn process → load deps → connect → execute → return → die
FGP daemon (warm):   Agent → send message → execute → return (process stays alive)

Architecture

FGP uses persistent daemons connected via UNIX sockets:

┌─────────────────────────────────────────────────────────┐
│                     AI Agent / Claude                    │
├─────────────────────────────────────────────────────────┤
│                   FGP UNIX Sockets                       │
│   ~/.fgp/services/{browser,gmail,calendar,github}/      │
├──────────┬──────────┬──────────┬──────────┬────────────┤
│ Browser  │  Gmail   │ Calendar │  GitHub  │   ...      │
│ Daemon   │  Daemon  │  Daemon  │  Daemon  │            │
│ (Rust)   │  (PyO3)  │  (PyO3)  │  (Rust)  │            │
├──────────┴──────────┴──────────┴──────────┴────────────┤
│           Chrome    │  Google APIs  │  gh CLI          │
└─────────────────────────────────────────────────────────┘

Key design decisions:

  1. UNIX sockets — Zero network overhead, file-based permissions, no TCP handshake
  2. NDJSON protocol — Human-readable, streaming-friendly, easy to debug
  3. Per-service daemons — Independent scaling, fault isolation, simple upgrades
  4. Rust core — Sub-millisecond protocol overhead, ~10MB memory per daemon

Benchmark Results

Browser Automation (vs Playwright MCP)

The browser daemon connects to Chrome via DevTools Protocol and keeps the connection warm:

Operation     FGP Browser   Playwright MCP   Speedup
Navigate          8ms          2,328ms         292x
Snapshot          9ms          2,484ms         276x
Screenshot       30ms          1,635ms          54x

Multi-Step Workflow

The real test: a 4-step workflow (navigate → snapshot → click → snapshot):

Tool                   Total Time   vs MCP
FGP Browser               585ms     19x faster
Vercel agent-browser      733ms     15x faster
Playwright MCP         11,211ms     baseline

API Daemons

All methods were tested with a 100% success rate:

Gmail Daemon (PyO3 + Google API):

Method   Mean    Payload
inbox    881ms   2.4KB
search   748ms   2.4KB
thread   116ms   795B

Calendar Daemon (PyO3 + Google API):

Method       Mean    Payload
today        315ms   48B
upcoming     241ms   444B
search       177ms   46B
free_slots   198ms   65B

GitHub Daemon (Native Rust + gh CLI):

Method          Mean    Payload
repos           569ms   2.8KB
notifications   521ms   9.8KB
issues          390ms   75B

Key insight: Latency is dominated by external API calls (~100-900ms), not FGP overhead (~5-10ms). For MCP, add ~2.3s cold-start to every call.

The Protocol

All daemons speak the same NDJSON-over-UNIX-socket protocol:

Request:

{"id": "uuid", "v": 1, "method": "browser.navigate", "params": {"url": "..."}}

Response:

{"id": "uuid", "ok": true, "result": {...}, "meta": {"server_ms": 8.2}}

Built-in methods every daemon supports:

  • health — Check daemon health
  • methods — List available methods
  • stop — Graceful shutdown
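The framing is simple enough to sketch in a few lines. These encode/decode helpers mirror the request/response shapes shown above; they are illustrative only, not taken from an FGP client library:

```python
# Minimal NDJSON framing helpers mirroring the request/response
# shapes above; illustrative, not from an FGP client library.
import json
import uuid

def encode_request(method: str, params: dict) -> bytes:
    """One request = one JSON object terminated by a newline."""
    req = {"id": str(uuid.uuid4()), "v": 1, "method": method, "params": params}
    return (json.dumps(req) + "\n").encode()

def decode_response(line: bytes) -> dict:
    """Parse one response line; surface daemon-reported failures."""
    resp = json.loads(line)
    if not resp.get("ok"):
        raise RuntimeError(f"daemon error for id={resp.get('id')}")
    return resp

resp = decode_response(
    b'{"id": "uuid", "ok": true, "result": {}, "meta": {"server_ms": 8.2}}\n'
)
assert resp["meta"]["server_ms"] == 8.2
```

Because every frame is a self-delimiting JSON line, you can debug a daemon with nothing more than `nc -U` and your eyes.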

Building New Daemons

The SDK makes it easy to add new services:

// `bail!` and `json!` are assumed to come from anyhow and serde_json.
use anyhow::{bail, Result};
use fgp_daemon::{FgpServer, FgpService};
use serde_json::{json, Value};

struct MyService { /* state */ }

impl MyService {
    fn new() -> Self {
        Self { /* state */ }
    }
}

impl FgpService for MyService {
    fn name(&self) -> &str { "my-service" }
    fn version(&self) -> &str { "1.0.0" }

    fn dispatch(&self, method: &str, _params: Value) -> Result<Value> {
        match method {
            "my-service.hello" => Ok(json!({"message": "Hello!"})),
            _ => bail!("unknown method: {method}"),
        }
    }
}

fn main() -> Result<()> {
    let server = FgpServer::new(MyService::new(), "~/.fgp/services/my-service/daemon.sock")?;
    server.serve()?;
    Ok(())
}

Why This Matters

Agent tooling is at an inflection point. LLMs can orchestrate complex workflows, but latency breaks the illusion.

When tools feel instant:

  • Users trust the agent
  • Agents can make more calls without timeout pressure
  • Complex workflows become practical

FGP isn't about raw speed—it's about moving from "noticeable delay" to "instant response" in the user perception tier.

Status

Component    Status       Performance
browser      Production   8ms navigate, 9ms snapshot
gmail        Beta         116ms thread, 881ms inbox
calendar     Beta         177ms search, 233ms avg
github       Beta         390ms issues, 474ms avg
daemon SDK   Stable       Core library
cli          WIP          Daemon management

Takeaways

  1. Cold-start overhead compounds — at ~2.3s per tool call, the losses add up fast across an agent workflow
  2. Daemons beat processes for high-frequency, stateful tools
  3. UNIX sockets are underrated — Zero network overhead, file-based security, simple debugging
  4. The "real work" is often fast — It's the protocol overhead that kills UX
  5. User perception tiers matter — Moving from 2s to 10ms isn't just "faster," it's a different experience class

Interested in working together?

Get in touch →