Gartner Magic Quadrant · Everest PEAK Leader 2025 · 100B+ pages processed

The developer platform
powering Infrrd's next chapter

Infrrd processes 100B+ pages of enterprise documents — mortgages, insurance claims, engineering drawings, invoices — with a 0.1% error rate across 22 languages. Scaling from 150M documents to hundreds of millions more, launching Deep Worker for Agentic AI, and serving regulated customers like Rocket Mortgage, J&J, and PwC mean the infrastructure underneath has to match the ambition. This is where Cloudflare's developer platform fits.

100B+ Pages processed
0.1% Error rate
22 Languages
1,000+ Document types

What Infrrd actually runs at scale

Infrrd isn't a simple document viewer. It's a high-throughput AI pipeline that ingests raw documents, runs multi-model ML inference, routes to human reviewers when needed, and returns structured data via API — all under SLA deadlines as tight as 15 minutes.

01

Multi-model LLM Inference

Deep Worker and the IDP pipeline run LLM inference across OCR, classification, extraction, and validation passes on every document. Multiple models — different providers, different cost tiers — need orchestration, caching, and fallback when a provider degrades.

Multi-provider routing · Semantic caching · Cost observability · Fallback on outage
02

Agentic Document Workflows

Deep Worker for Mortgage is Infrrd's agentic AI product — it orchestrates multi-step document audit workflows autonomously. These are long-running, stateful processes that must survive failures, retry individual steps, and fan out across thousands of concurrent loan files.

Durable multi-step execution · Per-document state · Retry on failure · Fan-out at scale
03

Regulated Document Storage

Mortgage loan packages, insurance claims, and engineering drawings contain PHI, PII, and confidential financial data. Raw documents, extracted fields, and audit logs all need durable, encrypted, globally accessible storage — with $0 egress when the LLM pipeline reads them repeatedly.

$0 egress on re-reads · S3-compatible · AES-256 at rest · GDPR-compliant hosting
04

SOC 2 · GDPR · Regulated Customers

Rocket Mortgage, J&J, PwC, and Unum run regulated workflows on Infrrd. SOC 2 Type II and GDPR compliance are required — and Infrrd's infosec page explicitly lists DDoS protection, application security, and infrastructure hardening as product pillars.

Zero Trust internal access · WAF + API protection · DDoS mitigation · Audit trails
05

Global SLA Delivery

Infrrd promises SLA deadlines as tight as 15 minutes on document processing. Customers in the US, EU, and Asia need sub-100ms API responses and resilient frontend delivery. Infrrd.ai is a globally accessed platform — CDN and smart routing directly impact whether SLAs are met.

330+ city CDN · 15-min SLA support · Global API delivery · Traffic spike absorption

Cloudflare Developer Platform → Infrrd

Cloudflare developer platform products mapped to Infrrd's specific infrastructure demands — each with the exact mechanism of value.

Agentic Execution

Workers + Durable Objects + Workflows

docs ↗ · High Priority

Deep Worker for Mortgage is Infrrd's agentic product — it autonomously orchestrates multi-step mortgage audit workflows. Each loan file is an independent stateful process: ingest documents → classify → extract → validate → flag discrepancies → route to HITL if needed → write audit report. These workflows can run for minutes, must survive infrastructure failures, and need to fan out across thousands of concurrent loan packages during peak origination periods.

  • Cloudflare Workflows — durable, retryable execution for each document processing pipeline step. If the LLM extraction step fails at step 3, Workflows retries from step 3 — not from the beginning. No lost documents, no silent failures.
  • Durable Objects — one object per active document job. Holds the current state of that loan file's processing: which steps are complete, what was extracted, whether HITL review is pending. Strongly consistent, co-located with Workers, zero external database calls on the hot path.
  • Workers — the edge compute powering Infrrd's API layer: webhook ingestion from customer document management systems, real-time status callbacks, and the extraction result delivery endpoints. Sub-50ms response times globally without managing servers.
Infrrd fit: "Straight Through Processing" and "No-Touch Processing" are Infrrd's marquee product claims — they only hold when the underlying execution layer is reliable. Workflows + Durable Objects ensure that agentic document pipelines complete or retry cleanly, with full state persistence across every step.

Document & Data Storage

R2 Object Storage + D1 Serverless SQL

docs ↗ · High Priority

At 100B+ pages processed and 1,000+ document types, Infrrd's storage problem is real. Raw uploaded documents, multi-page loan packages, extracted JSON fields, HITL annotation data, and audit reports all need durable, encrypted, globally accessible storage. The critical cost driver: every time the LLM pipeline reads a document for a retry, a re-extraction pass, or a quality validation, that's an S3 egress charge. At Infrrd's volume, that compounds.

  • R2 — $0 egress on all reads. When Infrrd's pipeline processes a 500-page mortgage loan package across five extraction passes, reads are free. For a company with SOC 2 Type II compliance and GDPR-scoped EU data, R2 also supports location hints and EU jurisdiction restrictions for data residency — keeping EU customer documents stored within the EU.
  • D1 — serverless SQLite for Infrrd's document metadata graph: document status, extraction results, HITL queue state, customer SLA tracking, audit log entries. Query directly from Workers with no connection pool to manage and no RDS cluster to right-size.
Infrrd fit: Infrrd's security page explicitly lists "end-to-end encryption" and "regional hosting options" for GDPR. R2's AES-256 at rest and location hints satisfy both — and eliminate the egress costs that accumulate when an LLM pipeline reads the same document multiple times per workflow.
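A minimal sketch of the storage pattern above: a Worker re-reads a stored loan package from R2 for another extraction pass and records the pass in D1. The binding names, bucket layout, and documents table are assumptions for illustration, not Infrrd's actual schema.

```ts
// Illustrative bindings and schema; not Infrrd's actual layout.
export interface Env {
  DOCS: R2Bucket;   // R2 bucket holding raw loan packages
  META: D1Database; // D1 database holding document metadata
}

export default {
  async fetch(req: Request, env: Env): Promise<Response> {
    const { docId } = await req.json<{ docId: string }>();

    // Re-read the raw document for another extraction pass; R2 reads incur $0 egress.
    const obj = await env.DOCS.get(`loans/${docId}.pdf`);
    if (!obj) return new Response("not found", { status: 404 });
    const bytes = await obj.arrayBuffer();

    // Record the pass in the metadata graph (status, pass count, audit trail).
    await env.META
      .prepare("UPDATE documents SET passes = passes + 1, status = ?1 WHERE id = ?2")
      .bind("re-extraction", docId)
      .run();

    return Response.json({ docId, size: bytes.byteLength });
  },
};
```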

Security & Compliance

WAF + Zero Trust + API Shield + DDoS

docs ↗ · High Priority

Infrrd's security page lists four explicit pillars: Enterprise Data Protection, Application Security, Infrastructure & Network (including DDoS), and Customer Controls. That's Infrrd's own engineering team defining their security requirements — and Cloudflare's security stack addresses every one of them directly.

The surface area: infrrd.ai accepts document uploads from enterprise customers (large file POST endpoints), serves a processing status API queried continuously by customer integrations, and exposes a HITL review interface accessed by human annotators globally. Each is a distinct attack vector.

🛡️ WAF — Managed Ruleset

Blocks SQLi, XSS, and file-upload abuse at the edge before traffic reaches Infrrd's origin. Custom rules protect the document upload endpoints — per-customer rate limiting, oversized-payload blocking, and content-type enforcement.

🔒 Zero Trust Access

HITL reviewers access Infrrd's annotation interface from home networks globally. Zero Trust replaces VPN with device-posture-aware, identity-verified access — every reviewer session authenticated and logged. No public IPs on internal services.

🔌 API Shield

Infrrd's document ingestion API accepts uploads from enterprise customer integrations. API Shield validates inbound requests against the expected schema — malformed or oversized payloads are blocked at the edge before they reach the processing queues.

⚡ Unmetered DDoS

Document processing spikes during mortgage origination season and insurance claims surges. A targeted DDoS during peak load would directly breach customer SLAs. Cloudflare's unmetered L3/L4 and L7 DDoS protection absorbs attacks automatically — no bandwidth overage charges.

Infrrd fit: Infrrd's own security page defines DDoS protection and application security as infrastructure requirements — not nice-to-haves. Cloudflare's security stack delivers all four of Infrrd's listed pillars from a single platform, with SOC 2-aligned controls and GDPR-compatible data handling that supports Infrrd's compliance posture.
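For the Zero Trust Access pillar specifically, a service placed behind Access can verify the JWT that Cloudflare attaches to every forwarded request. Below is a hedged sketch using the jose library; the team domain and application audience (AUD) tag are placeholders that would come from the Zero Trust dashboard.

```ts
import { createRemoteJWKSet, jwtVerify } from "jose";

// Placeholders: both values come from the Zero Trust dashboard.
const TEAM_DOMAIN = "https://example-team.cloudflareaccess.com";
const POLICY_AUD = "<application-audience-tag>";

// Access publishes its signing keys at /cdn-cgi/access/certs on the team domain.
const JWKS = createRemoteJWKSet(new URL(`${TEAM_DOMAIN}/cdn-cgi/access/certs`));

// Reject any request that did not come through Cloudflare Access.
export async function requireAccessUser(req: Request): Promise<string> {
  const token = req.headers.get("Cf-Access-Jwt-Assertion");
  if (!token) throw new Error("missing Access token");

  const { payload } = await jwtVerify(token, JWKS, {
    issuer: TEAM_DOMAIN,
    audience: POLICY_AUD,
  });

  // Authenticated reviewer identity, usable for audit logging.
  return payload.email as string;
}
```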

Edge Inference & Search

Workers AI + Vectorize

docs ↗ · Consider

For high-volume, lower-complexity inference tasks — document type classification, language detection, entity tagging, and semantic search across processed document archives — Workers AI's serverless GPU inference offers open-source models at a fraction of frontier model cost. Vectorize enables semantic search across Infrrd's processed document corpus.

  • Document classification — run Llama or BERT-based classifiers at the edge to sort incoming documents (W2, 1040, deed of trust, insurance policy) before routing to the appropriate extraction pipeline. Faster than a frontier model call, cheaper at scale.
  • Whisper via Workers AI — for documents with embedded audio or video, run transcription directly in Workers AI without adding a third-party STT vendor.
  • Vectorize — semantic search across Infrrd's processed document archive. Enables queries like "find all loans where the appraised value was disputed" across the full document history without rebuilding a search index.
Infrrd fit: Not every document inference call needs GPT-4o. Workers AI handles the classification and pre-processing layer — routing frontier model calls to the steps where they actually matter, cutting per-document inference cost at Infrrd's volume.
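A sketch of what that classification-plus-search layer could look like inside a Worker, assuming Workers AI and Vectorize bindings. The two model IDs are real Workers AI catalog entries chosen for illustration; the prompt, index name, and document types are hypothetical.

```ts
// Illustrative Worker module: model IDs are real catalog entries,
// everything else (bindings, prompt, doc types, index) is hypothetical.
export interface Env {
  AI: Ai;
  DOC_INDEX: VectorizeIndex;
}

export async function classifyAndSearch(text: string, env: Env) {
  // 1. Cheap edge classification, before deciding whether a frontier model is needed.
  const classification = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
    messages: [
      {
        role: "system",
        content:
          "Classify this document as one of: W2, 1040, deed_of_trust, insurance_policy. Reply with the label only.",
      },
      { role: "user", content: text.slice(0, 4000) },
    ],
  });

  // 2. Embed the text and query the processed-document corpus semantically.
  const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [text] });
  const matches = await env.DOC_INDEX.query(embedding.data[0], { topK: 5 });

  return { classification, matches };
}
```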

Performance & Delivery

CDN + Pages + Argo Smart Routing

docs ↗ · Consider

Infrrd serves customers in the US, EU, and Asia with SLA windows as tight as 15 minutes. The infrrd.ai platform — including the HITL review interface used by human annotators globally — needs consistent sub-100ms response times and resilience during processing spikes when large customer batches arrive simultaneously.

  • CDN — static assets, dashboard UI, and API documentation served from 330+ PoPs. Annotators in India, the US, and Germany all hit a local PoP — not Infrrd's San Jose origin.
  • Argo Smart Routing — for dynamic API calls (document status polls, extraction result delivery), Argo routes over Cloudflare's private backbone rather than the public internet — 30–40% latency reduction on average for globally distributed customers.
  • Pages — git-push deploys for Infrrd's React frontend, per-PR preview environments, instant global distribution. Lets Infrrd's engineering team ship features without CDN invalidation overhead.
Infrrd fit: A 15-minute document SLA is only achievable if the API layer responds fast globally. CDN + Argo ensures that customers querying document status from London or Singapore get the same response time as San Jose — and that large batch submissions don't overload origin.

Full solution map

Infrrd Requirement | Cloudflare Product | Specific Value | Priority
LLM cost, fallback, observability across IDP + Deep Worker | AI Gateway | Semantic caching, model fallback, per-request cost logs, rate limiting | Highest
Agentic document pipeline execution (Deep Worker) | Workers · Durable Objects · Workflows | Per-document stateful agents, durable retry, sub-50ms API, fan-out at scale | High
Document archive + processed data storage | R2 · D1 | $0 egress on multi-TB archive, serverless SQL, AES-256, GDPR location hints | High
SOC 2 / GDPR infosec posture, DDoS, API protection | WAF · Zero Trust · API Shield · DDoS | Application security, Zero Trust HITL access, unmetered DDoS, API schema validation | High
Document classification + semantic search | Workers AI · Vectorize | Open-source model inference at edge, vector search over document corpus | Consider
Global delivery, 15-min SLA, frontend resilience | CDN · Argo · Pages | 330+ PoP delivery, 30–40% API latency reduction, git-push deploys | Consider

For Anuj Sadani · Engineering Director, Infrrd

Start with AI Gateway — 30 minutes, a one-line base URL change

AI Gateway sits in front of your existing LLM calls as a proxy. You point your OpenAI/Anthropic base URL at AI Gateway instead of directly at the provider — that's the entire integration. From that moment you have full observability, semantic caching, and fallback routing. Given your April 2026 paper on eliminating the MCP/Tools Tax, the token savings angle will be immediately quantifiable against your own production traffic.
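As a concrete illustration, here is the base URL swap with the OpenAI Node SDK. The account ID and gateway name are placeholders from the Cloudflare dashboard; the rest of the calling code stays exactly as it is today.

```ts
import OpenAI from "openai";

// The only change is baseURL: point the SDK at the AI Gateway endpoint
// instead of api.openai.com. <ACCOUNT_ID> and <GATEWAY_NAME> are placeholders.
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://gateway.ai.cloudflare.com/v1/<ACCOUNT_ID>/<GATEWAY_NAME>/openai",
});

// Existing extraction calls keep their models, prompts, and parsing logic.
const completion = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Extract the borrower name from this page: ..." }],
});
```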