Requirements

| Tool | Purpose |
| --- | --- |
| `wrk` | HTTP/1.1 load generator |
| `h2load` | HTTP/2 load generator (part of nghttp2) |
| `nginx` | Comparison proxy (must be in `PATH`) |
| `python3` | Backend server and output parsing |
| `curl` | Readiness checks |
| `openssl` | Self-signed cert generation (H2 only) |
Build Arc before running:

```sh
cargo build --release -p arc-gateway
```

HTTP/1.1 benchmark

```sh
cd benchmark
bash scripts/run_h1_wrk_vs_nginx.sh
```

What it does

  1. Starts `http_ok_backend.py` (a minimal Python HTTP server)
  2. Starts arc-gateway and nginx, both pointing at the backend
  3. Runs one warmup round, then `RUNS` measurement rounds of wrk against each proxy
  4. Parses the results and writes `summary.json` and `summary.md`
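The parsing in step 4 can be sketched as follows. This is a minimal illustration, not the repo's actual parse script; it assumes standard wrk output lines such as `Requests/sec:  95000.12`:

```python
import re

def parse_wrk(text: str) -> dict:
    """Extract requests/sec and average latency from raw wrk output."""
    rps = re.search(r"Requests/sec:\s+([\d.]+)", text)
    lat = re.search(r"Latency\s+([\d.]+)(us|ms|s)\b", text)
    out = {}
    if rps:
        out["requests_per_sec"] = float(rps.group(1))
    if lat:
        value, unit = float(lat.group(1)), lat.group(2)
        scale = {"us": 1e-3, "ms": 1.0, "s": 1e3}[unit]  # normalize to ms
        out["latency_avg_ms"] = value * scale
    return out

sample = """Running 30s test @ http://127.0.0.1:8080/
  8 threads and 256 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.70ms    1.10ms   45.00ms   91.00%
Requests/sec:  95000.12
"""
print(parse_wrk(sample))
```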

Default parameters

| Variable | Default | Description |
| --- | --- | --- |
| `RUNS` | 5 | Measurement iterations |
| `THREADS` | 8 | wrk thread count |
| `CONNECTIONS` | 256 | Concurrent connections |
| `DURATION` | 30s | Duration per measurement run |
| `WARMUP` | 5s | Warmup run duration (excluded from results) |
| `ARC_WORKERS` | 1 | Arc worker thread count |
| `REQUIRE_ZERO_NON2XX` | 1 | Fail if any non-2xx/3xx responses are seen |
Override any parameter via environment variables:

```sh
RUNS=10 CONNECTIONS=512 ARC_WORKERS=4 bash scripts/run_h1_wrk_vs_nginx.sh
```
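In the scripts this is the usual shell `${VAR:-default}` pattern; the equivalent logic, shown in Python purely for illustration (not the scripts' actual code):

```python
import os

def param(name: str, default: int) -> int:
    """Read a tunable from the environment, falling back to a default."""
    return int(os.environ.get(name, default))

os.environ["RUNS"] = "10"          # simulates `RUNS=10 bash script.sh`
runs = param("RUNS", 5)
connections = param("CONNECTIONS", 256)
print(runs, connections)
```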

HTTP/2 benchmark

```sh
cd benchmark
bash scripts/run_h2_h2load_vs_nginx.sh
```

Uses h2load instead of wrk. A self-signed RSA-2048 certificate is generated automatically and shared between Arc and Nginx. Both terminate TLS; the backend is plain HTTP/1.1.

Default parameters

| Variable | Default | Description |
| --- | --- | --- |
| `RUNS` | 5 | Measurement iterations |
| `REQUESTS` | 20000 | Total requests per run |
| `CLIENTS` | 64 | Concurrent H2 clients |
| `STREAMS` | 20 | Max concurrent streams per connection |
| `THREADS` | 2 | h2load thread count |
| `WARMUP_REQUESTS` | 1000 | Requests in the warmup run |

Output artifacts

Each run creates a timestamped directory under `benchmark/results/`:

```
benchmark/results/h1_wrk_20260302_121530/
  arc.json              Arc config used
  nginx.conf            Nginx config used
  env.txt               Environment snapshot (see below)
  arc_warmup.txt        Warmup output (excluded from results)
  nginx_warmup.txt
  arc_run1.txt          Raw wrk/h2load output, runs 1–N
  ...
  arc_runN.txt
  nginx_runN.txt
  summary.json          Machine-readable aggregated results
  summary.md            Human-readable markdown table
  summary.stdout.json   Copy of parse script stdout
  arc.out.log           Arc stdout
  arc.err.log           Arc stderr
  backend.log           Backend server output
  nginx.error.log       Nginx error log
```
`env.txt` fields recorded before each run:

| Key | Content |
| --- | --- |
| `run_id` | Timestamp-based run identifier |
| `git_commit` | `git rev-parse HEAD` |
| `uname` | Full `uname -a` string |
| `wrk_version` / `h2load_version` | Tool version string |
| `nginx_version` | `nginx -v` output |
| `arc_bin` | Resolved path to the arc-gateway binary |
| `params.*` | All tunable parameters at their resolved values |
| `ports.*` | All port assignments used in this run |
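Producing such a snapshot is straightforward; a minimal sketch (illustrative only, the actual scripts write `env.txt` from bash; field names follow the table above):

```python
import platform
import subprocess
from datetime import datetime, timezone

def snapshot(params: dict, ports: dict) -> str:
    """Build an env.txt-style key=value snapshot of the run environment."""
    def run(cmd):
        try:
            return subprocess.run(cmd, capture_output=True, text=True).stdout.strip()
        except OSError:
            return "unavailable"

    fields = {
        "run_id": datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S"),
        "git_commit": run(["git", "rev-parse", "HEAD"]),
        "uname": " ".join(platform.uname()),
    }
    fields.update({f"params.{k}": v for k, v in params.items()})
    fields.update({f"ports.{k}": v for k, v in ports.items()})
    return "\n".join(f"{k}={v}" for k, v in fields.items())

print(snapshot({"RUNS": 5}, {"arc": 8080}))
```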

Reading results

`summary.json` contains per-case aggregated statistics across all runs:

```json
{
  "arc": {
    "requests_per_sec": {
      "mean": 95000.0,
      "median": 95200.0,
      "min": 94000.0,
      "max": 96000.0
    },
    "latency_avg_ms": { "median": 2.7 }
  },
  "nginx": {
    "requests_per_sec": { "median": 78000.0 }
  },
  "compare": {
    "arc_vs_nginx_rps_ratio_median": 1.22,
    "arc_vs_nginx_rps_gap_pct_median": 22.0
  }
}
```
`summary.md` contains the same data as a Markdown table for easy sharing. Always use median values for published comparisons. The `compare` block is included automatically when both `arc` and `nginx` cases are present.
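The aggregation behind these statistics can be sketched as follows (the RPS values here are made up for illustration, not real measurements):

```python
from statistics import mean, median

def aggregate(values: list[float]) -> dict:
    """Per-case statistics across measurement runs."""
    return {"mean": mean(values), "median": median(values),
            "min": min(values), "max": max(values)}

def compare(arc_rps: list[float], nginx_rps: list[float]) -> dict:
    """Median-based ratio and percentage gap, as in the compare block."""
    ratio = median(arc_rps) / median(nginx_rps)
    return {"arc_vs_nginx_rps_ratio_median": round(ratio, 2),
            "arc_vs_nginx_rps_gap_pct_median": round((ratio - 1) * 100, 1)}

arc = [94000.0, 95200.0, 96000.0, 95000.0, 94800.0]
nginx = [77500.0, 78000.0, 78500.0, 78200.0, 77800.0]
print(aggregate(arc))
print(compare(arc, nginx))
```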

Reproducibility checklist

  • Use a fixed machine profile and kernel version (Linux ≥ 6.1 recommended for io_uring multishot)
  • Pin Arc and Nginx build versions; `env.txt` records the git commit
  • Run at least 5 rounds (`RUNS=5`) with identical settings
  • Use median values for published comparisons
  • Keep all raw `*_runN.txt` files alongside any published claim
  • Arc’s data plane requires Linux io_uring; use WSL2 or a native Linux host (not macOS)
  • Disable CPU frequency scaling for consistent results: `cpupower frequency-set -g performance`

Test backend

`benchmark/backends/http_ok_backend.py` is a minimal `ThreadingHTTPServer` that serves a fixed-size response. It accepts any GET, POST, PUT, DELETE, or HEAD request.
| Argument | Default | Description |
| --- | --- | --- |
| `--port` | required | Listen port |
| `--payload-bytes` | 2 | Response body size (H1 test uses 2, H2 test uses 4096) |
| `--delay-ms` | 0 | Per-request sleep (for slow-backend simulation) |
| `--status` | 200 | Response status code |
To test with a realistic backend response size:

```sh
PAYLOAD_BYTES=4096 bash scripts/run_h1_wrk_vs_nginx.sh
```
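A stripped-down sketch of what such a backend looks like (illustrative, not the actual `http_ok_backend.py`; the parameters mirror `--payload-bytes`, `--delay-ms`, and `--status` from the table above):

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def make_handler(payload: bytes, delay_ms: int = 0, status: int = 200):
    """Build a handler that serves the same fixed-size response to every request."""
    class Handler(BaseHTTPRequestHandler):
        def _respond(self):
            if delay_ms:
                time.sleep(delay_ms / 1000)  # slow-backend simulation
            self.send_response(status)
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            if self.command != "HEAD":
                self.wfile.write(payload)

        # One fixed response for every supported method.
        do_GET = do_POST = do_PUT = do_DELETE = do_HEAD = _respond

        def log_message(self, *args):
            pass  # keep benchmark runs quiet

    return Handler

# Demonstrate on an ephemeral port with a 4-byte payload.
server = ThreadingHTTPServer(("127.0.0.1", 0), make_handler(b"x" * 4))
threading.Thread(target=server.serve_forever, daemon=True).start()
body = urllib.request.urlopen(f"http://127.0.0.1:{server.server_address[1]}/").read()
print(len(body))  # 4
server.shutdown()
```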

Troubleshooting

**Missing tool.** Install the missing tool. On Ubuntu/Debian: `apt install wrk` for wrk; `apt install nghttp2-client` for h2load. Verify versions with `wrk --version` and `h2load --version`.

**Non-2xx/3xx responses detected.** The benchmark found non-2xx/3xx responses. This usually means Arc or Nginx is rejecting requests (wrong config, rate limit, or backend not running). Check `backend.log` and `arc.err.log` in the output directory. Disable the check with `REQUIRE_ZERO_NON2XX=0` only when debugging.

**High variance across runs.** High variance is common when CPU frequency scaling is enabled. Disable it with:

```sh
cpupower frequency-set -g performance
```

Also check for background processes consuming CPU. Use the median (not the mean) from `summary.json` for published comparisons.

**h2load rejects the certificate.** The H2 benchmark generates a self-signed certificate. If h2load rejects it, check which TLS skip flag your h2load version uses. The script auto-detects `--insecure`, `--no-verify-peer`, or `-k`. If none match, update the script’s TLS flag detection.

**Port collision.** The script allocates ports at random from a safe range. If a collision occurs, kill the conflicting process or simply re-run; a new random port will be chosen.
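The random-port strategy can be sketched as follows (a sketch only; the actual script's port range and retry logic may differ, and a bind-then-release check is inherently racy, which is why collisions can still occur):

```python
import random
import socket

def pick_free_port(lo: int = 20000, hi: int = 40000, attempts: int = 50) -> int:
    """Pick a random port in a safe range, retrying on collision."""
    for _ in range(attempts):
        port = random.randint(lo, hi)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            try:
                s.bind(("127.0.0.1", port))
                return port  # bind succeeded, so the port was free just now
            except OSError:
                continue  # something is already listening; try another port
    raise RuntimeError("no free port found")

print(pick_free_port())
```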