Local pilot
Walks you from a clean checkout to an end-to-end Coledex backup running on a real provider host (Kuma), settling onchain on Solana devnet, and depositing the encrypted dump in local Minio. A second provider runs as a container on the control plane to exercise multi-host scheduling — no third physical machine required.
Target audience: operator (you), not a developer reading the codebase for the first time.
If anything in this document is out of date relative to
deploy/local-pilot/ scripts, the scripts win. They are the
machine-checked source of truth.
0. Topology
                   LAN (e.g. 192.168.0.0/24)
                                |
        +-----------------------+----+
        |                            |
        mp1 = this dev box           mp2 = kuma (ZimaOS)
        control plane                real provider
        + simulated provider         - openinfra-agent (systemd)
        - Postgres (5433)            - Docker (containers)
        - api-server (8082, 9092)
        - registry (5000)
        - minio (9000/9001)
        - prometheus (9095)
        - grafana (3000)
        - customer-pg (5434)
        - agent-mp3-sim       ← second provider as a compose container,
          (mounted on the       spawns workloads on this box's docker.sock
          host's docker.sock)

Both boxes must:
- Reach each other on the LAN.
- Resolve mp1’s IP (you’ll set it explicitly; no mDNS dependency).
- Run Linux + Docker (Kuma already does).
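The reachability requirement can be checked from kuma before installing anything. The sketch below only restates the ports from the diagram; the function names are hypothetical, and `probe_endpoints` assumes a netcat build that supports `-z` and `-w`. Not every port necessarily has to be reachable from the provider, so treat a failure as a prompt to investigate rather than a hard error.

```shell
#!/bin/sh
# Print "service host:port" for every control-plane port in the topology.
# The port list is copied from the diagram above; names are illustrative.
control_plane_endpoints() {
  cp_ip="$1"
  for entry in \
    postgres:5433 api-server:8082 api-server:9092 registry:5000 \
    minio:9000 minio:9001 prometheus:9095 grafana:3000 customer-pg:5434
  do
    printf '%s %s:%s\n' "${entry%:*}" "$cp_ip" "${entry##*:}"
  done
}

# Try a TCP connect to each endpoint (assumes nc with -z and -w support).
probe_endpoints() {
  control_plane_endpoints "$1" | while read -r name addr; do
    if nc -z -w 2 "${addr%:*}" "${addr##*:}" </dev/null 2>/dev/null; then
      echo "ok   $name $addr"
    else
      echo "FAIL $name $addr"
    fi
  done
}

# Usage (on kuma, with mp1's LAN IP):
#   probe_endpoints 192.168.0.10
```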
Why a simulated mp3 is OK for the pilot: it lets us exercise
multi-host invite issuance, pinned-host scheduling, and SPL
distribution across two payout wallets — all the multi-host code
paths — without a third box. The simulator shares the control
plane’s docker daemon (mounted socket), so workloads it spawns
appear as siblings to the api-server itself. In production, swapping
the sim for a real third machine is simple: set USE_REAL_MP3=1 and
run the install script against the new host (see §3). The sim service
in compose is harmless to leave running alongside.
1. Preflight
On mp1:

    # Toolchain: already on this box
    go version               # 1.25+
    docker --version
    docker compose version
    jq --version
    curl --version

    # Get your LAN IP (will go everywhere as CONTROL_PLANE_IP)
    ip route get 1.1.1.1 | awk '{for(i=1;i<=NF;i++) if($i=="src") print $(i+1)}'

On kuma:

    ssh kuma 'docker --version && sudo systemctl is-active docker'

(mp3 is the compose-managed simulator on mp1; no separate preflight needed.)
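The mp1 checks can be wrapped into two small helpers, sketched here under assumptions: the function names are hypothetical, and `extract_src_ip` expects the output format of `ip route get` from iproute2. `docker compose` is a subcommand rather than a binary, so it still needs the separate `docker compose version` check.

```shell
#!/bin/sh
# Read `ip route get <addr>` output on stdin and print the source IP,
# i.e. the field that follows the literal word "src".
extract_src_ip() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Print the name of every listed tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Usage:
#   CONTROL_PLANE_IP="$(ip route get 1.1.1.1 | extract_src_ip)"
#   missing_tools go docker jq curl
```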
Solana payout wallets — create two if you don’t have them:

    # Once per provider, on any machine with solana-keygen:
    solana-keygen new -o kuma-payout.json --no-bip39-passphrase
    solana-keygen new -o mp3-payout.json --no-bip39-passphrase

    # Note the public keys; you'll need them for the pilot script:
    solana-keygen pubkey kuma-payout.json   # → export KUMA_PAYOUT_PUBKEY=...
    solana-keygen pubkey mp3-payout.json    # → export MP3_PAYOUT_PUBKEY=...

2. Bring up the control plane (mp1)

    cd ~/Documents/seppelabs/openinfra/deploy/local-pilot

    # CONTROL_PLANE_IP is auto-detected; override if your box has VPN
    # interfaces and the auto-detect picks the wrong one.
    ./control-plane-up.sh

What this does:
- Builds the openinfra-api Docker image from local source.
- docker compose up -d brings up: Postgres, customer-pg, api-server, registry, Minio (+ bootstrap container that creates the bucket), Prometheus, Grafana.
- Waits for /healthz on the api-server.
- Greps the api-server logs for TREASURY_PUBKEY=… (printed once on first boot when the onchain executor generates the keypair).
- Hits the Solana devnet faucet for 1 SOL into the treasury, only if the treasury balance is under 0.1 SOL (don’t burn the faucet on every re-run).
- Prints the URL summary + next-step hints.
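Two of those steps, pulling the treasury key out of the logs and deciding whether to hit the faucet, can be sketched as follows. This is an illustration only, not the contents of control-plane-up.sh: the function names are hypothetical, and the log line is assumed to contain `TREASURY_PUBKEY=<key>` followed by whitespace or end of line.

```shell
#!/bin/sh
# Read api-server logs on stdin and print the first TREASURY_PUBKEY value.
extract_treasury_pubkey() {
  sed -n 's/.*TREASURY_PUBKEY=\([^[:space:]]*\).*/\1/p' | head -n 1
}

# Succeed (exit 0) only when the balance in SOL is below the 0.1 threshold.
needs_airdrop() {
  awk -v bal="$1" 'BEGIN { exit !((bal + 0) < 0.1) }'
}

# Usage (assumes the solana CLI is installed):
#   pk="$(docker compose logs api-server | extract_treasury_pubkey)"
#   bal="$(solana balance "$pk" --url devnet | awk '{print $1}')"
#   needs_airdrop "$bal" && solana airdrop 1 "$pk" --url devnet
```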
One-time mint (the script tells you the exact command):

    # Use the existing OINFRA-test mint from .devnet-keys/ OR mint into it:
    spl-token mint 9Jkq8WdgUUp2AR4FeXwE6q4DRddoHGfcKREMeCE6wphT 1000000 \
      --owner $TREASURY_PUBKEY --url devnet

If you don’t have spl-token locally:

    docker run --rm -it \
      -v ~/.config/solana:/root/.config/solana \
      solanalabs/solana:latest \
      spl-token mint ... --owner $TREASURY_PUBKEY --url devnet

3. Install the agent on providers

From mp1, for the real provider only:

    ./provider-install.sh kuma <CONTROL_PLANE_IP> amd64

What it does:
- Cross-compiles the agent for linux/<arch>.
- scps the binary to the host.
- Installs as /usr/local/bin/openinfra-agent.
- Configures Docker to trust the LAN registry at <CONTROL_PLANE_IP>:5000 (insecure HTTP, fine on LAN).
- Writes a systemd unit and enables it (does not start it: the unit needs /etc/openinfra/agent.env, which run-pilot.sh writes).
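For the registry-trust step, Docker reads the `insecure-registries` key from /etc/docker/daemon.json. The sketch below shows what a minimal file and a presence check could look like; it is an assumption about the shape of the result, not a copy of what provider-install.sh writes, and the function names are hypothetical. Writing this file verbatim would replace any existing daemon.json settings, so merge by hand if the provider already has one.

```shell
#!/bin/sh
# Print a minimal daemon.json that trusts the LAN registry on port 5000.
insecure_registry_json() {
  printf '{\n  "insecure-registries": ["%s:5000"]\n}\n' "$1"
}

# Read a daemon.json on stdin; succeed if it lists <ip>:5000 as a quoted entry.
has_insecure_registry() {
  grep -qF "\"$1:5000\""
}

# Usage (on the provider):
#   has_insecure_registry 192.168.0.10 < /etc/docker/daemon.json \
#     || echo "registry entry missing; re-run provider-install.sh"
```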
The simulated mp3 is already running — control-plane-up.sh built
its image and started the container as part of docker compose up.
Verify with: docker compose ps agent-mp3-sim.
If/when you get a third physical machine, run
./provider-install.sh mp3 <CONTROL_PLANE_IP> and set
USE_REAL_MP3=1 before running the pilot — the sim then stays
idle (no invite registered against it).
4. Run the pilot
From mp1:

    export CONTROL_PLANE_IP=192.168.0.x   # your LAN IP
    export KUMA_PAYOUT_PUBKEY=...         # from §1
    export MP3_PAYOUT_PUBKEY=...

    # Generate the customer-side backup recipient (Coledex's side in real life):
    age-keygen -o backup-customer.key
    export BACKUP_AGE_RECIPIENT=$(age-keygen -y backup-customer.key)
    echo "$BACKUP_AGE_RECIPIENT"          # age1...

    ./run-pilot.sh

The script walks 8 numbered steps and bails out with a clear error if any step diverges. Expect ~10–15 min end-to-end including the executor’s 5-min onchain settlement cadence.
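Because a run takes several minutes, it can be worth confirming the four exports before starting. A minimal sketch, assuming only the variable names used above; `check_pilot_env` is a hypothetical helper and run-pilot.sh may already perform its own validation.

```shell
#!/bin/sh
# Verify that the variables exported above are set and that the backup
# recipient looks like an age public key (these start with "age1").
check_pilot_env() {
  status=0
  for name in CONTROL_PLANE_IP KUMA_PAYOUT_PUBKEY MP3_PAYOUT_PUBKEY BACKUP_AGE_RECIPIENT; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      status=1
    fi
  done
  case "${BACKUP_AGE_RECIPIENT:-}" in
    age1*) ;;
    *) echo "BACKUP_AGE_RECIPIENT does not start with age1" >&2; status=1 ;;
  esac
  return "$status"
}

# Usage:
#   check_pilot_env && ./run-pilot.sh
```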
Success looks like:

    ================================================================
    pilot end-to-end: PASS
    ================================================================
    workload_id:      <uuid>
    status:           succeeded
    onchain_settled:  75 (or whatever credits the workload billed)
    backup_objects:   1
    ================================================================

5. Verify with your own eyes

| What | How |
|---|---|
| Backup file | http://mp1:9001 → coledex-backups → see one *.pg.age file |
| Decrypt | mc cp local/coledex-backups/<file> /tmp && age --decrypt -i backup-customer.key /tmp/<file> > dump.pg && pg_restore --list dump.pg |
| Kuma’s onchain balance | spl-token accounts --owner $KUMA_PAYOUT_PUBKEY --url devnet |
| Settlement metrics | http://mp1:3000 → OpenInfra/Local Pilot dashboard |
| Raw logs | docker compose logs -f api-server on mp1; journalctl -u openinfra-agent -f on kuma |
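
The backup-file check can also be done from a terminal. A small sketch, assuming the `local` mc alias used in the Decrypt row and that each listing line ends with the object name; `count_backups` is a hypothetical helper.

```shell
#!/bin/sh
# Read an object listing on stdin and print how many lines end in .pg.age.
# grep -c exits non-zero when the count is 0, so the status is normalised.
count_backups() {
  grep -c '\.pg\.age$' || true
}

# Usage:
#   mc ls local/coledex-backups | count_backups
```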
6. Common troubleshooting
| Symptom | Fix |
|---|---|
| Sim provider stuck restarting | docker compose logs agent-mp3-sim — usually INVITE_TOKEN missing or already consumed. Re-run ./run-pilot.sh to issue a fresh invite. |
| Workloads on sim can’t reach customer-pg / minio | They use CONTROL_PLANE_IP in env, not compose service names. Confirm your LAN IP is set correctly in run-pilot.sh. |
| TREASURY_PUBKEY= missing | SOLANA_MINT env was empty. Set it in docker-compose.yml, then recreate api-server: docker compose up -d --force-recreate api-server |
| Airdrop fails (429) | Devnet faucet rate-limit — wait 8 h or manually fund via solana airdrop 1 $TREASURY_PUBKEY --url devnet from another IP |
| Agent can’t pull image | /etc/docker/daemon.json insecure-registry entry missing on provider — re-run provider-install.sh |
| Workload stuck scheduling | Agent not heartbeating. Check journalctl -u openinfra-agent on the provider; common cause: /etc/openinfra/agent.env malformed |
| Onchain never settles | Treasury has 0 SOL (can’t pay fees) or 0 OINFRA (no token to send) — verify with solana balance and spl-token accounts |
| pilot end-to-end: PASS then nothing in Minio | Workload exited fast but rclone rcat failed silently. Check docker logs openinfra-<workload-id> on kuma |
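
For the malformed agent.env case, a quick syntax check can narrow things down. The sketch assumes the file is a plain list of KEY=VALUE lines with optional `#` comments, which is what a systemd EnvironmentFile normally looks like; the exact keys the agent expects are not listed here, and `validate_agent_env` is a hypothetical helper.

```shell
#!/bin/sh
# Read an env file on stdin and report every line that is neither blank,
# a comment, nor of the form KEY=VALUE. Exits non-zero if any line is bad.
validate_agent_env() {
  awk '
    /^[ \t]*(#|$)/ { next }
    !/^[A-Za-z_][A-Za-z0-9_]*=/ { printf "line %d: %s\n", NR, $0; bad = 1 }
    END { exit bad + 0 }
  '
}

# Usage (on the provider):
#   sudo cat /etc/openinfra/agent.env | validate_agent_env
```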
7. Soak test (Phase 6 prerequisite)
After the pilot passes once, leave it running for 7 days:

    # The catalog manifest schedules at 0 3 * * * UTC. Let the cron
    # trigger naturally, or force a run every hour for faster signal.
    # The sleep runs after every attempt so a failing run cannot turn
    # this into a tight retry loop:
    while true; do ./run-pilot.sh; sleep 3600; done

Watch openinfra_settlement_onchain_failed_total on the Grafana
dashboard: any non-zero value during the soak is a real bug to fix
before mainnet. The backup file count in Minio should match the
number of successful runs.
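
The two pass criteria can be expressed as a small check. This is a sketch with a hypothetical helper name; it takes the three numbers as arguments and leaves it to you to read them from Grafana, Minio, and your own run log.

```shell
#!/bin/sh
# soak_check FAILED_TOTAL BACKUP_COUNT SUCCESSFUL_RUNS
# Passes only if no onchain settlement failed and every successful run
# left exactly one backup object behind.
soak_check() {
  failed_total="$1"
  backups="$2"
  successes="$3"
  rc=0
  if [ "$failed_total" -ne 0 ]; then
    echo "FAIL: openinfra_settlement_onchain_failed_total is $failed_total"
    rc=1
  fi
  if [ "$backups" -ne "$successes" ]; then
    echo "FAIL: $backups backups for $successes successful runs"
    rc=1
  fi
  if [ "$rc" -eq 0 ]; then
    echo "soak OK"
  fi
  return "$rc"
}

# Usage (numbers are placeholders):
#   soak_check 0 120 120
```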
8. Tear down

    cd ~/Documents/seppelabs/openinfra/deploy/local-pilot
    docker compose down -v   # -v wipes volumes (treasury, ledger, backups)
    ssh kuma 'sudo systemctl disable --now openinfra-agent && sudo rm /etc/systemd/system/openinfra-agent.service && sudo systemctl daemon-reload'
    # Simulated mp3 is part of the compose stack; `down -v` above
    # already removed it, so there is nothing to clean up remotely.

docker compose down (without -v) keeps the volumes so a re-up
resumes with the same treasury keypair + ledger state.