Applied AI Conf · Conf day
agenda
main · 13:45–14:05

Building Sandcastles for Agents: Safe Execution at Production Scale

// ABOUT THIS SESSION

V7 runs hundreds to thousands of agents in parallel for customers in sensitive domains such as financial private markets and specialty insurance. These agents read documents, call external services, integrate with customer systems, and sometimes execute arbitrary code. Sandcastle is V7's internal sandboxing layer for making that possible while preserving isolation, control, and observability. The environment is hostile by default: customer documents can contain prompt injections, external integrations expose unpredictable interfaces, and the system has to start quickly, scale efficiently, and stay under provider rate limits. Simon shares what fails when you evaluate only final outputs, since most of the interesting failures hide in tool calls, network access, and provider retries. The session also covers the production bugs that came from below and beside the team (SDK quirks, low-level file-pointer deadlocks) and offers three takeaways: proxies are one of the highest-leverage pieces of agent infrastructure; Docker + gVisor is a strong starting point; and the two main sandboxing modes each carry different tradeoffs around control, latency, and security.

// SPEAKER