
Emmett Fear
Director of Demand Gen · Runpod
GPU cloud for AI workloads
AI Infrastructure
// ABOUT THIS SESSION
Deploying AI models for inference has traditionally meant long setup times, Docker complexity, and cold start delays that slow down production. In this live demo, we will show how Runpod Flash removes those obstacles. Flash lets you write a Python function, decorate it, and run it on serverless GPU infrastructure, with no Docker required. We will walk through a real deployment, from a simple Python script to a live inference endpoint, and show how Flash handles dependency management, dramatically reduces cold starts, and scales automatically with demand. Whether you are running LLMs, computer vision models, or custom pipelines, Flash gets you from code to production in minutes.
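To give a feel for the "write a function, decorate it" workflow described above, here is a minimal sketch of the decorator pattern in plain Python. The `remote` decorator, its `dependencies` and `gpu` parameters, and the `classify` function are hypothetical stand-ins invented for illustration; they are not the Runpod Flash API, and this version simply runs the function locally instead of on a GPU worker.

```python
import functools


def remote(dependencies=None, gpu=None):
    """Hypothetical stand-in decorator (NOT the Runpod Flash API).

    It records the declared dependencies and GPU preference on the
    function, then runs the function locally when it is called.
    """
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # A serverless client would ship the function and its
            # dependency list to a remote worker here. This sketch
            # just calls the function in the current process.
            return fn(*args, **kwargs)

        # Metadata a deployment tool could read to build the environment.
        wrapper.dependencies = list(dependencies or [])
        wrapper.gpu = gpu
        return wrapper

    return decorate


# "torch" is declared only to show the shape of the call; it is not imported.
@remote(dependencies=["torch"], gpu="any")
def classify(scores):
    """Toy inference step: return the index and value of the top score."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return {"label": best, "score": scores[best]}


if __name__ == "__main__":
    print(classify([0.1, 0.7, 0.2]))
```

The point of the pattern is that the function body stays ordinary Python, while the environment details live in the decorator arguments rather than in a Dockerfile.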
// SPEAKER