Building a Multi-Agent Healthcare Platform in 6 Phases
The sprint that produced our first vertical agent platform. What we built, what we cut, what we'd do differently — focused on the architectural decisions that held up under real workload.
Technical deep dives, architecture decisions, and lessons learned from building a production AI agent platform.
We tried having AI generate XML workflows. It failed — truncation, escaping issues, hallucinated success. So we flipped the architecture: AI describes WHAT in ~150 tokens of JSON, and the server handles HOW. One API call. 2 seconds. $0.001. Here's the full story of how we got there.
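As a rough illustration of that split, and not the platform's actual schema: the model emits a small JSON intent, and a deterministic server-side function expands it into workflow XML. Every name below (`expand_intent`, the `steps` and `action` fields, the `workflow` and `step` elements) is hypothetical and chosen only for this sketch.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical intent the model might emit: it says WHAT to do, nothing about XML.
INTENT_JSON = (
    '{"name": "claim-intake", "steps": ['
    '{"action": "extract", "source": "claim.pdf"}, '
    '{"action": "notify", "to": "adjuster & <team>"}]}'
)


def expand_intent(intent_json: str) -> str:
    """Expand a compact JSON intent into workflow XML on the server side."""
    intent = json.loads(intent_json)
    root = ET.Element("workflow", {"name": intent["name"]})
    for index, step in enumerate(intent["steps"], start=1):
        # Copy every field except "action" through as a string attribute.
        extras = {key: str(value) for key, value in step.items() if key != "action"}
        attrib = {"id": str(index), "action": step["action"], **extras}
        # The serializer escapes attribute values, so the model never handles escaping.
        ET.SubElement(root, "step", attrib)
    return ET.tostring(root, encoding="unicode")


workflow_xml = expand_intent(INTENT_JSON)
print(workflow_xml)
```

Because the XML comes from a serializer rather than from model output, truncation and escaping stop being failure modes of the model; a malformed intent fails at `json.loads` instead of being reported as success.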
Semantic Kernel's AutoInvokeKernelFunctions is a black box. We built ManualToolCallLoop for visibility, guardrails, cost tracking, and per-call trace recording.
Permission checks, cost controls, and PII scanning on every tool call. How we built enterprise-grade safety into the agent execution engine.
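To make the idea of a manual tool-call loop with per-call guardrails concrete, here is a minimal Python sketch. It is not the ManualToolCallLoop implementation and makes no Semantic Kernel calls; `GuardedLoop`, `ToolCall`, the per-call cost field, and the single SSN-shaped regex are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

# Deliberately simple pattern (US SSN shape); a real PII scanner would cover far more.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class ToolCall:
    name: str
    argument: str
    cost_usd: float  # assumed per-call cost estimate


@dataclass
class GuardedLoop:
    tools: dict[str, Callable[[str], str]]
    allowed: set[str]
    budget_usd: float
    spent_usd: float = 0.0
    trace: list[tuple[str, str]] = field(default_factory=list)

    def _check(self, call: ToolCall) -> str:
        """Return "ok" or the name of the guardrail that blocks this call."""
        if call.name not in self.allowed or call.name not in self.tools:
            return "permission"
        if self.spent_usd + call.cost_usd > self.budget_usd:
            return "budget"
        if SSN_PATTERN.search(call.argument):
            return "pii"
        return "ok"

    def run(self, calls: list[ToolCall]) -> list[str]:
        results = []
        for call in calls:
            verdict = self._check(call)
            # Record every call, allowed or blocked, so the loop stays inspectable.
            self.trace.append((call.name, verdict))
            if verdict != "ok":
                results.append(f"blocked: {verdict}")
                continue
            self.spent_usd += call.cost_usd
            results.append(self.tools[call.name](call.argument))
        return results
```

The point of owning the loop is that each check runs before the tool executes and each decision lands in `trace`, which an auto-invoke path would hide.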
The story of how a multi-agent architecture transformed a manual claims-processing bottleneck. Focus on the architectural moves — task decomposition, parallel dispatch, and where the latency actually lives.
4M tokens/min of throughput, $0.20 per million input tokens, and reliable tool calling. How grok-4-1-fast became our primary production LLM.
Everything in these posts is running in production. Let's talk about building the same for your organization.
Book a Discovery Call