simian-agent

An autonomous chaos-engineering agent for Kubernetes. Provision an arena, deploy a System Under Test, and either describe a fault in plain English or let an LLM run the planning loop against your live cluster topology.


simian-agent ships directed-mode chaos (plain-text intent → LLM-translated FaultManifest), an autonomous planning loop (health gate → topology snapshot → LLM-generated AttackPlan → bounded execution), three chaos engines (chaos-mesh, network-policy, envoy-fault), and a safety stage at the executor chokepoint that enforces eligibility, blast-radius tier, duration, and concurrency caps on every fault before it lands.

Safe by default

Every fault flows through a single chokepoint, the Fault Executor, which checks namespace eligibility, blast-radius tier, duration ceiling, and concurrency budget before any chaos driver runs. A cap violation surfaces as executor.rejected with a structured reason; it is never silently accepted.
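To make the order of checks concrete, here is a minimal Python sketch of a gate of this shape. It is an illustration under stated assumptions, not simian-agent's implementation: the types Fault, SafetyCaps, and Rejection, the function check_fault, and every reason string are hypothetical. Only the four checks and the executor.rejected event name come from the description above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


# Hypothetical types for illustration; simian-agent's real schema may differ.
@dataclass(frozen=True)
class Fault:
    namespace: str
    blast_radius_tier: int  # assumption: a higher tier means a wider blast radius
    duration_seconds: int


@dataclass(frozen=True)
class SafetyCaps:
    eligible_namespaces: FrozenSet[str]
    max_tier: int
    max_duration_seconds: int
    max_concurrent: int


@dataclass(frozen=True)
class Rejection:
    event: str
    reason: str  # hypothetical reason codes


def check_fault(fault: Fault, caps: SafetyCaps, active_faults: int) -> Optional[Rejection]:
    """Return None if the fault may run, otherwise a structured rejection."""
    if fault.namespace not in caps.eligible_namespaces:
        return Rejection("executor.rejected", "namespace_not_eligible")
    if fault.blast_radius_tier > caps.max_tier:
        return Rejection("executor.rejected", "blast_radius_tier_exceeded")
    if fault.duration_seconds > caps.max_duration_seconds:
        return Rejection("executor.rejected", "duration_exceeded")
    if active_faults >= caps.max_concurrent:
        return Rejection("executor.rejected", "concurrency_budget_exhausted")
    return None


if __name__ == "__main__":
    caps = SafetyCaps(frozenset({"boutique-1"}), max_tier=2,
                      max_duration_seconds=300, max_concurrent=1)
    # A fault aimed at a namespace outside the eligible set is rejected
    # before any driver would run.
    print(check_fault(Fault("kube-system", 1, 60), caps, active_faults=0))
```

Returning a value instead of raising keeps the rejection reason available to the caller as data, which fits the idea of a structured reason.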


Autonomous or directed

Run simian chaos --intent "..." to inject a directed fault from a plain-text description. Run simian serve --autonomous to start the LLM-driven planning loop, which applies per-cycle budget caps, gates each cycle on baseline health, and skips cleanly when the LLM is down.
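One cycle of the autonomous loop can be sketched in Python as follows. This is a hedged illustration, not the project's code: run_cycle, its callable parameters, the returned status strings, and the use of ConnectionError to signal an unreachable LLM are all assumptions. The stage order (health gate, topology snapshot, plan, bounded execution) follows the description above.

```python
from typing import Any, Callable, Dict, List


def run_cycle(
    is_healthy: Callable[[], bool],
    snapshot_topology: Callable[[], Dict[str, Any]],
    plan_attacks: Callable[[Dict[str, Any]], List[Any]],
    execute: Callable[[Any], None],
    max_faults_per_cycle: int,
) -> str:
    """Run one planning cycle and return a short status string."""
    # Health gate: do not inject faults into a system that is already unhealthy.
    if not is_healthy():
        return "skipped: baseline unhealthy"

    topology = snapshot_topology()

    # Assumption: an unreachable LLM surfaces as ConnectionError.
    # The cycle is skipped instead of crashing the loop.
    try:
        plan = plan_attacks(topology)
    except ConnectionError:
        return "skipped: llm unavailable"

    # Bounded execution: never run more faults than the per-cycle budget,
    # no matter how long the generated plan is.
    executed = 0
    for fault in plan[:max_faults_per_cycle]:
        execute(fault)
        executed += 1
    return f"executed {executed} fault(s)"


if __name__ == "__main__":
    status = run_cycle(
        is_healthy=lambda: True,
        snapshot_topology=lambda: {"services": ["frontend", "cartservice"]},
        plan_attacks=lambda topo: [f"delay:{s}" for s in topo["services"]],
        execute=lambda fault: print("running", fault),
        max_faults_per_cycle=1,
    )
    print(status)
```

In a real deployment each executed fault would still pass through the executor's safety checks; the per-cycle budget here only bounds how many faults the loop attempts.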


Works on GKE Dataplane V2

On clusters where the eBPF dataplane silently bypasses Chaos Mesh’s NetworkChaos, the network-policy engine handles partitions and the envoy-fault engine handles HTTP-layer delay and abort.
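To show what a NetworkPolicy-based partition can look like, the Python sketch below builds a standard Kubernetes NetworkPolicy that isolates the selected pods. It is not the manifest the network-policy engine actually generates, which this page does not show. The function isolation_policy and the policy name are hypothetical; the app=cartservice label is used as an example workload from Online Boutique.

```python
import json
from typing import Any, Dict


def isolation_policy(name: str, namespace: str, match_labels: Dict[str, str]) -> Dict[str, Any]:
    """Build a NetworkPolicy that denies all ingress and egress for matching pods.

    Listing both policy types while providing no ingress or egress rules
    means no traffic is allowed to or from the selected pods.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": match_labels},
            "policyTypes": ["Ingress", "Egress"],
        },
    }


if __name__ == "__main__":
    policy = isolation_policy("partition-cartservice", "boutique-1", {"app": "cartservice"})
    # kubectl accepts JSON manifests, so this output could be piped to `kubectl apply -f -`.
    print(json.dumps(policy, indent=2))
```

Deleting the policy ends the partition, which makes this kind of fault straightforward to bound in time.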


Install

# Build the binary
make all

# One-shot: create an arena, deploy Online Boutique, capture baseline
bin/simian sut deploy --namespace boutique-1 --create-arena

See Getting started for the first chaos fault, Deploying with Helm for the in-cluster install, or jump to the Design doc for the architecture.