Stop your agent before it errs.
Prompt injection defense + counterparty risk checks + kill switch.
Pre-action verification middleware for autonomous agents. It detects prompt injection attempts, validates counterparty risk via iAgentFi, monitors behavior for anomalies against a baseline, and provides a kill switch for when something goes wrong.
iAgentSafe will provide four primitives
Prompt injection detection
Pattern library plus LLM classifier on incoming instructions. Detects jailbreaks, context poisoning, and indirect injection via tool outputs.
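As a rough illustration of the pattern-library half of this primitive, the sketch below scans incoming text against a handful of regular expressions. The patterns, the function names, and the choice to return "review" rather than "deny" on a hit are all assumptions made for illustration; they are not iAgentSafe's actual library, and the LLM classifier stage is left out entirely.

```python
import re

# Hypothetical pattern library (assumption): a real one would be much larger
# and would be paired with a classifier, as described above.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"disregard (the |your )?(system prompt|rules)", re.I),
    re.compile(r"reveal (the |your )?system prompt", re.I),
]


def scan_for_injection(text: str) -> list:
    """Return the source of every pattern that matches the text."""
    return [p.pattern for p in INJECTION_PATTERNS if p.search(text)]


def verdict(text: str) -> str:
    """Pattern pass only: flag any hit for review, otherwise allow."""
    return "review" if scan_for_injection(text) else "allow"
```

Tool outputs can be passed through the same function as user instructions, which is one simple way to cover the indirect-injection case.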
Counterparty risk policies
Integrate iAgentFi ratings. Refuse actions with counterparties below policy threshold. Configurable per action type.
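A per-action-type threshold policy could be sketched as follows. The 0 to 100 rating scale, the class name, and the default threshold are assumptions; the text does not specify how iAgentFi ratings are represented.

```python
from dataclasses import dataclass, field


@dataclass
class CounterpartyPolicy:
    # Minimum acceptable rating per action type (0-100 scale is an assumption).
    thresholds: dict = field(default_factory=dict)
    # Fallback for action types without an explicit entry (assumed value).
    default_threshold: float = 50.0

    def decide(self, action_type: str, rating: float) -> str:
        """Deny when the counterparty's rating is below the required threshold."""
        required = self.thresholds.get(action_type, self.default_threshold)
        return "allow" if rating >= required else "deny"


policy = CounterpartyPolicy(thresholds={"payment": 80.0, "message": 30.0})
```

With this policy, a counterparty rated 72 may receive a message but not a payment.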
Behavior baseline
Learn normal agent behavior via iAgentLog. Alert on anomalies: unexpected tool calls, unusual spending, off-goal actions.
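One simple way to picture baseline checking is to flag tools never seen in the history and spending that sits far from the historical mean. The event shape (`tool`, `spend`), the z-score test, and the limit of 3.0 are assumptions for illustration only; the text does not describe the actual anomaly model or the iAgentLog record format.

```python
from statistics import mean, stdev


def build_baseline(history):
    """Summarize past events: the set of tools used and spend statistics."""
    spends = [e["spend"] for e in history]
    return {
        "tools": {e["tool"] for e in history},
        "mean": mean(spends),
        "stdev": stdev(spends),  # sample standard deviation; needs >= 2 events
    }


def anomalies(event, baseline, z_limit=3.0):
    """Return a list of anomaly labels for a single new event."""
    found = []
    if event["tool"] not in baseline["tools"]:
        found.append("unexpected_tool")
    sd = baseline["stdev"]
    if sd > 0 and abs(event["spend"] - baseline["mean"]) / sd > z_limit:
        found.append("unusual_spend")
    return found


history = [
    {"tool": "search", "spend": 1.0},
    {"tool": "search", "spend": 1.2},
    {"tool": "pay", "spend": 0.8},
    {"tool": "search", "spend": 1.0},
]
baseline = build_baseline(history)
```

Off-goal actions are harder to capture with statistics alone and are not covered by this sketch.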
Remote kill switch
One API call halts all actions for an agent. Emergency stop for supervised autonomous systems.
The gap iAgentSafe fills
Libraries exist; an integrated platform doesn't
Guardrails AI, NVIDIA NeMo Guardrails, and Lakera are all library-level. iAgentSafe will ship as an integrated service combining trust signals, behavior baselines, and real-time enforcement (Roadmap 2027, Cluster 2).
Most attacks happen at runtime
Prompt injection, indirect injection, and context manipulation happen during agent operation, not at deploy time, so deploy-time checks cannot catch them. Runtime guardrails are the defense that acts where these attacks occur.
Open-core model
The core SDK is open source under the MIT license. The hosted service, with integrations, managed policies, and incident response, is paid.
Planned endpoints in Roadmap 2027 (Cluster 2)
Preview of the planned API surface. The OpenAPI 3.1 specification is at /.well-known/openapi.yaml. Endpoints at api.iagentsafe.com are planned to start serving requests with Roadmap 2027 (Cluster 2) and return agent-consumable JSON by design.
POST /v1/check | Pre-action verification: allow, deny, or review
POST /v1/policy | Configure runtime policies
POST /v1/killswitch/{agent_id} | Emergency halt for a specific agent
GET /v1/incidents/{agent_id} | Security incident history
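Since these endpoints are not live yet, the sketch below only constructs the HTTP requests and does not send them. The host and paths come from the list above; the JSON body fields (`agent_id`, `action`), the bearer-token header, and the helper names are assumptions, and the published OpenAPI specification should be treated as authoritative once available.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://api.iagentsafe.com"  # host named in the text; planned, not live


def _headers(api_key):
    # Bearer authentication is an assumption, not a documented scheme.
    return {"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"}


def build_check_request(agent_id, action, api_key):
    """Build (but do not send) a POST /v1/check request; body shape is assumed."""
    body = json.dumps({"agent_id": agent_id, "action": action}).encode("utf-8")
    return Request(f"{BASE_URL}/v1/check", data=body, method="POST",
                   headers=_headers(api_key))


def build_killswitch_request(agent_id, api_key):
    """Build (but do not send) a POST /v1/killswitch/{agent_id} request."""
    path_id = quote(agent_id, safe="")  # escape the id for use in the URL path
    return Request(f"{BASE_URL}/v1/killswitch/{path_id}", data=b"", method="POST",
                   headers=_headers(api_key))
```

Once the service exists, a caller would pass either request to `urllib.request.urlopen` and branch on the allow, deny, or review result before letting the agent act.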
iAgentSafe is one layer
Sixteen products. One stack. One entity. Trust, discovery, observability, payments, safety, simulation, composition, memory, identity, legal, markets, and owned compute underneath. Each layer reinforces the others. Use one or use them all.