Agent Collaboration
How Agents Collaborate
Agents pass events over the SLCP bus and hand off work autonomously. Depending on the fault type, the flow branches into field dispatch or self-healing.
DERA analyzes telemetry in real time with an LSTM autoencoder and publishes an anomaly.detected event to SLCP on detection.
ATLAS reasons over diagnostics with an LLM, references past fault patterns, then decides: remote recovery, field dispatch, or self-healing.
Field Master selects and dispatches the optimal technician via H3 spatial indexing.
Field Buddy provides repair guidance and diagnostics, and auto-generates reports.
Produces a diagnosis and fix plan in three steps: What → Analyze → Do.
Handles faults through detect → isolate → recover → verify.
Tests in sandbox, then approves deployment.
APEX — Asset placement optimization via H3 spatial indexing. Runs independently in parallel with the flow above.
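The handoff described above can be pictured with a minimal in-process sketch. It is not the SLCP implementation: the EventBus class, the route_fault rule, the event fields, and every topic name other than anomaly.detected are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """In-process stand-in for the SLCP bus: topic -> subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Deliver the event to every handler registered for this topic.
        for handler in list(self._subscribers[topic]):
            handler(event)


def route_fault(event: dict) -> str:
    """Hypothetical branching rule; the real decision logic is not documented here."""
    if event["fault_type"] == "software":
        return "self_healing.requested"
    if event.get("remote_recoverable", False):
        return "remote_recovery.requested"
    return "field_dispatch.requested"


bus = EventBus()
handoffs: List[str] = []


def on_anomaly(event: dict) -> None:
    # The deciding agent picks a branch and hands the event to the next agent.
    next_topic = route_fault(event)
    handoffs.append(next_topic)
    bus.publish(next_topic, event)


bus.subscribe("anomaly.detected", on_anomaly)
bus.publish("anomaly.detected", {"asset_id": "asset-001", "fault_type": "hardware"})
bus.publish("anomaly.detected", {"asset_id": "asset-002", "fault_type": "software"})
print(handoffs)  # ['field_dispatch.requested', 'self_healing.requested']
```

Each agent only knows the topics it subscribes to and publishes, which is what lets one detection event fan out into different branches without agents calling each other directly.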
Real-World Scenario
What Happens When a Fault Occurs?
From first signal to completed repair, all handled by AI. Nobody gets woken up.
Asset Goes Offline
An energy asset at a highway rest area stops responding. No user report, no staff on shift. DERA detects the telemetry anomaly.
AI Diagnosis
DERA cross-analyzes 7 error signals via LSTM. Result: 'Module #3 hardware fault, confidence 94%.' Diagnostic report is forwarded to ATLAS.
Autonomous Decision
ATLAS finds a similar fault resolved 3 months ago (91% success rate). Remote reset attempted → fails → field dispatch decided.
Technician Dispatch
Field Master uses H3 spatial indexing to select a technician 8 km away with the best certification and the highest resolution rate.
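A rough sketch of how such a selection could work is shown below. It swaps H3 indexing for a plain haversine distance filter to stay dependency-free, and the Technician fields, the 20 km radius, and the ranking order (certification, then resolution rate, then distance) are all assumptions rather than Field Master's actual logic.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Technician:
    name: str
    lat: float
    lon: float
    certification_level: int  # hypothetical scale, higher is better
    resolution_rate: float    # fraction of past jobs resolved, 0.0 to 1.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres (stand-in for an H3 neighbourhood lookup)."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def select_technician(techs: List[Technician], site_lat: float, site_lon: float,
                      max_km: float = 20.0) -> Optional[Technician]:
    """Filter by distance, then rank by certification, resolution rate, and proximity."""
    in_range = [
        (t, haversine_km(t.lat, t.lon, site_lat, site_lon)) for t in techs
    ]
    in_range = [(t, d) for t, d in in_range if d <= max_km]
    if not in_range:
        return None
    best, _ = max(
        in_range,
        key=lambda td: (td[0].certification_level, td[0].resolution_rate, -td[1]),
    )
    return best


technicians = [
    Technician("A", 37.05, 127.05, certification_level=3, resolution_rate=0.92),
    Technician("B", 37.50, 127.50, certification_level=3, resolution_rate=0.99),
    Technician("C", 37.01, 127.01, certification_level=2, resolution_rate=0.95),
]
chosen = select_technician(technicians, site_lat=37.0, site_lon=127.0)
print(chosen.name if chosen else "no technician in range")
```

Technician B has the best resolution rate but is outside the radius, and C is closest but less certified, so the example picks A.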
On-Site Repair
Technician opens Field Buddy, uploads a photo of the damaged module, and receives precise replacement instructions. Done in 35 minutes. Report auto-generated.
System-Wide Learning
ATLAS records the success pattern, DERA refines the fault signature, Field Master updates technician scores. Next time: faster.
Self-Healing
A Platform That Fixes Itself
When a problem occurs in code, AI detects it, writes a fix, tests it, and deploys. Critical changes require human approval.
Reads signals, identifies the problem, and drafts a fix plan.
Autonomously writes and verifies code per the plan.
Tests in sandbox, then approves deployment.
* Critical changes require human approval before deployment
Foundation
What Powers the Agents
All agents run on a shared foundation. Consistency, safety, and cost efficiency are guaranteed automatically.
The platform control room. Every agent checks in with BOB at startup to receive config, knowledge base, safety rules, and cost limits. No agent starts without BOB.
The shared skeleton for all AI agents. Integrates lifecycle management, LLM calls, guardrail validation, model routing, cost tracking, and Platform SDK in one framework. Developers focus on business logic only.
Claude AI + Tier Routing
SAR auto-selects a Light, Medium, or High model tier based on task complexity, reducing unnecessary LLM calls to optimize cost.
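As an illustration only, tier routing of this kind might look like the sketch below. The complexity heuristic, the thresholds, and the function names are assumptions; how SAR actually scores a task is not described here.

```python
from enum import Enum


class Tier(Enum):
    LIGHT = "light"
    MEDIUM = "medium"
    HIGH = "high"


def estimate_complexity(prompt: str, needs_reasoning: bool) -> float:
    """Hypothetical heuristic: longer prompts and reasoning tasks score higher (0.0 to 1.0)."""
    length_score = min(len(prompt.split()) / 200, 1.0) * 0.5
    reasoning_score = 0.5 if needs_reasoning else 0.0
    return length_score + reasoning_score


def select_tier(complexity: float) -> Tier:
    """Map a complexity score to a model tier; thresholds are illustrative."""
    if complexity < 0.3:
        return Tier.LIGHT
    if complexity < 0.7:
        return Tier.MEDIUM
    return Tier.HIGH


status_tier = select_tier(estimate_complexity("Summarize asset status", needs_reasoning=False))
diagnosis_tier = select_tier(estimate_complexity("Diagnose the fault", needs_reasoning=True))
long_tier = select_tier(estimate_complexity("log line " * 200, needs_reasoning=True))
print(status_tier.value, diagnosis_tier.value, long_tier.value)
```

Keeping the scoring and the tier mapping as separate functions makes it possible to tune thresholds without touching how complexity is estimated.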
Shared Knowledge
Shared knowledge via PostgreSQL + Qdrant vector search. Experience from one agent is immediately available to all.
SLCP Event Bus
A single event bus built on Mosquitto (MQTT). A DERA detection reaches ATLAS in under 5 ms. No inter-layer gRPC.
SAR Guardrails
PII detection, prompt injection blocking, output contradiction detection, impact scope check. Auto-applied to all agents.
Semantic Cache
Caches similar queries via vector search. Reduces repeated LLM calls, cutting AI costs by 30–40%.
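The idea can be sketched in a few lines, under heavy assumptions: the embed function below is a toy letter-count vector standing in for a real embedding model, the 0.95 similarity threshold is arbitrary, and call_llm is a stub. None of these reflect the platform's actual cache.

```python
import math
from typing import List, Optional, Tuple


def embed(text: str) -> List[float]:
    """Toy letter-count embedding; a placeholder for a real embedding model."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns a stored answer when a new query is similar enough to a cached one."""

    def __init__(self, threshold: float = 0.95) -> None:
        self.threshold = threshold
        self.entries: List[Tuple[List[float], str]] = []

    def get(self, query: str) -> Optional[str]:
        q = embed(query)
        best = max(self.entries, key=lambda e: cosine(q, e[0]), default=None)
        if best is not None and cosine(q, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, query: str, answer: str) -> None:
        self.entries.append((embed(query), answer))


llm_calls = 0


def call_llm(query: str) -> str:
    """Stub for an expensive LLM call; counts invocations."""
    global llm_calls
    llm_calls += 1
    return f"answer to: {query}"


def answer(cache: SemanticCache, query: str) -> str:
    cached = cache.get(query)
    if cached is not None:
        return cached  # cache hit: no LLM call
    result = call_llm(query)
    cache.put(query, result)
    return result


cache = SemanticCache()
first = answer(cache, "Why is module 3 offline?")
second = answer(cache, "why is module 3 offline")           # near-duplicate query
third = answer(cache, "Schedule maintenance for Tuesday")   # unrelated query
print(llm_calls)
```

The second query differs only in casing and punctuation, so it is served from the cache, while the unrelated third query falls below the threshold and triggers a new call.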
Platform SDK
Unified access to all platform data: assets, sessions, etc. Circuit breaker and auto-retry for reliable operation.
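A minimal sketch of the circuit breaker and retry pattern follows. It is a generic illustration rather than the Platform SDK's API: the class name, parameters, and the flaky_fetch and always_down functions are hypothetical, and backoff delays between retries are left out for brevity.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when calls are skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._failure_threshold = failure_threshold
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[..., Any], *args: Any, retries: int = 2, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after_s:
                raise CircuitOpenError("circuit open; skipping call")
            # Half-open: allow a trial call, but a single failure reopens the circuit.
            self._opened_at = None
            self._failures = self._failure_threshold - 1
        last_error: Optional[Exception] = None
        for _ in range(retries + 1):
            try:
                result = fn(*args, **kwargs)
            except Exception as exc:
                last_error = exc
                self._failures += 1
                if self._failures >= self._failure_threshold:
                    self._opened_at = self._clock()  # open the circuit
                    break
            else:
                self._failures = 0  # any success closes the circuit
                return result
        assert last_error is not None
        raise last_error


attempts = 0


def flaky_fetch() -> str:
    """Hypothetical data call that fails twice, then succeeds."""
    global attempts
    attempts += 1
    if attempts < 3:
        raise ConnectionError("temporary failure")
    return "ok"


def always_down() -> str:
    raise ConnectionError("service unavailable")


breaker = CircuitBreaker(failure_threshold=3)
recovered = breaker.call(flaky_fetch, retries=2)

down_breaker = CircuitBreaker(failure_threshold=3)
try:
    down_breaker.call(always_down, retries=2)
except ConnectionError:
    pass
try:
    down_breaker.call(always_down)
    skipped = False
except CircuitOpenError:
    skipped = True
print(recovered, skipped)
```

Retries absorb short-lived failures, while the breaker stops callers from hammering a dependency that keeps failing until the reset window has passed.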