Remediator Agent
Overview
The Remediator Agent is an autonomous AI assistant that runs inside your Kubernetes clusters on behalf of platform engineering teams. Unlike traditional monitoring tools that only alert on issues, it analyzes problems, generates fixes, and can automatically remediate policy violations, all while integrating with your existing GitOps workflows.
Key Features
- Violation Detection – Monitors for policy violations reported by Kyverno across all clusters
- Guided Remediation – AI generates secure, policy-compliant fixes with clear explanations
- Automated Actions – Creates Pull Requests to update resources in Git repositories, enabling GitOps-friendly remediation
- Multi-Cluster Support – Manage policy compliance across hundreds of clusters through ArgoCD
- Audit Trail – Complete logging for compliance reporting and troubleshooting
Benefits
- Automated detection of policy violations across clusters
- AI-generated fixes with clear explanations
- Git workflow integration - no backdoor changes
- Scheduled operation - runs automatically
- Reduces resolution times from days to minutes
- Scales governance without adding headcount
Core Concepts
Custom Resources
The Remediator Agent uses three Kubernetes Custom Resources for configuration:
- Remediator: Main configuration defining what to scan, when to run, and what actions to take
- LLMConfig: AI provider settings (defaults to Nirmata AI, supports AWS Bedrock and Azure OpenAI)
- ToolConfig: Integration settings for Git providers (GitHub, GitLab)
Environment Types
How the agent discovers what to scan:
- Hub Mode: Uses ArgoCD to manage multiple clusters from a central location
- Local Mode: Scans the same cluster where the agent is installed
Targets
What the agent monitors for violations:
- Clusters: Specific Kubernetes clusters by name or server URL
- Applications: ArgoCD applications you want to monitor
- Namespaces: Specific namespaces within clusters
Actions
What the agent does when it finds violations:
- Create Pull Request: Opens a PR in your Git repository with the fix
- Dry Run: Shows what would be changed without making any modifications
Policy Violations
Security, compliance, or configuration problems in your Kubernetes resources detected by Kyverno. Examples include missing resource limits, incorrect security settings, or outdated configurations. The agent processes Kyverno ClusterPolicyReports with fail status results.
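To see which reports currently contain failures, the counts can be read directly from the cluster. The sketch below assumes the standard Kyverno PolicyReport schema, in which each report carries a summary.fail count; both function names are illustrative and not part of the Remediator Agent.

```shell
# List ClusterPolicyReports with their fail counts.
# Assumes the standard PolicyReport schema (summary.fail); the function
# names here are hypothetical helpers, not part of the agent.
list_report_fail_counts() {
  kubectl get clusterpolicyreports \
    -o custom-columns='NAME:.metadata.name,FAIL:.summary.fail' \
    --no-headers
}

# Keep only the names of reports with at least one failing result.
# Reports without a numeric count (printed as <none>) are skipped.
reports_with_failures() {
  list_report_fail_counts | awk '$2 ~ /^[0-9]+$/ && $2 > 0 { print $1 }'
}
```

Running reports_with_failures against a cluster prints one report name per line; an empty result suggests there is nothing for the agent to remediate at the cluster scope.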
Schedules & Triggers
When the agent runs:
- Cron Schedule: Set specific times (like every 6 hours)
- Manual Trigger: Run on-demand through the Kubernetes API
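The schedule examples in this guide use the five-field crontab form, such as "0 */6 * * *" (minute 0 of every sixth hour). As a minimal sketch, a hypothetical helper can check the field count before an expression is placed in a Remediator; it does not validate the value ranges, and it is not part of the agent.

```shell
# Succeed if the expression has exactly five whitespace-separated fields
# (minute, hour, day of month, month, day of week).
# Hypothetical helper: checks the field count only, not the value ranges.
is_five_field_cron() {
  # Run in a subshell so "set -f" (which stops "*" from expanding into
  # file names) and the positional parameters do not leak to the caller.
  ( set -f; set -- $1; [ "$#" -eq 5 ] )
}

# Example expressions:
#   "0 */6 * * *"   minute 0 of every sixth hour
#   "30 2 * * 1"    02:30 every Monday
```

For example, is_five_field_cron "0 */6 * * *" succeeds, while an expression with a missing field fails.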
Getting Started
The Remediator Agent automatically identifies and fixes policy violations in your Kubernetes clusters and Git repositories using AI-powered remediation. This guide will get you up and running quickly.
Prerequisites
Before installing the Remediator Agent, ensure you have:
Required Components
- Kubernetes Cluster: Running Kubernetes 1.20+
- Helm: Helm 3.x installed and configured
- kubectl: Configured to access your cluster
- ArgoCD (optional): ArgoCD installed (for hub-spoke setups)
Authentication Requirements
- Nirmata API Token: Your personal Nirmata Control Hub (NCH) token. If you don’t have an account, sign up for a 15-day free trial to get your API token.
Quick Installation
Create Namespace and Secrets
# Create namespace
kubectl create namespace nirmata
# Create Nirmata API token secret
kubectl create secret generic nirmata-api-token \
  --from-literal=api-token=YOUR_NIRMATA_API_TOKEN \
  --namespace nirmata
Install the Remediator Agent
Add and update Helm repo:
helm repo add nirmata https://nirmata.github.io/kyverno-charts
helm repo update nirmata
Install the Helm chart:
helm install remediator nirmata/remediator-agent --devel \
  --namespace nirmata \
  --create-namespace \
  --set nirmata.apiTokenSecret="nirmata-api-token"
Configuration
1. Setup ToolConfig
The ToolConfig defines how the agent connects to your Git provider.
For GitHub using a Personal Access Token:
# Create secret
kubectl create secret generic github-pat-token \
  --from-literal=token=GITHUB_PAT_TOKEN \
  --namespace nirmata
# Create ToolConfig
kubectl apply -f - <<EOF
apiVersion: serviceagents.nirmata.io/v1alpha1
kind: ToolConfig
metadata:
  name: toolconfig-sample
  namespace: nirmata
spec:
  type: github
  credentials:
    method: pat
    pat:
      tokenSecretRef:
        name: github-pat-token
        namespace: nirmata
        key: token
  defaults:
    git:
      pullRequests:
        branchPrefix: "remediation-"
        titleTemplate: "[Auto-Remediation] Fix policy violations: "
        commitMessageTemplate: "Auto-fix: Remediate policy violations: "
EOF
For GitHub using a GitHub App (Recommended):
- Install the Nirmata GitHub App in your organization
- Contact Nirmata support for the private key
# Create secret with private key
kubectl create secret generic github-app-secret \
  --from-file=private-key.pem="/path/to/pem/file" \
  --namespace=nirmata
# Create ToolConfig
kubectl apply -f - <<EOF
apiVersion: serviceagents.nirmata.io/v1alpha1
kind: ToolConfig
metadata:
  name: toolconfig-sample
  namespace: nirmata
spec:
  type: github
  credentials:
    method: app
    app:
      appId: APP_ID
      privateKeySecretRef:
        name: github-app-secret
        namespace: nirmata
        key: private-key.pem
EOF
For GitLab:
# Create secret
kubectl create secret generic gitlab-pat-token \
  --from-literal=token=GITLAB_PAT_TOKEN \
  --namespace=nirmata
# Create ToolConfig
kubectl apply -f - <<EOF
apiVersion: serviceagents.nirmata.io/v1alpha1
kind: ToolConfig
metadata:
  name: toolconfig-sample
  namespace: nirmata
spec:
  type: gitlab
  credentials:
    method: pat
    pat:
      secretRef:
        name: gitlab-pat-token
        namespace: nirmata
        key: token
EOF
2. Setup LLMConfig
Using Nirmata AI (Default & Recommended):
The Helm chart automatically creates the LLMConfig when you provide the nirmata-api-token secret. No additional configuration is needed.
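Before creating an LLMConfig by hand, it can help to confirm whether one already exists. The sketch below is a hypothetical helper; it assumes the resource uses the name and namespace shown in the manual example in this guide (remediator-agent-llm in nirmata), which may differ from what your chart version creates.

```shell
# Succeed if an LLMConfig with the given name exists.
# Defaults are taken from the manual example in this guide and are an
# assumption about what the chart creates; pass other values if needed.
llmconfig_exists() {
  kubectl get llmconfigs "${1:-remediator-agent-llm}" \
    --namespace "${2:-nirmata}" >/dev/null 2>&1
}
```

For example, llmconfig_exists || echo "LLMConfig not found" reports when the manual step is still required.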
If you need to create it manually:
kubectl apply -f - <<EOF
apiVersion: serviceagents.nirmata.io/v1alpha1
kind: LLMConfig
metadata:
  name: remediator-agent-llm
  namespace: nirmata
spec:
  type: nirmataAI
  nirmataAI:
    endpoint: https://nirmata.io
    model: "" # Optional: specify a model, otherwise uses default
    apiKeySecretRef:
      name: nirmata-api-token
      key: api-token
      namespace: nirmata
EOF
Using AWS Bedrock (Alternative):
For EKS clusters with Pod Identity Agent:
# Create IAM role and policy (see full AWS setup below)
# Then create LLMConfig
kubectl apply -f - <<EOF
apiVersion: serviceagents.nirmata.io/v1alpha1
kind: LLMConfig
metadata:
  name: remediator-agent-llm
  namespace: nirmata
spec:
  type: bedrock
  bedrock:
    model: MODEL_ARN_OR_INFERENCE_ARN
    region: AWS_REGION
EOF
Full AWS Bedrock Setup Instructions
# Create IAM role
aws iam create-role \
  --role-name remediator-agent-role \
  --assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": { "Service": "pods.eks.amazonaws.com" },
        "Action": [ "sts:AssumeRole", "sts:TagSession" ]
      }
    ]
  }'
# Attach Bedrock permissions
aws iam put-role-policy \
  --role-name remediator-agent-role \
  --policy-name BedrockInvokePolicy \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "BedrockInvoke",
        "Effect": "Allow",
        "Action": [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream"
        ],
        "Resource": "arn:aws:bedrock:<AWS_REGION>:<AWS_ACCOUNT_ID>:application-inference-profile/<PROFILE>"
      }
    ]
  }'
# Create Pod Identity association
aws eks create-pod-identity-association \
  --cluster-name <CLUSTER_NAME> \
  --namespace nirmata \
  --service-account remediator-agent \
  --role-arn arn:aws:iam::<ACCOUNT_ID>:role/remediator-agent-role
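Once the association is created, it can be confirmed from the AWS CLI. The sketch below wraps the list-pod-identity-associations command in a hypothetical helper; the namespace and service account are the ones used in the setup commands, and the cluster name is supplied by the caller.

```shell
# List Pod Identity associations for the agent's service account.
# The namespace and service account match the setup commands in this
# guide; the function name is a hypothetical convenience wrapper.
show_agent_pod_identity() {
  aws eks list-pod-identity-associations \
    --cluster-name "$1" \
    --namespace nirmata \
    --service-account remediator-agent
}
```

Calling show_agent_pod_identity with your cluster name should return an entry for the association; an empty list suggests the association was not created in that cluster.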
Using Azure OpenAI (Alternative):
# Create secret
kubectl create secret generic azure-openai-credentials \
  --from-literal=api-key=AZURE_API_KEY \
  -n nirmata
# Create LLMConfig (apiKeySecretRef must name the secret created above)
kubectl apply -f - <<EOF
apiVersion: serviceagents.nirmata.io/v1alpha1
kind: LLMConfig
metadata:
  name: remediator-agent-llm
  namespace: nirmata
spec:
  type: azure-openai
  azureOpenAI:
    endpoint: https://YOUR_RESOURCE_NAME.openai.azure.com/
    deploymentName: DEPLOYMENT_NAME
    apiKeySecretRef:
      name: azure-openai-credentials
      key: api-key
      namespace: nirmata
EOF
3. Setup Remediator
For ArgoCD Hub Mode (Multi-Cluster):
kubectl apply -f - <<EOF
apiVersion: serviceagents.nirmata.io/v1alpha1
kind: Remediator
metadata:
  name: remediator-argo-hub
  namespace: nirmata
spec:
  environment:
    type: argoHub
  target:
    argoHubTarget:
      argoAppSelector:
        allApps: true
  remediation:
    triggers:
      - schedule:
          crontab: "0 */6 * * *"
    llmConfigRef:
      name: remediator-agent-llm
      namespace: nirmata
    gitCredentials:
      name: toolconfig-sample
      namespace: nirmata
    actions:
      - type: CreatePR
        toolRef:
          name: toolconfig-sample
          namespace: nirmata
EOF
For Local Cluster Mode:
First, create a ConfigMap mapping repositories to namespaces:
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: repo-namespace-mapping
  namespace: nirmata
data:
  mapping: |
    [
      {
        "repo": "https://github.com/your-org/your-repo",
        "branch": "main",
        "path": "k8s/",
        "targetNamespace": "default"
      }
    ]
EOF
Then create the Remediator:
kubectl apply -f - <<EOF
apiVersion: serviceagents.nirmata.io/v1alpha1
kind: Remediator
metadata:
  name: remediator-local-cluster
  namespace: nirmata
spec:
  environment:
    type: localCluster
  target:
    localCluster:
      repoNamespaceMappingRef:
        name: repo-namespace-mapping
        namespace: nirmata
        key: mapping
  remediation:
    triggers:
      - schedule:
          crontab: "0 */6 * * *"
    llmConfigRef:
      name: remediator-agent-llm
      namespace: nirmata
    gitCredentials:
      name: toolconfig-sample
      namespace: nirmata
    actions:
      - type: CreatePR
        toolRef:
          name: toolconfig-sample
          namespace: nirmata
EOF
Verify Installation
# Check if pods are running
kubectl get pods -n nirmata -l app.kubernetes.io/name=remediator-agent
# Check custom resources
kubectl get llmconfigs,toolconfigs,remediators -n nirmata
# Check logs
kubectl logs -n nirmata -l app.kubernetes.io/name=remediator-agent --tail=50
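For scripted installs, the pod check can be turned into a blocking wait. This is a minimal sketch using kubectl wait with the same label selector as the verification commands; the function name and the 120-second default timeout are arbitrary choices, not values from the product.

```shell
# Block until the agent pods report Ready, or fail after the timeout.
# The label selector matches the verification commands in this guide;
# the default timeout is an arbitrary assumption.
wait_for_remediator() {
  kubectl wait pods \
    --namespace nirmata \
    --selector app.kubernetes.io/name=remediator-agent \
    --for=condition=Ready \
    --timeout="${1:-120s}"
}
```

A non-zero exit status means the pods did not become Ready in time, at which point the log command is the next place to look.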
Advanced Configuration
Target Specific Clusters:
target:
  argoHubTarget:
    clusterNames:
      - production-cluster
      - staging-cluster
    clusterServerUrls:
      - "https://prod.example.com"
    argoAppSelector:
      allApps: true
Target Specific Applications:
target:
  argoHubTarget:
    argoAppSelector:
      names:
        - nginx-demo
        - web-app
      labelSelector:
        matchLabels:
          team: platform
          environment: production
Filter by Policy Severity:
remediation:
  filters:
    policySelector:
      matchSeverity:
        - high
        - critical
Observability
The Remediator Agent exposes Prometheus metrics for monitoring and troubleshooting.
Available Metrics
- remediator_reconciles_total (counter) – labels: result="success|error"
- remediator_reconcile_duration_seconds (histogram) – labels: result="success|error"
Quick Setup
Enable Service Monitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: go-agent-remediator-metrics
  namespace: go-agent-remediator-system
spec:
  selector:
    matchLabels:
      control-plane: controller-manager
  endpoints:
    - port: https
      path: /metrics
      scheme: https
      tlsConfig:
        insecureSkipVerify: true
View Metrics:
# Port-forward to metrics endpoint
kubectl -n go-agent-remediator-system port-forward deploy/go-agent-remediator-controller-manager 8443:8443
# Get token and view metrics
SA=go-agent-remediator-controller-manager
NS=go-agent-remediator-system
TOKEN=$(kubectl -n $NS create token $SA)
curl -k -H "Authorization: Bearer $TOKEN" https://localhost:8443/metrics
Example Prometheus Queries
# Success rate
sum(rate(remediator_reconciles_total{result="success"}[1h]))
/
sum(rate(remediator_reconciles_total[1h]))
# P95 latency
histogram_quantile(0.95,
sum by (le) (rate(remediator_reconcile_duration_seconds_bucket[1h]))
)
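Without a Prometheus server, a similar success ratio can be estimated from a single raw scrape of the metrics endpoint. The sketch below is a hypothetical helper that reads the Prometheus text format on stdin; it assumes samples carry no trailing timestamp, and it reports the cumulative ratio since the process started rather than the one-hour rate used in the query.

```shell
# Read Prometheus text-format metrics on stdin and print the fraction of
# reconciles labelled result="success".
# Assumes each sample line ends with its value (no trailing timestamp).
# The ratio is cumulative since process start, not a one-hour rate.
reconcile_success_ratio() {
  awk '
    /^remediator_reconciles_total[{ ]/ {
      total += $NF
      if ($0 ~ /result="success"/) ok += $NF
    }
    END {
      if (total > 0) printf "%.3f\n", ok / total
      else print "no data"
    }
  '
}
```

For example, the curl command from the View Metrics step can be piped into it: curl -k -H "Authorization: Bearer $TOKEN" https://localhost:8443/metrics | reconcile_success_ratio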
Support Matrix
- Kubernetes: All CNCF-compliant distributions (v1.20+), including vanilla Kubernetes and on-premises installations
- AI Providers: Nirmata AI (default), AWS Bedrock, Azure OpenAI
- GitOps: ArgoCD
- VCS: GitHub (App & PAT), GitLab (Enterprise & SaaS)
- Manifests: YAML files, simple Helm charts
Common Use Cases
- Policy Compliance Automation: Automatically fix security policy violations across your clusters
- GitOps Integration: Generate pull requests with fixes that integrate with your GitOps workflows
- Multi-Cluster Management: Manage policy compliance across multiple clusters from a central hub
- Continuous Compliance: Achieve continuous compliance instead of point-in-time checks
Uninstallation
helm uninstall remediator -n nirmata
Note: This removes the deployment and CRDs but preserves any secrets you have created. They need to be cleaned up manually.
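The manual secret cleanup can be scripted. This is a minimal sketch with a hypothetical function name; the secret names are the ones created by the kubectl commands in this guide, and --ignore-not-found skips any that were never created. If you created an Azure OpenAI secret, add its name to the list.

```shell
# Delete the secrets created while following this guide.
# Names match the "kubectl create secret" commands in the setup steps;
# --ignore-not-found skips any that do not exist.
cleanup_remediator_secrets() {
  kubectl delete secret \
    nirmata-api-token \
    github-pat-token \
    github-app-secret \
    gitlab-pat-token \
    --namespace nirmata \
    --ignore-not-found
}
```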
Value Proposition
By deploying the Remediator Agent, platform engineering teams can:
- Operate at Scale: Manage 10x more clusters without proportional team growth
- Reduce Toil: Eliminate 80% of repetitive compliance and governance tasks
- Improve Compliance: Achieve continuous compliance instead of point-in-time checks
- Faster Remediation: Reduce resolution times from days to minutes
- Better Security Posture: Catch and fix vulnerabilities before they’re exploited
- Empower Developers: Provide fast feedback and automated fixes to development teams
The Remediator Agent doesn’t replace platform engineers—it amplifies their capabilities, allowing them to focus on innovation and strategic work while ensuring operational excellence at scale.