AI Analysis
Leverage AI-powered insights to understand and resolve infrastructure drift.
Overview
When drift is detected, Controlinfra uses AI (Claude or GPT-4) to analyze each change and provide:
- Root Cause Analysis: Why the drift occurred
- Impact Assessment: What systems are affected
- Blast Radius: Scope of potential issues
- Severity Scoring: How urgent the fix is
- Fix Recommendations: Code to resolve the drift
- Prevention Tips: How to avoid future drift
How It Works
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Drift     │     │     AI      │     │  Analysis   │
│  Detected   │────▶│    Model    │────▶│   Report    │
│  (changes)  │     │ (Claude/GPT)│     │ (insights)  │
└─────────────┘     └─────────────┘     └─────────────┘
The AI receives:
- The Terraform resource configuration
- The detected changes (diff)
- Resource type and context
- Previous drift patterns (if any)
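As a rough illustration of the four inputs listed above, the sketch below bundles them into one payload and renders a plain-text prompt. The `DriftContext` class, its field names, and the prompt wording are hypothetical; Controlinfra's actual request format is not described in this document.

```python
from dataclasses import dataclass, field


@dataclass
class DriftContext:
    """Hypothetical container for the inputs the AI receives."""
    resource_type: str          # e.g. "aws_security_group"
    resource_address: str       # e.g. "aws_security_group.web"
    configuration: str          # Terraform source for the resource
    diff: str                   # the detected changes
    previous_patterns: list[str] = field(default_factory=list)


def build_prompt(ctx: DriftContext) -> str:
    """Render the context as a plain-text prompt (illustrative wording only)."""
    sections = [
        f"Resource: {ctx.resource_address} ({ctx.resource_type})",
        "Terraform configuration:\n" + ctx.configuration,
        "Detected changes:\n" + ctx.diff,
    ]
    # Previous drift patterns are optional, so include them only when present.
    if ctx.previous_patterns:
        history = "\n".join(f"- {p}" for p in ctx.previous_patterns)
        sections.append("Previous drift patterns:\n" + history)
    return "\n\n".join(sections)
```

Keeping the optional history separate from the required fields mirrors the "(if any)" qualifier in the list.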
Analysis Components
Root Cause Analysis
The AI examines the drift and identifies likely causes:
Root Cause Analysis
───────────────────
The AWS security group "web-sg" was modified directly in the
AWS console to add an inbound rule allowing SSH (port 22)
from 0.0.0.0/0.
Likely scenario: A developer needed emergency SSH access for
debugging and added the rule manually rather than through
Terraform. This is a common pattern when quick access is
needed but proper change management wasn't followed.
Evidence:
• The change adds a permissive SSH rule
• The rule description contains "temp" suggesting temporary access
• Similar patterns have occurred in this repository before
Impact Assessment
Understanding what systems and users are affected:
Impact Assessment
─────────────────
Direct Impact:
• 3 EC2 instances in the web-tier security group
• Production traffic (12,000 requests/hour) flows through these
Secondary Impact:
• Application servers become reachable from the internet
• SSH brute force attacks become possible
• Compliance audit may flag this as a violation
Business Impact:
• PCI-DSS compliance at risk
• Potential security incident if exploited
• May trigger security team alerts
Blast Radius
Visualizing the scope of affected resources:
Blast Radius
────────────
Security Group: web-sg
├── EC2: web-server-1 (i-abc123)
├── EC2: web-server-2 (i-def456)
├── EC2: web-server-3 (i-ghi789)
└── ALB: web-load-balancer
Downstream dependencies:
├── RDS: main-database (accessible from web tier)
└── ElastiCache: session-store
Total affected resources: 7
Severity Explanation
Why the AI assigned a particular severity:
Severity: CRITICAL 🔴
────────────────────
This drift is classified as CRITICAL because:
1. Security Exposure (Primary Factor)
- Port 22 (SSH) is exposed to 0.0.0.0/0
- This allows access from any IP address globally
2. Resource Type (Security Group)
- Security groups are security-critical resources
- Changes have immediate network-level impact
3. Production Environment
- Affects production instances
- Real user traffic is at risk
4. Compliance Impact
- Likely violates security policies
- May trigger compliance audit findings
Recommendation: Address within 24 hours
Fix Recommendations
Detailed guidance on how to resolve the drift:
Fix Recommendations
───────────────────
Option 1: Revert the Change (Recommended)
─────────────────────────────────────────
If SSH access from anywhere is not needed, remove the rule:
$ terraform apply
This will:
• Remove the 0.0.0.0/0 SSH rule
• Restore the security group to its defined state
• Take effect immediately
Option 2: Update Terraform (If Intentional)
────────────────────────────────────────────
If SSH access is legitimately needed, update your Terraform:
resource "aws_security_group" "web" {
  name = "web-sg"
  # Existing rules...
  # Add SSH access (restrict to known IPs)
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"] # Internal only
    description = "SSH from internal network"
  }
}
Option 3: Use Bastion Host
──────────────────────────
For secure SSH access, consider a bastion host pattern:
module "bastion" {
  source = "terraform-aws-modules/ec2-instance/aws"
  # Bastion configuration...
}
Prevention Recommendations
How to avoid similar drift in the future:
Prevention Recommendations
──────────────────────────
1. Implement AWS Config Rules
Create a rule to detect overly permissive security groups:
resource "aws_config_config_rule" "restricted_ssh" {
  name = "restricted-ssh"
  source {
    owner             = "AWS"
    source_identifier = "INCOMING_SSH_DISABLED"
  }
}
2. Use Service Control Policies (SCPs)
Prevent manual security group modifications:
{
  "Effect": "Deny",
  "Action": [
    "ec2:AuthorizeSecurityGroupIngress"
  ],
  "Resource": "*",
  "Condition": {
    "ForAnyValue:StringEquals": {
      "ec2:IpRange": "0.0.0.0/0"
    }
  }
}
3. Enable CloudTrail Alerts
Alert on security group changes:
EventPattern: {
  "source": ["aws.ec2"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventName": ["AuthorizeSecurityGroupIngress", "RevokeSecurityGroupIngress"]
  }
}
4. Regular Controlinfra Scans
Schedule daily scans to catch drift early
Viewing AI Analysis
In Scan Results
Click on any drift to expand the AI analysis:
┌──────────────────────────────────────────────────────────┐
│ aws_security_group.web                       🔴 CRITICAL │
├──────────────────────────────────────────────────────────┤
│ Change: ingress rule modified                            │
│                                                          │
│ [Root Cause] [Impact] [Fix] [Prevent]                    │
│                                                          │
│ Root Cause Analysis                                      │
│ ───────────────────                                      │
│ The security group was modified directly in AWS...       │
│                                                          │
│ [Show Full Analysis] [Copy Fix Code] [Create PR]         │
└──────────────────────────────────────────────────────────┘
Analysis Tabs
Switch between analysis sections:
| Tab | Content |
|---|---|
| Root Cause | Why the drift happened |
| Impact | What's affected |
| Fix | Code to resolve |
| Prevent | Future prevention |
AI Accuracy
Strengths
The AI excels at:
- Understanding Terraform resource relationships
- Identifying security implications
- Generating correct fix code
- Suggesting best practices
Limitations
Keep in mind:
- The AI has no real-time AWS access
- Context is limited to the diff and the Terraform code
- Business context may be unknown
- Always review recommendations before applying them
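Because recommendations should be reviewed rather than applied blindly, one option is to dry-run a suggested fix with `terraform plan` first. The sketch below is one possible wrapper and is not part of Controlinfra: `review_plan` and `interpret_plan_exit` are hypothetical helper names, and it assumes the `terraform` CLI is on the PATH and the working directory has already been initialized.

```python
import subprocess

# Exit codes documented for `terraform plan -detailed-exitcode`.
PLAN_EXIT_MEANING = {
    0: "no changes: infrastructure matches configuration",
    1: "error: plan failed",
    2: "changes present: review the plan before applying",
}


def interpret_plan_exit(code: int) -> str:
    """Translate a plan exit code into a short human-readable summary."""
    return PLAN_EXIT_MEANING.get(code, f"unexpected exit code {code}")


def review_plan(workdir: str) -> str:
    """Run a plan in `workdir` and summarize the result without applying it."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit(result.returncode)
```

The plan output itself is available on `result.stdout` if a reviewer wants to read the full diff; the function above returns only the summary.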
Improving Results
For better analysis:
- Use descriptive resource names
- Add comments to Terraform code
- Include context in descriptions
- Provide variable files
Cost Management
AI analysis uses your own API key (BYOK, bring your own key), so usage is billed by your AI provider.
Estimated Costs
| Drifts | Anthropic (Claude) | OpenAI (GPT-4) |
|---|---|---|
| 1 | ~$0.01-0.03 | ~$0.02-0.05 |
| 10 | ~$0.10-0.30 | ~$0.20-0.50 |
| 50 | ~$0.50-1.50 | ~$1.00-2.50 |
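The rows in the table scale linearly with the number of drifts, so a rough estimate for other counts can be sketched as below. The `estimate_cost` function and the provider keys are illustrative names; the per-drift ranges are copied from the first table row, and actual billing depends on your provider's pricing and on the size of each diff.

```python
# Per-drift cost ranges in USD, taken from the first row of the table above.
PER_DRIFT_USD = {
    "anthropic": (0.01, 0.03),
    "openai": (0.02, 0.05),
}


def estimate_cost(drifts: int, provider: str) -> tuple[float, float]:
    """Return a (low, high) USD estimate, assuming cost scales linearly."""
    if drifts < 0:
        raise ValueError("drifts must be non-negative")
    low, high = PER_DRIFT_USD[provider]
    return round(drifts * low, 2), round(drifts * high, 2)
```

For example, `estimate_cost(50, "openai")` returns `(1.0, 2.5)`, which matches the last row of the table.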
Reducing Costs
- Fix drifts so they are not analyzed again on later scans
- Use severity filters to analyze only critical drifts
- Consider disabling AI for low-priority repos
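To show what a severity filter does in principle, the sketch below keeps only drifts at or above a chosen level before they would be sent for analysis. It assumes each drift already carries a severity label at that point; `select_for_analysis`, the dictionary shape, and every level name other than CRITICAL are assumptions made for illustration, not Controlinfra's actual settings.

```python
# Severity levels in ascending order. Only CRITICAL appears in this document;
# the other names are assumed for illustration.
SEVERITY_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


def select_for_analysis(drifts: list[dict], minimum: str = "CRITICAL") -> list[dict]:
    """Keep only drifts whose severity is at or above `minimum`."""
    threshold = SEVERITY_ORDER.index(minimum)
    return [
        d for d in drifts
        if SEVERITY_ORDER.index(d["severity"]) >= threshold
    ]
```

Fewer drifts passed to the model means fewer API calls, which is where the cost saving comes from.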
Disabling AI Analysis
If you prefer scanning without AI:
- Go to Settings → AI Provider
- Remove the API key or select None
- Click Save
Scans will still detect drift, but without AI analysis.
Next Steps
- Configure AI Provider - Set up your API key
- Automated Fixes - Use AI-generated fixes
- Create Pull Requests - Auto-create fix PRs