
AI Analysis

Leverage AI-powered insights to understand and resolve infrastructure drift.

Overview

When drift is detected, Controlinfra uses AI (Claude or GPT-4) to analyze each change and provide:

  • Root Cause Analysis: Why the drift occurred
  • Impact Assessment: What systems are affected
  • Blast Radius: Scope of potential issues
  • Severity Scoring: How urgent the fix is
  • Fix Recommendations: Code to resolve the drift
  • Prevention Tips: How to avoid future drift

How It Works

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Drift     │     │     AI      │     │   Analysis  │
│  Detected   │────▶│   Model     │────▶│   Report    │
│  (changes)  │     │ (Claude/GPT)│     │  (insights) │
└─────────────┘     └─────────────┘     └─────────────┘

The AI receives:

  • The Terraform resource configuration
  • The detected changes (diff)
  • Resource type and context
  • Previous drift patterns (if any)
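
The four inputs listed above can be pictured as a single structured payload. The following is a minimal sketch only: `DriftContext`, `build_prompt`, and the section layout are hypothetical names chosen for illustration, not Controlinfra's actual internals.

```python
from dataclasses import dataclass, field


@dataclass
class DriftContext:
    """Hypothetical container for the four inputs the AI receives."""
    resource_config: str   # Terraform resource block as written in the repo
    diff: str              # the detected changes
    resource_type: str     # e.g. "aws_security_group"
    previous_patterns: list[str] = field(default_factory=list)  # optional history


def build_prompt(ctx: DriftContext) -> str:
    """Join the inputs into one text prompt (the layout is an assumption)."""
    sections = [
        f"Resource type: {ctx.resource_type}",
        "Terraform configuration:\n" + ctx.resource_config,
        "Detected changes:\n" + ctx.diff,
    ]
    # Previous drift patterns are only included when some exist.
    if ctx.previous_patterns:
        history = "\n".join(f"- {p}" for p in ctx.previous_patterns)
        sections.append("Previous drift patterns:\n" + history)
    return "\n\n".join(sections)
```

Keeping the history section optional mirrors the "(if any)" qualifier in the list above.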

Analysis Components

Root Cause Analysis

The AI examines the drift and identifies likely causes:

Root Cause Analysis
───────────────────

The AWS security group "web-sg" was modified directly in the
AWS console to add an inbound rule allowing SSH (port 22)
from 0.0.0.0/0.

Likely scenario: A developer needed emergency SSH access for
debugging and added the rule manually rather than through
Terraform. This is a common pattern when quick access is
needed but proper change management wasn't followed.

Evidence:
• The change adds a permissive SSH rule
• The rule description contains "temp" suggesting temporary access
• Similar patterns have occurred in this repository before

Impact Assessment

Understanding what systems and users are affected:

Impact Assessment
─────────────────

Direct Impact:
• 3 EC2 instances in the web-tier security group
• Production traffic (12,000 requests/hour) flows through these

Secondary Impact:
• Application servers become reachable from the internet
• SSH brute force attacks become possible
• Compliance audit may flag this as a violation

Business Impact:
• PCI-DSS compliance at risk
• Potential security incident if exploited
• May trigger security team alerts

Blast Radius

Visualizing the scope of affected resources:

Blast Radius
────────────

Security Group: web-sg
├── EC2: web-server-1 (i-abc123)
├── EC2: web-server-2 (i-def456)
├── EC2: web-server-3 (i-ghi789)
└── ALB: web-load-balancer

Downstream dependencies:
├── RDS: main-database (accessible from web tier)
└── ElastiCache: session-store

Total affected resources: 7
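
One way to arrive at a total like the one in the sample report is to walk a dependency graph outward from the drifted resource and count everything reachable, including the resource itself. This is a sketch under assumptions: the `DEPENDENTS` dictionary and the `blast_radius` function are hypothetical, and the edges are copied from the sample output rather than from a real AWS account.

```python
from collections import deque

# Edges taken from the sample report; the data layout is an assumption.
DEPENDENTS = {
    "sg:web-sg": [
        "ec2:web-server-1",
        "ec2:web-server-2",
        "ec2:web-server-3",
        "alb:web-load-balancer",
    ],
    "ec2:web-server-1": ["rds:main-database", "elasticache:session-store"],
    "ec2:web-server-2": ["rds:main-database", "elasticache:session-store"],
    "ec2:web-server-3": ["rds:main-database", "elasticache:session-store"],
}


def blast_radius(root: str, graph: dict[str, list[str]]) -> set[str]:
    """Breadth-first walk returning the root plus every reachable resource."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for dependent in graph.get(node, []):
            if dependent not in seen:  # count shared dependencies once
                seen.add(dependent)
                queue.append(dependent)
    return seen


affected = blast_radius("sg:web-sg", DEPENDENTS)
print(len(affected))
```

Shared downstream resources such as the database are counted once, because the walk tracks which nodes it has already visited.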

Severity Explanation

Why the AI assigned a particular severity:

Severity: CRITICAL 🔴
────────────────────

This drift is classified as CRITICAL because:

1. Security Exposure (Primary Factor)
   - Port 22 (SSH) is exposed to 0.0.0.0/0
   - This allows access from any IP address globally

2. Resource Type (Security Group)
   - Security groups are security-critical resources
   - Changes have immediate network-level impact

3. Production Environment
   - Affects production instances
   - Real user traffic is at risk

4. Compliance Impact
   - Likely violates security policies
   - May trigger compliance audit findings

Recommendation: Address within 24 hours
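
The four factors in the sample explanation can be read as inputs to a simple scoring rule. The sketch below is purely illustrative: the weights, thresholds, the `score_severity` function, and the level names other than CRITICAL are assumptions, and the actual severity is assigned by the AI model rather than by a fixed formula.

```python
def score_severity(*, internet_exposed: bool, security_resource: bool,
                   production: bool, compliance_scope: bool) -> str:
    """Map the four factors to a level using hypothetical weights."""
    # Open exposure is weighted highest, since the sample report
    # names it as the primary factor.
    points = (3 * internet_exposed + security_resource
              + production + compliance_scope)
    if points >= 5:
        return "CRITICAL"
    if points >= 3:
        return "HIGH"
    if points >= 1:
        return "MEDIUM"
    return "LOW"


# The web-sg drift from the sample report triggers all four factors.
level = score_severity(internet_exposed=True, security_resource=True,
                       production=True, compliance_scope=True)
print(level)
```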

Fix Recommendations

Detailed guidance on how to resolve the drift:

Fix Recommendations
───────────────────

Option 1: Revert the Change (Recommended)
─────────────────────────────────────────
If SSH access from anywhere is not needed, remove the rule:

$ terraform plan    # confirm that only the SSH rule will be removed
$ terraform apply

This will:
• Remove the 0.0.0.0/0 SSH rule
• Restore the security group to its defined state
• Take effect immediately


Option 2: Update Terraform (If Intentional)
────────────────────────────────────────────
If SSH access is legitimately needed, update your Terraform:

resource "aws_security_group" "web" {
  name = "web-sg"

  # Existing rules...

  # Add SSH access (restrict to known IPs)
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]  # Internal only
    description = "SSH from internal network"
  }
}


Option 3: Use Bastion Host
──────────────────────────
For secure SSH access, consider a bastion host pattern:

module "bastion" {
  source = "terraform-aws-modules/ec2-instance/aws"
  # Bastion configuration...
}

Prevention Recommendations

How to avoid similar drift in the future:

Prevention Recommendations
──────────────────────────

1. Implement AWS Config Rules
   Create a rule to detect overly permissive security groups:

   resource "aws_config_config_rule" "restricted_ssh" {
     name = "restricted-ssh"
     source {
       owner             = "AWS"
       source_identifier = "INCOMING_SSH_DISABLED"
     }
   }

2. Use Service Control Policies (SCPs)
   Prevent manual security group modifications:

   {
     "Effect": "Deny",
     "Action": [
       "ec2:AuthorizeSecurityGroupIngress"
     ],
     "Resource": "*",
     "Condition": {
       "ForAnyValue:StringEquals": {
         "ec2:IpRange": "0.0.0.0/0"
       }
     }
   }

3. Enable CloudTrail Alerts
   Alert on security group changes:

   EventPattern: {
     "source": ["aws.ec2"],
     "detail-type": ["AWS API Call via CloudTrail"],
     "detail": {
       "eventName": ["AuthorizeSecurityGroupIngress", "RevokeSecurityGroupIngress"]
     }
   }

4. Regular Controlinfra Scans
   Schedule daily scans to catch drift early

Viewing AI Analysis

In Scan Results

Click on any drift to expand the AI analysis:

┌──────────────────────────────────────────────────────────┐
│ aws_security_group.web                          🔴 CRITICAL │
├──────────────────────────────────────────────────────────┤
│ Change: ingress rule modified                            │
│                                                          │
│ [Root Cause] [Impact] [Fix] [Prevent]                   │
│                                                          │
│ Root Cause Analysis                                      │
│ ─────────────────────                                    │
│ The security group was modified directly in AWS...       │
│                                                          │
│ [Show Full Analysis] [Copy Fix Code] [Create PR]        │
└──────────────────────────────────────────────────────────┘

Analysis Tabs

Switch between analysis sections:

Tab          Content
Root Cause   Why the drift happened
Impact       What's affected
Fix          Code to resolve
Prevent      Future prevention

AI Accuracy

Strengths

The AI excels at:

  • Understanding Terraform resource relationships
  • Identifying security implications
  • Generating correct fix code
  • Suggesting best practices

Limitations

Keep in mind:

  • The AI doesn't have real-time AWS access
  • Its context is limited to the diff and the Terraform code
  • It may not know your business context
  • Always review recommendations before applying them

Improving Results

For better analysis:

  • Use descriptive resource names
  • Add comments to Terraform code
  • Include context in descriptions
  • Provide variable files

Cost Management

AI analysis uses your own API key (bring your own key, BYOK), so analysis costs are billed by your AI provider.

Estimated Costs

Drifts   Anthropic (Claude)   OpenAI (GPT-4)
1        ~$0.01-0.03          ~$0.02-0.05
10       ~$0.10-0.30          ~$0.20-0.50
50       ~$0.50-1.50          ~$1.00-2.50
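
The figures above scale linearly with the number of drifts, so other volumes can be estimated by multiplying the per-drift range. A minimal sketch, assuming the per-drift estimates from the first row; `PER_DRIFT_USD` and `estimate_cost` are hypothetical names, and actual charges depend on your provider's pricing and on the size of each diff.

```python
# Per-drift ranges in USD, taken from the first row of the estimate table.
PER_DRIFT_USD = {
    "anthropic": (0.01, 0.03),
    "openai": (0.02, 0.05),
}


def estimate_cost(drifts: int, provider: str) -> tuple[float, float]:
    """Return the (low, high) estimated cost in USD for a number of drifts."""
    low, high = PER_DRIFT_USD[provider]
    return round(drifts * low, 2), round(drifts * high, 2)


print(estimate_cost(50, "anthropic"))
print(estimate_cost(50, "openai"))
```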

Reducing Costs

  • Fix drifts to reduce recurring analysis
  • Use severity filters to analyze critical only
  • Consider disabling AI for low-priority repos
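
The effect of a severity filter can be sketched as follows. This is an illustration only: the `select_for_analysis` function, the dictionary shape of a drift, and the level names other than CRITICAL are assumptions, and the real filter is configured in Controlinfra rather than written by hand.

```python
# Assumed ordering of severity levels, lowest to highest.
SEVERITY_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def select_for_analysis(drifts: list[dict], minimum: str = "CRITICAL") -> list[dict]:
    """Keep only the drifts at or above the minimum severity."""
    threshold = SEVERITY_ORDER[minimum]
    return [d for d in drifts if SEVERITY_ORDER[d["severity"]] >= threshold]


drifts = [
    {"resource": "aws_security_group.web", "severity": "CRITICAL"},
    {"resource": "aws_s3_bucket.logs", "severity": "LOW"},  # hypothetical resource
]
selected = select_for_analysis(drifts)
print([d["resource"] for d in selected])
```

Only the selected drifts would be sent to the AI provider, so each drift filtered out is one fewer billed analysis.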

Disabling AI Analysis

If you prefer scanning without AI:

  1. Go to Settings > AI Provider
  2. Remove the API key or select None
  3. Click Save

Scans will still detect drift, but the results will not include AI analysis.
