AWS Cloud Discovery

Discover all AWS resources across your accounts and regions, identify unmanaged infrastructure, and bring it under Terraform management.

Prerequisites

  1. AWS credentials configured in Settings > Cloud
  2. Runner — either the Controlinfra cloud runner (default) or a self-hosted runner connected to your organization
  3. ReadOnlyAccess (or finer-grained equivalents) on each account you want to scan

Cloud Runner vs Self-Hosted

Cloud runners work out of the box and require no infrastructure on your side. Use a self-hosted runner when your state-file bucket has a VPC endpoint policy that restricts access to specific networks, or when compliance prohibits cloud-runner credential handoff.

Required Permissions

The IAM identity used for discovery needs:

| Permission | Scope | Purpose |
|---|---|---|
| ReadOnlyAccess | Account | Enumerate resources across services |
| s3:GetObject + s3:ListBucket | State buckets | Scan Terraform state files |
| organizations:DescribeAccount | Org-level (optional) | Resolve friendly account names |

The AWS managed policy arn:aws:iam::aws:policy/ReadOnlyAccess is the simplest starting point, but it is broader than least privilege. For production, narrow it to per-service Describe* / List* actions (for example ec2:Describe* or rds:Describe*) for the services you scan.

```bash
# Attach ReadOnlyAccess to a role
aws iam attach-role-policy \
  --role-name controlinfra-discovery \
  --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
```
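
If you want to narrow the policy rather than attach ReadOnlyAccess, the following is a minimal sketch of how a scoped policy document could be assembled. The function name `build_discovery_policy`, the default service list, and the bucket name in the usage note are illustrative assumptions, not part of Controlinfra. Many services also need `Get*` actions for full detail, so treat the action list as a starting point to adjust.

```python
import json


def build_discovery_policy(state_buckets, services=("ec2", "rds", "lambda")):
    """Return an IAM policy document (dict) for read-only discovery.

    Hypothetical helper: the service list and Sid names are assumptions.
    """
    statements = [
        {
            "Sid": "DescribeAndList",
            "Effect": "Allow",
            # Per-service wildcards such as ec2:Describe* are valid IAM patterns.
            "Action": [
                f"{svc}:{verb}*" for svc in services for verb in ("Describe", "List")
            ],
            "Resource": "*",
        }
    ]
    if state_buckets:
        # s3:ListBucket applies to the bucket ARN itself.
        statements.append(
            {
                "Sid": "ListStateBuckets",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": [f"arn:aws:s3:::{b}" for b in state_buckets],
            }
        )
        # s3:GetObject applies to object ARNs inside the bucket.
        statements.append(
            {
                "Sid": "ReadStateObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": [f"arn:aws:s3:::{b}/*" for b in state_buckets],
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


def to_json(policy):
    """Serialize the policy so it can be saved and attached to the role."""
    return json.dumps(policy, indent=2)
```

Calling `to_json(build_discovery_policy(["my-tf-state"]))` yields a JSON document that could be saved to a file and attached to the discovery role as an inline or customer-managed policy.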

Multi-Account Access

For multi-account discovery, create a CloudAccount entry per AWS account in Settings > Cloud. Each account can use a different auth method (access keys, AssumeRole, OIDC, or instance profile) — Controlinfra fans out the scan across them in parallel.

Running a Discovery Scan

  1. Go to Cloud Discovery in the sidebar
  2. Click Run Discovery
  3. Select AWS as the provider
  4. Configure:
    • Account — pick which CloudAccount to scan (or "All accounts" for parallel fan-out)
    • Services — choose AWS services to scan (EC2, IAM, S3, RDS, Lambda, etc.)
    • Regions — select regions to include; defaults to the account's primary region
    • State Files (optional) — list S3 buckets where .tfstate files are stored
  5. Click Start Scan

How It Works

Phase 1: Resource Discovery

Controlinfra calls per-service AWS SDK clients (EC2.DescribeInstances, S3.ListBuckets, IAM.ListUsers, etc.) in parallel. Each discoverer uses pagination cursors instead of in-memory accumulation so scans of accounts with millions of resources stay within the runner's memory budget.
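
The cursor-based approach described above can be illustrated with a generator that yields one page at a time instead of building a full list. This is a sketch only: `iter_resources` and `fake_fetch_page` are hypothetical names, and the stub stands in for a real SDK call that returns a page of items plus a continuation token.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is a list of resources plus the cursor for the next page (None = done).
Page = Tuple[List[dict], Optional[str]]


def iter_resources(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield resources page by page so only one page is held in memory."""
    token: Optional[str] = None
    while True:
        items, token = fetch_page(token)
        yield from items
        if token is None:
            return


def fake_fetch_page(token: Optional[str]) -> Page:
    """Stub standing in for a paginated API call (assumption for illustration)."""
    pages = {
        None: ([{"id": "i-1"}, {"id": "i-2"}], "cursor-1"),
        "cursor-1": ([{"id": "i-3"}], None),
    }
    return pages[token]
```

A consumer can process each resource as it arrives, for example `for resource in iter_resources(fake_fetch_page): ...`, so peak memory depends on page size rather than on the total number of resources.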

Phase 2: Enrichment

For key resource types, Controlinfra makes additional API calls to capture detailed configuration:

| Service | Details Captured |
|---|---|
| EC2 | Instance type, AMI, VPC, security groups, EBS volumes, key pair, IAM profile |
| EKS | Kubernetes version, node groups, Fargate profiles, OIDC provider |
| RDS | Engine + version, instance class, storage, backup retention, encryption |
| Lambda | Runtime, memory, timeout, env vars (redacted), VPC config |
| S3 | Versioning, encryption, public access block, lifecycle rules |
| IAM | Policies attached, trust policy, last-used timestamps |
| VPC | CIDRs, subnets, route tables, NAT/Internet gateways, flow logs |

Phase 3: State Matching

If you provided one or more S3 state buckets, Controlinfra downloads and parses .tfstate objects. Each discovered resource is matched against the state index using its ARN (or a service-specific identifier when no ARN exists). Resources found in the index are reported as managed; resources not found are reported as orphans.
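
A minimal sketch of this matching step, assuming the version 4 Terraform state layout (`resources[].instances[].attributes`) and assuming that discovered resources are plain dicts with optional `arn` and `id` keys. The function names are hypothetical and the real matcher may use additional identifiers.

```python
import json


def build_state_index(tfstate_text):
    """Collect every ARN and id found in managed resources of a state file."""
    state = json.loads(tfstate_text)
    index = set()
    for resource in state.get("resources", []):
        if resource.get("mode") != "managed":
            continue  # data sources are read-only lookups, not managed infra
        for instance in resource.get("instances", []):
            attrs = instance.get("attributes") or {}
            for key in ("arn", "id"):
                value = attrs.get(key)
                if isinstance(value, str) and value:
                    index.add(value)
    return index


def classify(discovered, index):
    """Split discovered resources into (managed, orphans)."""
    managed, orphans = [], []
    for res in discovered:
        # Prefer the ARN; fall back to the service-specific id when absent.
        identifier = res.get("arn") or res.get("id")
        (managed if identifier in index else orphans).append(res)
    return managed, orphans
```

With several state files, the per-file indexes can be merged with a set union before calling `classify`.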

Phase 4: Cost Estimation

Orphaned resources are priced using static per-resource estimators (EC2 instance types, RDS classes, EBS volume types, NAT gateways, etc.) with region multipliers from awsPricing.js. Resources without a precise estimator fall back to a flat per-month rate so orphan-cost ranking stays directionally useful.
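
The pricing logic can be pictured roughly as follows. Every number, key, and name here is a made-up placeholder, not a value from awsPricing.js, and whether the region multiplier also applies to the flat fallback is an assumption (this sketch does not apply it).

```python
# Hypothetical base prices in USD per month, keyed by (resource type, size).
MONTHLY_USD = {
    ("ec2_instance", "t3.micro"): 8.0,
    ("nat_gateway", None): 32.0,
}

# Hypothetical region multipliers; unknown regions default to 1.0.
REGION_MULTIPLIER = {"us-east-1": 1.0, "eu-west-1": 1.25}

# Flat rate for resource types that have no precise estimator.
FLAT_FALLBACK_USD = 5.0


def estimate_monthly_cost(resource):
    """Estimate the monthly cost of one orphaned resource."""
    key = (resource["type"], resource.get("size"))
    if key not in MONTHLY_USD:
        return FLAT_FALLBACK_USD
    multiplier = REGION_MULTIPLIER.get(resource.get("region"), 1.0)
    return MONTHLY_USD[key] * multiplier


def rank_orphans(orphans):
    """Sort orphans from most to least expensive."""
    return sorted(orphans, key=estimate_monthly_cost, reverse=True)
```

Because the fallback is a fixed rate, resources without an estimator still appear in the ranking, although their position is only approximate.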

Phase 5: AI Analysis

Claude AI analyzes orphaned resources to identify:

  • Cleanup priorities and cost savings
  • Security risks (public S3 buckets, overly permissive security groups, unencrypted volumes)
  • Patterns (e.g., abandoned VPCs, dangling EBS snapshots, unused EIPs)

Supported Resource Types

Controlinfra maps 40+ AWS resource types to their Terraform equivalents:

| Category | Resource Types |
|---|---|
| Compute | EC2 instances, AMIs, launch templates, Auto Scaling groups, Lambda |
| Containers | ECS clusters/services/tasks, EKS clusters/node groups/Fargate, ECR |
| Networking | VPCs, subnets, route tables, IGWs, NATs, EIPs, security groups, NACLs, load balancers (ALB/NLB/CLB), CloudFront, API Gateway, Route 53 zones |
| Storage | S3 buckets, EBS volumes/snapshots, EFS filesystems |
| Database | RDS instances/clusters/snapshots, DynamoDB tables, ElastiCache, Redshift |
| Security | IAM users/roles/policies/groups, KMS keys, Secrets Manager, ACM, WAF, GuardDuty, Security Hub |
| Observability | CloudWatch alarms/log groups, CloudTrail trails, X-Ray |
| Integration | SNS topics, SQS queues, EventBridge rules, Step Functions, Kinesis streams |
| Backup / DR | AWS Backup vaults, EBS snapshots, RDS snapshots |

Terraform Code Generation

For orphaned resources, Controlinfra generates:

  1. HCL resource blocks with actual configuration from the discovered resource
  2. terraform import commands keyed on the ARN or service-native ID
  3. Provider configuration with the aws provider and an S3 backend pointing at your validation bucket
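
To give a feel for the second item, here is a small sketch that builds `terraform import ADDRESS ID` commands. The type mapping, the input dict shape (`type`, `name`, `import_id`), and the function names are assumptions for illustration; the output Controlinfra actually generates may differ.

```python
import re
import shlex

# Hypothetical mapping from discovered types to Terraform resource types.
TERRAFORM_TYPES = {
    "ec2_instance": "aws_instance",
    "s3_bucket": "aws_s3_bucket",
}


def resource_label(name):
    """Turn an arbitrary name into a valid Terraform resource label."""
    label = re.sub(r"[^0-9A-Za-z_]", "_", name)
    # Labels must not be empty or start with a digit.
    if not label or label[0].isdigit():
        label = "r_" + label
    return label


def import_command(resource):
    """Build a shell-safe `terraform import` command for one resource."""
    tf_type = TERRAFORM_TYPES[resource["type"]]
    address = f"{tf_type}.{resource_label(resource['name'])}"
    return "terraform import {} {}".format(
        shlex.quote(address), shlex.quote(resource["import_id"])
    )
```

For example, an EC2 instance named `web-1` with ID `i-0abc123` produces `terraform import aws_instance.web_1 i-0abc123`. The matching HCL resource block must exist before the import is run.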

Troubleshooting

"AWS credentials not configured"

Add an AWS account in Settings > Cloud > Add Account. For long-lived credentials use the credentials auth method; for production prefer assume_role (cross-account IAM trust) or oidc (zero-credential federation).

"AccessDenied" on a specific service

The IAM identity is missing read permission for that service. Either attach the broader ReadOnlyAccess managed policy or add the specific Describe* / List* actions for the service. The scan does not abort: the remaining services complete, and the denied service is listed in the partial-coverage banner.
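
The continue-on-denial behavior can be sketched like this. `AccessDenied`, `run_scan`, and the discoverer callables are hypothetical stand-ins; a real implementation would inspect the error code returned by the AWS SDK instead of a custom exception.

```python
class AccessDenied(Exception):
    """Stand-in for an SDK error whose code indicates missing permissions."""


def run_scan(discoverers):
    """Run each per-service discoverer, recording denied services.

    `discoverers` maps a service name to a zero-argument callable that
    returns a list of resources or raises AccessDenied.
    """
    resources, denied = [], []
    for service, discover in discoverers.items():
        try:
            resources.extend(discover())
        except AccessDenied:
            # Keep going: one denied service should not fail the whole run.
            denied.append(service)
    return resources, denied


def deny():
    """Example discoverer that simulates a missing permission."""
    raise AccessDenied("not authorized")
```

The `denied` list is what a partial-coverage banner would be built from, while `resources` still contains everything the permitted services returned.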

Account not showing in the picker

The Account dropdown lists active CloudAccount entries (status active, not paused, not deleted). If yours is missing, check Settings > Cloud — the account may need re-validation or may have been auto-paused by a plan limit.

Empty state-file results

Confirm the bucket name is correct (no s3:// prefix), that the discovery identity has s3:GetObject + s3:ListBucket on it, and that .tfstate objects actually exist at the path. Use Validate & Import on an orphan to test end-to-end.

Cross-region resources missing

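The Regions field controls which regions the scan enumerates, and it defaults to the account's primary region. Resources in any other region stay missing until you select that region and rerun the scan. Global services (IAM, CloudFront, Route 53) are always included regardless of the region selection, so a region filter never drops them.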
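
One way to picture how the region selection interacts with global services is the sketch below. The service keys, the `"global"` placeholder, and `plan_scan_targets` are assumptions made for illustration.

```python
# Services that are not tied to a region (from the list above).
GLOBAL_SERVICES = frozenset({"iam", "cloudfront", "route53"})


def plan_scan_targets(services, regions):
    """Expand the selection into (service, region) scan targets.

    Regional services are scanned once per selected region; global
    services are scanned exactly once, whatever regions are selected.
    """
    targets = []
    for service in services:
        if service in GLOBAL_SERVICES:
            targets.append((service, "global"))
        else:
            targets.extend((service, region) for region in regions)
    return targets
```

With `services=["ec2", "iam"]` and a single selected region, EC2 is scanned only in that region, while IAM is still scanned once.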

Missing resource types

Not all AWS resource types map to a Terraform resource (yet). Resources without a mapping are still discovered and shown but are marked "unknown" — they don't appear in the IaC coverage percentage so they don't skew the metric.
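
As an illustration of why "unknown" resources do not skew the metric, the sketch below assumes coverage is computed as managed resources divided by managed plus orphaned resources. That formula, the `status` field, and the function name are assumptions, not the documented calculation.

```python
def iac_coverage(resources):
    """Return IaC coverage as a percentage, or None if nothing is mappable.

    Each resource is a dict with a `status` of "managed", "orphan",
    or "unknown". Unknown resources are excluded from the denominator.
    """
    mapped = [r for r in resources if r["status"] != "unknown"]
    if not mapped:
        return None  # avoid dividing by zero when nothing has a mapping
    managed = sum(1 for r in mapped if r["status"] == "managed")
    return 100.0 * managed / len(mapped)
```

For three managed, one orphaned, and two unknown resources this gives 75 percent; counting the unknown ones in the denominator would have reported 50 percent.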