AWS Cloud Discovery
Discover all AWS resources across your accounts and regions, identify unmanaged infrastructure, and bring it under Terraform management.
Prerequisites
- AWS credentials configured in Settings > Cloud
- Runner — either the Controlinfra cloud runner (default) or a self-hosted runner connected to your organization
- ReadOnlyAccess (or finer-grained equivalents) on each account you want to scan
Cloud Runner vs Self-Hosted
Cloud runners work out of the box and require no infrastructure on your side. Use a self-hosted runner when your state-file bucket has a VPC endpoint policy that restricts access to specific networks, or when compliance prohibits cloud-runner credential handoff.
Required Permissions
The IAM identity used for discovery needs:
| Permission | Scope | Purpose |
|---|---|---|
| ReadOnlyAccess | Account | Enumerate resources across services |
| s3:GetObject + s3:ListBucket | State buckets | Scan Terraform state files |
| organizations:DescribeAccount | Org-level (optional) | Resolve friendly account names |
The AWS managed policy arn:aws:iam::aws:policy/ReadOnlyAccess is the simplest starting point. For least-privilege access in production, refine it to per-service *:Describe* / *:List* actions.
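As a rough illustration of the per-service refinement, the sketch below builds an IAM policy document from a list of service prefixes. The function name `build_read_policy`, the `Sid`, and the chosen services are assumptions made for this example and are not part of Controlinfra.

```python
import json

def build_read_policy(services, verbs=("Describe*", "List*")):
    """Return an IAM policy document granting read-style actions per service."""
    # One "<service>:<verb>" pattern per service/verb pair, in a stable order.
    actions = [f"{svc}:{verb}" for svc in sorted(services) for verb in verbs]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DiscoveryRead",  # hypothetical statement id
                "Effect": "Allow",
                "Action": actions,
                "Resource": "*",
            }
        ],
    }

policy = build_read_policy({"ec2", "rds", "lambda"})
print(json.dumps(policy, indent=2))
```

Some services (S3, IAM, and Lambda among them) expose part of their read API under `Get*`, so the `verbs` tuple will likely need extending for those before the policy covers enrichment calls.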
# Attach ReadOnlyAccess to a role
aws iam attach-role-policy \
  --role-name controlinfra-discovery \
  --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess
Multi-Account Access
For multi-account discovery, create a CloudAccount entry per AWS account in Settings > Cloud. Each account can use a different auth method (access keys, AssumeRole, OIDC, or instance profile) — Controlinfra fans out the scan across them in parallel.
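The parallel fan-out can be pictured with a thread pool. This is only a sketch under assumptions: `scan_account` is a placeholder that returns an empty result, the account dictionaries are a made-up shape, and Controlinfra's actual scheduler is not documented here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def scan_account(account):
    """Placeholder scan: a real one would call AWS using the account's auth method."""
    return {"account_id": account["id"], "auth": account["auth"], "resources": []}

def fan_out(accounts, max_workers=4):
    """Scan every account concurrently; record failures instead of aborting the run."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scan_account, acct): acct["id"] for acct in accounts}
        for future in as_completed(futures):
            account_id = futures[future]
            try:
                results[account_id] = future.result()
            except Exception as exc:  # one failing account should not sink the others
                errors[account_id] = str(exc)
    return results, errors

# Hypothetical account entries; the ids are dummy values.
accounts = [
    {"id": "111111111111", "auth": "assume_role"},
    {"id": "222222222222", "auth": "oidc"},
]
results, errors = fan_out(accounts)
```

Keeping per-account errors in a separate mapping means a single misconfigured account shows up as a failure for that account only.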
Running a Discovery Scan
- Go to Cloud Discovery in the sidebar
- Click Run Discovery
- Select AWS as the provider
- Configure:
  - Account: pick which CloudAccount to scan (or "All accounts" for parallel fan-out)
  - Services: choose AWS services to scan (EC2, IAM, S3, RDS, Lambda, etc.)
  - Regions: select regions to include; defaults to the account's primary region
  - State Files (optional): list S3 buckets where .tfstate files are stored
- Click Start Scan
How It Works
Phase 1: Resource Discovery
Controlinfra calls per-service AWS SDK clients (EC2.DescribeInstances, S3.ListBuckets, IAM.ListUsers, etc.) in parallel. Each discoverer uses pagination cursors instead of in-memory accumulation so scans of accounts with millions of resources stay within the runner's memory budget.
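The cursor-based approach can be sketched as a generator that yields one page at a time rather than building a full list. The `fetch_page` callback, the `Items` / `NextToken` field names, and the fake three-page API are assumptions for illustration; real AWS operations use operation-specific field names.

```python
def iter_resources(fetch_page):
    """Yield items page by page, following the cursor until it runs out."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page["Items"]
        token = page.get("NextToken")
        if not token:
            break

# Fake three-page API, used only to exercise the generator.
_PAGES = {
    None: {"Items": ["i-001", "i-002"], "NextToken": "p2"},
    "p2": {"Items": ["i-003"], "NextToken": "p3"},
    "p3": {"Items": ["i-004"]},
}

def fake_fetch(token):
    return _PAGES[token]

# Counting consumes the stream without ever holding all items at once.
count = sum(1 for _ in iter_resources(fake_fetch))
```

Because only the current page is held in memory, peak usage depends on page size rather than on the total number of resources. With boto3, `client.get_paginator(...)` provides comparable lazy iteration.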
Phase 2: Enrichment
For key resource types, Controlinfra makes additional API calls to capture detailed configuration:
| Service | Details Captured |
|---|---|
| EC2 | Instance type, AMI, VPC, security groups, EBS volumes, key pair, IAM profile |
| EKS | Kubernetes version, node groups, Fargate profiles, OIDC provider |
| RDS | Engine + version, instance class, storage, backup retention, encryption |
| Lambda | Runtime, memory, timeout, env vars (redacted), VPC config |
| S3 | Versioning, encryption, public access block, lifecycle rules |
| IAM | Policies attached, trust policy, last-used timestamps |
| VPC | CIDRs, subnets, route tables, NAT/Internet gateways, flow logs |
Phase 3: State Matching
If you provided one or more S3 state buckets, Controlinfra downloads and parses the .tfstate objects in them. Each discovered resource is matched against the state index by its ARN (or by a service-specific identifier when no ARN exists). A resource with a match is considered managed; a resource without one is flagged as an orphan.
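A minimal sketch of the matching step, assuming version 4 Terraform state files (a top-level `resources` list whose entries carry `instances[].attributes`). The function names and the shape of the discovered records are hypothetical, and the real matcher may use more identifiers than `arn` and `id`.

```python
import json

def build_state_index(state_text):
    """Collect every ARN and id recorded for managed resources in one state file."""
    index = set()
    for resource in json.loads(state_text).get("resources", []):
        if resource.get("mode") != "managed":
            continue  # data sources are not managed infrastructure
        for instance in resource.get("instances", []):
            attrs = instance.get("attributes") or {}
            for key in ("arn", "id"):
                if attrs.get(key):
                    index.add(attrs[key])
    return index

def classify(discovered, index):
    """Split discovered resources into managed and orphaned lists."""
    managed, orphans = [], []
    for res in discovered:
        key = res.get("arn") or res.get("id")  # fall back to the native id
        (managed if key in index else orphans).append(res)
    return managed, orphans

# Tiny in-memory state file with one managed bucket.
STATE = json.dumps({
    "version": 4,
    "resources": [{
        "mode": "managed", "type": "aws_s3_bucket", "name": "logs",
        "instances": [{"attributes": {"id": "logs-bucket",
                                      "arn": "arn:aws:s3:::logs-bucket"}}],
    }],
})
discovered = [
    {"arn": "arn:aws:s3:::logs-bucket"},
    {"arn": "arn:aws:s3:::forgotten-bucket"},
]
managed, orphans = classify(discovered, build_state_index(STATE))
```

Indexing the state once into a set keeps each lookup constant-time, which matters when many discovered resources are checked against several state files.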
Phase 4: Cost Estimation
Orphaned resources are priced using static per-resource estimators (EC2 instance types, RDS classes, EBS volume types, NAT gateways, etc.) with region multipliers from awsPricing.js. Resources without a precise estimator fall back to a flat per-month rate so orphan-cost ranking stays directionally useful.
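The estimator-plus-fallback idea can be sketched as a table lookup. Every number below is a made-up placeholder rather than an AWS list price, and the table layout is an assumption; the contents of awsPricing.js are not shown in this document.

```python
# All prices and multipliers are illustrative placeholders, not real AWS pricing.
MONTHLY_USD = {
    ("ec2_instance", "t3.micro"): 8.0,
    ("nat_gateway", None): 33.0,
}
REGION_MULTIPLIER = {"us-east-1": 1.0, "eu-west-1": 1.1}
FLAT_FALLBACK_USD = 5.0

def estimate_monthly_cost(resource):
    """Return a monthly estimate, using the flat rate when no estimator matches."""
    key = (resource["type"], resource.get("size"))
    base = MONTHLY_USD.get(key)
    if base is None:
        return FLAT_FALLBACK_USD  # no precise estimator for this resource type
    return round(base * REGION_MULTIPLIER.get(resource["region"], 1.0), 2)

orphan_samples = [
    {"type": "ec2_instance", "size": "t3.micro", "region": "eu-west-1"},
    {"type": "nat_gateway", "region": "us-east-1"},
    {"type": "xray_group", "region": "us-east-1"},  # falls back to the flat rate
]
# Most expensive first, so cleanup candidates surface at the top.
ranked = sorted(orphan_samples, key=estimate_monthly_cost, reverse=True)
```

The fallback keeps every orphan in the ranking even when its price is only a rough guess.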
Phase 5: AI Analysis
Claude AI analyzes orphaned resources to identify:
- Cleanup priorities and cost savings
- Security risks (public S3 buckets, overly permissive security groups, unencrypted volumes)
- Patterns (e.g., abandoned VPCs, dangling EBS snapshots, unused EIPs)
Supported Resource Types
Controlinfra maps 40+ AWS resource types to their Terraform equivalents:
| Category | Resource Types |
|---|---|
| Compute | EC2 instances, AMIs, launch templates, Auto Scaling groups, Lambda |
| Containers | ECS clusters/services/tasks, EKS clusters/node groups/Fargate, ECR |
| Networking | VPCs, subnets, route tables, IGWs, NATs, EIPs, security groups, NACLs, load balancers (ALB/NLB/CLB), CloudFront, API Gateway, Route 53 zones |
| Storage | S3 buckets, EBS volumes/snapshots, EFS filesystems |
| Database | RDS instances/clusters/snapshots, DynamoDB tables, ElastiCache, Redshift |
| Security | IAM users/roles/policies/groups, KMS keys, Secrets Manager, ACM, WAF, GuardDuty, Security Hub |
| Observability | CloudWatch alarms/log groups, CloudTrail trails, X-Ray |
| Integration | SNS topics, SQS queues, EventBridge rules, Step Functions, Kinesis streams |
| Backup / DR | AWS Backup vaults, EBS snapshots, RDS snapshots |
Terraform Code Generation
For orphaned resources, Controlinfra generates:
- HCL resource blocks with actual configuration from the discovered resource
- terraform import commands keyed on the ARN or service-native ID
- Provider configuration with the aws provider and an S3 backend pointing at your validation bucket
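To show what a generated import command might look like, the sketch below formats `terraform import ADDRESS ID` lines. The type mapping, the record fields (`type`, `name`, `import_id`), and the function name are hypothetical; Controlinfra's generator is not documented here.

```python
import shlex

# Hypothetical mapping from a discovered type to a Terraform resource type.
TERRAFORM_TYPE = {"ec2_instance": "aws_instance", "s3_bucket": "aws_s3_bucket"}

def import_command(resource):
    """Build a `terraform import ADDRESS ID` line for one orphaned resource."""
    tf_type = TERRAFORM_TYPE[resource["type"]]
    address = f"{tf_type}.{resource['name']}"
    # Quote both arguments so ids containing shell metacharacters stay intact.
    return f"terraform import {shlex.quote(address)} {shlex.quote(resource['import_id'])}"

cmd = import_command(
    {"type": "ec2_instance", "name": "web_1", "import_id": "i-0abc1234"}
)
print(cmd)
```

The expected import ID varies by resource type (an instance ID for `aws_instance`, a bucket name for `aws_s3_bucket`), so the generator has to pick the ARN or the native ID per type.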
Troubleshooting
"AWS credentials not configured"
Add an AWS account in Settings > Cloud > Add Account. For long-lived credentials use the credentials auth method; for production prefer assume_role (cross-account IAM trust) or oidc (zero-credential federation).
"AccessDenied" on a specific service
The IAM identity is missing read permission for that service. Either attach the broader ReadOnlyAccess managed policy or add the specific *:Describe* / *:List* actions for the service. The scan does not abort: the remaining services complete, and the denied service is listed in the partial-coverage banner.
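The continue-on-denial behavior can be sketched as a loop that catches a permission error per service. `AccessDenied` here is a stand-in exception class and `scan_services` is a hypothetical helper; a real AWS client raises its own error type.

```python
class AccessDenied(Exception):
    """Stand-in for the permission error a real AWS client would raise."""

def scan_services(discoverers):
    """Run each service discoverer; record denied services and keep going."""
    found, denied = {}, []
    for service, discover in discoverers.items():
        try:
            found[service] = discover()
        except AccessDenied:
            denied.append(service)  # would feed a partial-coverage report
    return found, denied

def list_kms_keys():
    # Simulates an identity that lacks read access to KMS.
    raise AccessDenied("kms:ListKeys")

found, denied = scan_services({
    "s3": lambda: ["logs-bucket"],
    "kms": list_kms_keys,
    "sqs": lambda: [],
})
```

Here `denied` holds the services to report as partially covered, while `found` keeps the results from every service that succeeded.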
Account not showing in the picker
The Account dropdown lists active CloudAccount entries (status active, not paused, not deleted). If yours is missing, check Settings > Cloud — the account may need re-validation or may have been auto-paused by a plan limit.
Empty state-file results
Confirm the bucket name is correct (no s3:// prefix), that the discovery identity has s3:GetObject + s3:ListBucket on it, and that .tfstate objects actually exist at the path. Use Validate & Import on an orphan to test end-to-end.
Cross-region resources missing
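The Regions field controls which regions the scan enumerates, so a regional resource is missing whenever its region was not selected; add the region and rerun the scan. Global services (IAM, CloudFront, Route 53) are always included regardless of region selection, so region filters never silently drop them.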
Missing resource types
Not every AWS resource type maps to a Terraform resource yet. Resources without a mapping are still discovered and shown, but they are marked "unknown" and excluded from the IaC coverage percentage so that they don't skew the metric.
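One plausible reading of the coverage metric is managed resources divided by all mappable resources. The formula, the `status` field, and its values are assumptions for this sketch; the exact calculation is not specified in this document.

```python
def iac_coverage(resources):
    """Percent of mappable resources that are managed; 'unknown' ones are excluded."""
    mappable = [r for r in resources if r["status"] != "unknown"]
    if not mappable:
        return 0.0  # nothing to measure, so avoid dividing by zero
    managed = sum(1 for r in mappable if r["status"] == "managed")
    return round(100.0 * managed / len(mappable), 1)

sample = [
    {"status": "managed"}, {"status": "managed"}, {"status": "managed"},
    {"status": "orphan"},
    {"status": "unknown"},  # discovered, but left out of the denominator
]
coverage = iac_coverage(sample)
```

Under this reading the sample yields three managed out of four mappable resources; counting the unknown one in the denominator would lower the figure even though no Terraform mapping exists for it.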