Claude Code × AWS ECS/Fargate Complete Guide | Automate Safe Container Deployments

ECS/Fargate is a strong default when you want to run containers on AWS without managing EC2 instances. The hard part is not Docker itself. The hard part is wiring together ECR, task definitions, IAM roles, Secrets Manager, load balancing, CloudWatch Logs, health checks, and cost controls without leaving one critical gap.

This guide shows how to use Claude Code as an implementation assistant, not as an unchecked autopilot. Four terms recur throughout: a task definition is the launch spec for your container; Fargate is the serverless runtime that runs it; the execution role lets ECS pull images and secrets; and the task role is what your application uses when it calls AWS APIs.

Masa’s first painful Fargate migration did not fail because of Docker. It failed because the app needed roughly 40 seconds to boot and the ECS health check started too early. Claude Code can write the files quickly, but you must tell it the operational constraints: startup time, private networking, logs, rollback, and which IAM role owns which permission.

Target Architecture

Developer
  |
  | docker build / push
  v
Amazon ECR ----> Amazon ECS Service on AWS Fargate
                       |
                       | pulls secrets / writes logs
                       v
Secrets Manager     CloudWatch Logs
                       ^
                       |
Application Load Balancer -> /health -> Node.js container

The examples use ap-northeast-1, an existing VPC, and an existing ALB target group. If you want to create the network and load balancer with infrastructure as code, pair this with the Claude Code × AWS CloudFormation/CDK guide. For the permission model, read the Claude Code × AWS IAM guide before touching production.

Three Practical Use Cases

Use case	Why Fargate fits	Watch out for
SaaS REST API	Keep two or more tasks behind an ALB and deploy without managing servers	Design DB connections and health checks first
Admin backend	Start small without EC2 patching or AMIs	Scaling `desiredCount` to 0 saves cost, but requests fail until a task is running again
Batch plus API image	Reuse the same image for service and `run-task` jobs	Keep the task role narrowly scoped

If the workload is event-driven and short, the Claude Code × AWS Lambda guide may be a simpler fit. Choose Fargate when you need always-on HTTP, longer requests, Docker parity, or a runtime that Lambda does not fit well.

1. Minimal API with a Health Check

Start with an app that works locally. Both the ECS container health check and the ALB target group call /health, so keep it lightweight. If /health returns 500 on every temporary database slowdown, ECS may replace otherwise healthy tasks during a normal dependency blip.

{
  "scripts": {
    "start": "node src/server.js"
  },
  "dependencies": {
    "express": "^4.19.2"
  }
}

// src/server.js
const express = require("express");

const app = express();
const port = Number(process.env.PORT || 3000);

app.get("/health", (_req, res) => {
  res.status(200).json({
    ok: true,
    service: "myapp",
    time: new Date().toISOString(),
  });
});

app.get("/", (_req, res) => {
  res.json({ message: "Hello from ECS Fargate" });
});

app.listen(port, "0.0.0.0", () => {
  console.log(`myapp listening on ${port}`);
});

# Dockerfile
FROM node:22-alpine

WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

COPY src ./src
ENV NODE_ENV=production
ENV PORT=3000
EXPOSE 3000

CMD ["node", "src/server.js"]

Verify that curl -f http://localhost:3000/health succeeds locally before moving to ECS. When prompting Claude Code, specify that the ALB and ECS use the same health path, that the app listens on 0.0.0.0, and that startup time is protected by startPeriod.
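The local check can be scripted so that a slow boot does not look like a failure. The following is a minimal sketch rather than part of the original setup: wait_for_health is a hypothetical helper name, and it assumes curl is installed and the container is already running with port 3000 published.

```shell
# Poll a health URL until it returns a 2xx response or the attempts run out.
# Args: url attempts delay_seconds
# (wait_for_health is a hypothetical helper, not an AWS or Docker command.)
wait_for_health() {
  url="$1"
  attempts="$2"
  delay="$3"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors, -sS keeps output quiet but shows errors
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after ${attempts} attempts" >&2
  return 1
}
```

For example, once the container is running locally with port 3000 published, wait_for_health http://localhost:3000/health 10 3 polls ten times, three seconds apart, and exits non-zero if the endpoint never responds.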

2. Build and Push to ECR

This script creates the repository if needed, logs in, builds the image, and pushes a versioned tag. Avoid relying on latest alone because rollback and audit work become harder.

set -euo pipefail

export AWS_REGION="ap-northeast-1"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export ECR_REPOSITORY="myapp"
export IMAGE_TAG="$(git rev-parse --short HEAD 2>/dev/null || date +%Y%m%d%H%M%S)"
export IMAGE_URI="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${ECR_REPOSITORY}:${IMAGE_TAG}"

aws ecr describe-repositories \
  --repository-names "${ECR_REPOSITORY}" \
  --region "${AWS_REGION}" >/dev/null 2>&1 || \
aws ecr create-repository \
  --repository-name "${ECR_REPOSITORY}" \
  --image-scanning-configuration scanOnPush=true \
  --region "${AWS_REGION}"

aws ecr get-login-password --region "${AWS_REGION}" | \
  docker login --username AWS --password-stdin \
  "${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"

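# Build for the task definition's runtimePlatform (X86_64); this matters on Apple Silicon.
docker build --platform linux/amd64 -t "${IMAGE_URI}" .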
docker push "${IMAGE_URI}"

echo "Pushed ${IMAGE_URI}"

AWS’s official ECR workflow uses get-login-password. In CI, prefer GitHub Actions OIDC and short-lived AWS credentials instead of long-lived access keys stored in repository secrets.

3. Register the ECS Task Definition

Fargate tasks use awsvpc networking. When a task injects a Secrets Manager value, the execution role needs secretsmanager:GetSecretValue; if the secret uses a customer-managed KMS key, it also needs kms:Decrypt. The task role is for your application code, not for image pull or log setup.

set -euo pipefail

export AWS_REGION="ap-northeast-1"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export IMAGE_URI="${IMAGE_URI:?Run the ECR push script first}"
export EXECUTION_ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/ecsTaskExecutionRole"
export TASK_ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/myapp-task-role"
# Copy the full secret ARN from Secrets Manager; real ARNs end with a random 6-character suffix.
export SECRET_ARN="arn:aws:secretsmanager:${AWS_REGION}:${AWS_ACCOUNT_ID}:secret:prod/myapp/DATABASE_URL"

aws logs create-log-group --log-group-name /ecs/myapp --region "${AWS_REGION}" 2>/dev/null || true
aws logs put-retention-policy --log-group-name /ecs/myapp --retention-in-days 30 --region "${AWS_REGION}"

cat > ecs-task-definition.json <<EOF
{
  "family": "myapp-task",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "executionRoleArn": "${EXECUTION_ROLE_ARN}",
  "taskRoleArn": "${TASK_ROLE_ARN}",
  "runtimePlatform": {
    "cpuArchitecture": "X86_64",
    "operatingSystemFamily": "LINUX"
  },
  "containerDefinitions": [
    {
      "name": "app",
      "image": "${IMAGE_URI}",
      "essential": true,
      "portMappings": [
        { "containerPort": 3000, "hostPort": 3000, "protocol": "tcp" }
      ],
      "environment": [
        { "name": "NODE_ENV", "value": "production" },
        { "name": "PORT", "value": "3000" }
      ],
      "secrets": [
        { "name": "DATABASE_URL", "valueFrom": "${SECRET_ARN}" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/myapp",
          "awslogs-region": "${AWS_REGION}",
          "awslogs-stream-prefix": "ecs"
        }
      },
      "healthCheck": {
        "command": ["CMD-SHELL", "wget -qO- http://localhost:3000/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ]
}
EOF

aws ecs register-task-definition \
  --cli-input-json file://ecs-task-definition.json \
  --region "${AWS_REGION}"

startPeriod is easy to forget and expensive to debug. During this window, failed health checks do not count toward retries, so ECS does not judge the container before the application has had a reasonable chance to boot.
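To sanity-check these numbers, it can help to estimate how long ECS waits before it can mark a container that never becomes healthy as UNHEALTHY. The sketch below uses the rough approximation startPeriod + retries × interval. Both function names are hypothetical, and the exact timing depends on when individual checks fire.

```shell
# Rough estimate (seconds) before a never-healthy container can be marked UNHEALTHY.
# Args: startPeriod interval retries
unhealthy_after_seconds() {
  echo $(( $1 + $3 * $2 ))
}

# Succeeds when the measured boot time fits inside startPeriod.
# Args: boot_seconds startPeriod
boot_fits_start_period() {
  [ "$1" -lt "$2" ]
}

# Values from the task definition above: startPeriod=60, interval=30, retries=3.
unhealthy_after_seconds 60 30 3            # prints 150
boot_fits_start_period 40 60 && echo "40s boot fits inside startPeriod"
```

With the roughly 40-second boot mentioned earlier, a 60-second startPeriod leaves some headroom. If your measured boot time gets close to startPeriod, raise it rather than increasing retries.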

4. Create the Fargate Service

The following command uses existing private subnets, a task security group, and an ALB target group. The task security group should allow port 3000 only from the ALB security group.

set -euo pipefail

export AWS_REGION="ap-northeast-1"
export CLUSTER_NAME="myapp-cluster"
export SERVICE_NAME="myapp-service"
export TASK_FAMILY="myapp-task"
export SUBNET_1="subnet-xxxxxxxx"
export SUBNET_2="subnet-yyyyyyyy"
export TASK_SECURITY_GROUP="sg-xxxxxxxx"
export TARGET_GROUP_ARN="arn:aws:elasticloadbalancing:ap-northeast-1:123456789012:targetgroup/myapp/abc123"

aws ecs create-cluster \
  --cluster-name "${CLUSTER_NAME}" \
  --region "${AWS_REGION}" >/dev/null

aws ecs create-service \
  --cluster "${CLUSTER_NAME}" \
  --service-name "${SERVICE_NAME}" \
  --task-definition "${TASK_FAMILY}" \
  --desired-count 2 \
  --launch-type FARGATE \
  --platform-version LATEST \
  --health-check-grace-period-seconds 90 \
  --network-configuration "awsvpcConfiguration={subnets=[${SUBNET_1},${SUBNET_2}],securityGroups=[${TASK_SECURITY_GROUP}],assignPublicIp=DISABLED}" \
  --load-balancers "targetGroupArn=${TARGET_GROUP_ARN},containerName=app,containerPort=3000" \
  --region "${AWS_REGION}"

aws ecs wait services-stable \
  --cluster "${CLUSTER_NAME}" \
  --services "${SERVICE_NAME}" \
  --region "${AWS_REGION}"

For private subnets with assignPublicIp=DISABLED, the task still needs a path to ECR, CloudWatch Logs, and Secrets Manager. Use NAT Gateway or VPC endpoints. NAT is convenient, but it can dominate the cost of a small test environment.
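If you choose VPC endpoints over NAT, the following sketch lists the endpoint service names that a private Fargate task in this guide's setup would typically need. The helper name is hypothetical. The list assumes Fargate platform version 1.4.0 or later, where image pulls go through the ECR API, the ECR Docker registry, and S3.

```shell
# Print the VPC endpoint service names a private Fargate task typically needs.
# s3 is a gateway endpoint (ECR stores image layers in S3); the others are interface endpoints.
# (fargate_endpoint_services is a hypothetical helper name.)
fargate_endpoint_services() {
  region="$1"
  for svc in ecr.api ecr.dkr logs secretsmanager s3; do
    echo "com.amazonaws.${region}.${svc}"
  done
}

fargate_endpoint_services "ap-northeast-1"
```

Each name can then be passed to aws ec2 create-vpc-endpoint with --vpc-endpoint-type Interface (or Gateway for S3), together with your own VPC, subnet, route table, and security group IDs.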

5. Check CloudWatch Logs and ECS Events

ECS troubleshooting usually spans service events, stopped task reasons, and application logs. Give Claude Code all three when you ask it to diagnose a failed deployment.

export AWS_REGION="ap-northeast-1"
export CLUSTER_NAME="myapp-cluster"
export SERVICE_NAME="myapp-service"

aws ecs describe-services \
  --cluster "${CLUSTER_NAME}" \
  --services "${SERVICE_NAME}" \
  --query "services[0].events[0:5].[createdAt,message]" \
  --output table \
  --region "${AWS_REGION}"

aws ecs list-tasks \
  --cluster "${CLUSTER_NAME}" \
  --service-name "${SERVICE_NAME}" \
  --desired-status STOPPED \
  --region "${AWS_REGION}"

aws logs tail /ecs/myapp \
  --follow \
  --since 10m \
  --region "${AWS_REGION}"

The common failure modes are: image pull blocked, secret access denied, ALB cannot reach the task security group, or the app listens on localhost instead of 0.0.0.0.
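The list-tasks call above only returns task ARNs, so a second call is needed to see why a task stopped. A minimal sketch follows, assuming the AWS CLI is configured; show_stopped_reasons is a hypothetical helper name.

```shell
# Show why recently stopped tasks of a service stopped.
# Args: cluster service region
# (show_stopped_reasons is a hypothetical helper name.)
show_stopped_reasons() {
  cluster="$1"
  service="$2"
  region="$3"

  task_arns="$(aws ecs list-tasks \
    --cluster "${cluster}" \
    --service-name "${service}" \
    --desired-status STOPPED \
    --query "taskArns" \
    --output text \
    --region "${region}")"

  if [ -z "${task_arns}" ] || [ "${task_arns}" = "None" ]; then
    echo "No stopped tasks found"
    return 0
  fi

  # Word splitting is intentional: --tasks takes a space-separated list of ARNs.
  aws ecs describe-tasks \
    --cluster "${cluster}" \
    --tasks ${task_arns} \
    --query "tasks[].[stoppedReason,containers[0].reason,containers[0].exitCode]" \
    --output table \
    --region "${region}"
}
```

Calling show_stopped_reasons myapp-cluster myapp-service ap-northeast-1 prints the stop reason, container-level reason, and exit code for each stopped task. ECS keeps stopped tasks visible only for a limited time, so run it soon after a failed deployment.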

Claude Code Implementation Prompt

Build an AWS ECS/Fargate deployment implementation for a Node.js API.

Context:
- Region: ap-northeast-1
- ECR repository: myapp
- Container port: 3000
- Health endpoint: /health
- ECS launch type: FARGATE
- Network mode: awsvpc
- Desired count: 2
- Task CPU/memory: 512 / 1024
- Secret: inject DATABASE_URL from Secrets Manager
- Logs: CloudWatch Logs /ecs/myapp, retention 30 days

Deliverables:
1. Production Dockerfile
2. Bash script to push to ECR
3. Bash script to register the ECS task definition
4. Bash script to create the Fargate service
5. Bash script to inspect CloudWatch Logs
6. Explanation that separates execution role and task role

Constraints:
- No pseudocode. Commands must be runnable with AWS CLI after variables are filled.
- Do not hardcode secret values.
- Do not expose the task directly in a public subnet.
- End with official AWS documentation points I should verify.

Official AWS Checks

AWS Fargate guide: confirm Fargate constraints and platform behavior.
Task definition parameters: confirm awsvpc, CPU/memory, and health check fields.
Task execution IAM role: confirm permissions for ECR pull, logs, and Secrets Manager.
Sending ECS logs to CloudWatch: confirm the awslogs driver setup.
AWS Fargate pricing: pricing varies by Region, vCPU, memory, extra ephemeral storage, and running time.
AWS regional service list: confirm the target Region supports every service in your design.

Pitfalls

The first pitfall is mixing up the execution role and the task role. Add ECR, logging, and secret retrieval permissions to the execution role. Add DynamoDB, S3, SQS, or application permissions to the task role.
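The secret-retrieval part of the execution role can be sketched as an inline policy scoped to a single secret. This is an illustration under stated assumptions: write_secret_policy is a hypothetical helper, the ARN passed in is assumed to be the full secret ARN including its random suffix, and a secret encrypted with a customer-managed KMS key would need an additional kms:Decrypt statement.

```shell
# Print an IAM policy document that allows reading exactly one secret.
# Args: full secret ARN (including the random suffix)
# (write_secret_policy is a hypothetical helper name.)
write_secret_policy() {
  secret_arn="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["secretsmanager:GetSecretValue"],
      "Resource": "${secret_arn}"
    }
  ]
}
EOF
}

# Placeholder ARN for illustration only.
write_secret_policy "arn:aws:secretsmanager:ap-northeast-1:123456789012:secret:prod/myapp/DATABASE_URL-AbCdEf"
```

Redirect the output to a file and attach it with aws iam put-role-policy, passing --role-name ecsTaskExecutionRole, a policy name of your choice, and --policy-document file://your-file.json. Application permissions such as DynamoDB or S3 access belong in a separate policy on the task role.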

The second is a secret in the wrong Region. The ARN in valueFrom embeds a Region, so a task definition copied to another Region keeps pointing at the original secret and its KMS key. In multi-Region systems, make the secret ARN an environment-specific deployment variable.

The third is cost. Fargate is usage-based, but ALB, NAT Gateway, CloudWatch Logs, and ECR storage also matter. For test environments, set log retention, stop unused services, and review NAT usage.
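Stopping an unused test service can be as simple as scaling it to zero. The sketch below wraps that call; stop_service is a hypothetical helper name, and the cluster and service names are the ones used earlier in this guide.

```shell
# Scale a service down to zero tasks and print the resulting desired count.
# Args: cluster service region
# (stop_service is a hypothetical helper name.)
stop_service() {
  aws ecs update-service \
    --cluster "$1" \
    --service "$2" \
    --desired-count 0 \
    --query "service.desiredCount" \
    --output text \
    --region "$3"
}
```

Calling stop_service myapp-cluster myapp-service ap-northeast-1 stops Fargate task charges for that service. The ALB and any NAT Gateway keep billing until they are deleted, so review those separately.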

The fourth is an overly strict health check. Keep /health lightweight and move deep dependency checks to /ready or a smoke test.

The fifth is image architecture. If you build ARM64 images on Apple Silicon but define the task as X86_64, startup fails. Use docker buildx build --platform linux/amd64 or align runtimePlatform.
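One way to keep the build platform and the task definition aligned is to derive the Docker platform from the cpuArchitecture value. The following is a small sketch; docker_platform_for is a hypothetical helper name.

```shell
# Map a task definition cpuArchitecture value to a Docker --platform value.
# (docker_platform_for is a hypothetical helper name.)
docker_platform_for() {
  case "$1" in
    X86_64) echo "linux/amd64" ;;
    ARM64)  echo "linux/arm64" ;;
    *)      echo "unsupported cpuArchitecture: $1" >&2; return 1 ;;
  esac
}

docker_platform_for "X86_64"   # prints linux/amd64
```

The result can be passed straight to the build, for example docker buildx build --platform "$(docker_platform_for X86_64)" -t "${IMAGE_URI}" --push . so that changing the architecture in one place does not silently break the other.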

CTA and Next Step

Solo builders can copy these scripts into a small sample API and use the free Claude Code cheatsheet to improve their prompts. Teams that need ECS, IAM, CI/CD, observability, and rollback designed together should start with Claude Code training and consultation. For reusable prompts and review material, use the ClaudeCodeLab products.

Summary

ECS/Fargate is not just a place to put Docker images. It is a combined design problem across IAM, networking, secrets, logs, health checks, and cost. Claude Code is most useful when you give it the operational rules first and ask for runnable files second.

After trying this workflow in practice, the biggest improvement came from the prompt, not from the Dockerfile. Explicitly asking Claude Code to separate execution role and task role, include CloudWatch log commands, and set health check grace periods prevented the same failure from repeating. The first run still exposed a missing secret permission, but passing ECS events and logs back to Claude Code produced the IAM fix and redeploy steps in one pass.
