Build an AI-Powered Dockerfile Optimizer Using Claude API

Feed any Dockerfile to Claude and get back a production-ready version with smaller image size, better layer caching, security fixes, and an explanation of every change.

DevOpsBoys · May 14, 2026 · 5 min read

Badly written Dockerfiles are everywhere. 1.2GB images that should be 80MB. No layer caching. Running as root. Secrets baked in. Here's how to build an AI tool that fixes all of it automatically.


What You'll Build

Input: Any Dockerfile (paste or file path)
       ↓
Claude API analyzes:
  - Base image optimization
  - Layer ordering for cache efficiency
  - Multi-stage build opportunities
  - Security issues (root user, exposed secrets)
  - Package manager cleanup
       ↓
Output:
  - Optimized Dockerfile
  - Explanation of every change
  - Estimated image size reduction

Setup

bash
pip install anthropic
export ANTHROPIC_API_KEY=sk-ant-your-key

The Optimizer Script

python
# dockerfile_optimizer.py
import anthropic
import sys
import os
from pathlib import Path
 
def read_dockerfile(path: str) -> str:
    with open(path, 'r') as f:
        return f.read()
 
 
def optimize_dockerfile(dockerfile_content: str) -> dict:
    client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    
    prompt = f"""You are a Docker expert specializing in production-grade Dockerfiles.
 
Analyze this Dockerfile and provide:
 
1. **ISSUES FOUND** — List every problem with severity (HIGH/MEDIUM/LOW):
   - Security issues (running as root, hardcoded secrets, etc.)
   - Layer caching problems (COPY before package installs, etc.)
   - Image size issues (dev dependencies included, no cleanup, wrong base image)
   - Best practice violations
 
2. **OPTIMIZED DOCKERFILE** — The improved version between ```dockerfile``` markers.
   Apply all optimizations:
   - Use multi-stage builds where appropriate
   - Order layers for maximum cache efficiency (rarely-changing first)
   - Use specific version tags (not 'latest')
   - Add non-root USER
   - Clean up package manager caches
   - Add .dockerignore recommendations as a comment
   - Use the smallest appropriate base image (alpine, distroless, slim)
 
3. **CHANGES EXPLAINED** — For each change, one line explaining WHY
 
4. **ESTIMATED SIZE REDUCTION** — Rough estimate of image size improvement
 
Here is the Dockerfile to optimize:
 
```dockerfile
{dockerfile_content}
```"""
 
    message = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        messages=[{"role": "user", "content": prompt}]
    )
    
    response_text = message.content[0].text
    
    # Extract the optimized Dockerfile: keep only the lines that sit
    # between a ```dockerfile opening fence and its closing ``` fence
    optimized = ""
    in_dockerfile = False
    
    for line in response_text.split('\n'):
        if '```dockerfile' in line.lower() and not in_dockerfile:
            in_dockerfile = True
        elif line.strip() == '```' and in_dockerfile:
            in_dockerfile = False
        elif in_dockerfile:
            optimized += line + '\n'
    
    return {
        "analysis": response_text,
        "optimized_dockerfile": optimized.strip()
    }
 
 
def save_results(original_path: str, result: dict):
    # Save optimized Dockerfile
    output_dir = Path(original_path).parent
    optimized_path = output_dir / "Dockerfile.optimized"
    analysis_path = output_dir / "dockerfile-analysis.md"
    
    with open(optimized_path, 'w') as f:
        f.write(result["optimized_dockerfile"])
    
    with open(analysis_path, 'w') as f:
        f.write("# Dockerfile Analysis\n\n")
        f.write(result["analysis"])
    
    print(f"✅ Optimized Dockerfile saved: {optimized_path}")
    print(f"📄 Full analysis saved: {analysis_path}")
 
 
def main():
    if len(sys.argv) < 2:
        print("Usage: python dockerfile_optimizer.py <path-to-Dockerfile>")
        print("       python dockerfile_optimizer.py -  (read from stdin)")
        sys.exit(1)
    
    path = sys.argv[1]
    
    if path == "-":
        print("Paste your Dockerfile (Ctrl+D when done):")
        dockerfile_content = sys.stdin.read()
        save_output = False
    else:
        if not os.path.exists(path):
            print(f"Error: File not found: {path}")
            sys.exit(1)
        dockerfile_content = read_dockerfile(path)
        save_output = True
    
    print("🔍 Analyzing Dockerfile...")
    result = optimize_dockerfile(dockerfile_content)
    
    print("\n" + "="*60)
    print(result["analysis"])
    print("="*60 + "\n")
    
    if save_output and result["optimized_dockerfile"]:
        save_results(path, result)
 
 
if __name__ == "__main__":
    main()
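
The line-by-line extraction in `optimize_dockerfile` appends every ```dockerfile block it finds, so a short snippet quoted under ISSUES FOUND would be merged into the saved file. Here is a minimal sketch of an alternative. It assumes the full optimized Dockerfile is the longest fenced block in the reply, which is a heuristic and not something the API guarantees. The helper name `extract_longest_dockerfile` is hypothetical.

```python
import re

# Matches ```dockerfile ... ``` fences; the language tag is case-insensitive.
FENCE_RE = re.compile(r"```dockerfile[^\n]*\n(.*?)```", re.DOTALL | re.IGNORECASE)


def extract_longest_dockerfile(response_text: str) -> str:
    """Return the longest ```dockerfile block in the text, or "" if none.

    Assumption: the complete optimized Dockerfile is longer than any
    snippet the model quotes elsewhere in its answer.
    """
    blocks = [block.strip() for block in FENCE_RE.findall(response_text)]
    return max(blocks, key=len, default="")
```

To try it, replace the loop in `optimize_dockerfile` with `optimized = extract_longest_dockerfile(response_text)`.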

Example: What It Does to a Bad Dockerfile

Input Dockerfile:

dockerfile
FROM node:latest
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]

Claude's analysis output:

ISSUES FOUND:
[HIGH] Using 'latest' tag — non-deterministic builds
[HIGH] Running as root — security vulnerability
[HIGH] node:latest is 1.1GB — unnecessarily large
[MEDIUM] COPY . . runs before npm install, so any file change invalidates the cache
[MEDIUM] Dev dependencies included in production image
[MEDIUM] No .dockerignore — node_modules likely copied in
[LOW] No health check defined
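
If the response follows the `[SEVERITY] message` layout shown above, the findings can be counted and used to fail a build. The sketch below assumes that exact line format, which the prompt asks for but the model is not guaranteed to produce. `count_severities` and `has_blocking_issues` are hypothetical helper names.

```python
import re
from collections import Counter

# One finding per line, e.g. "[HIGH] Running as root".
FINDING_RE = re.compile(r"^\s*\[(HIGH|MEDIUM|LOW)\]\s+(.+)$", re.MULTILINE)


def count_severities(analysis: str) -> Counter:
    """Count findings per severity label in the analysis text."""
    return Counter(level for level, _ in FINDING_RE.findall(analysis))


def has_blocking_issues(analysis: str, blocking: str = "HIGH") -> bool:
    """True if at least one finding carries the blocking severity."""
    return count_severities(analysis)[blocking] > 0


sample = """ISSUES FOUND:
[HIGH] Using 'latest' tag
[HIGH] Running as root
[MEDIUM] Dev dependencies included in production image
[LOW] No health check defined
"""
counts = count_severities(sample)
print(counts["HIGH"], counts["MEDIUM"], counts["LOW"])  # prints: 2 1 1
```

A CI step could call `has_blocking_issues(result["analysis"])` and exit non-zero when it returns `True`.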

Optimized Dockerfile:

dockerfile
# Stage 1: Build (needs dev dependencies for the build step)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build && npm prune --omit=dev

# Stage 2: Runtime
FROM node:20-alpine AS runtime
WORKDIR /app

# Create non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup

# Copy only built artifacts and production deps
COPY --from=builder --chown=appuser:appgroup /app/package*.json ./
COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist

USER appuser
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s CMD wget -qO- http://localhost:3000/health || exit 1
CMD ["node", "dist/index.js"]
 
# .dockerignore recommendations:
# node_modules
# .git
# *.md
# .env*
# tests/

Estimated size reduction: 1.1GB → ~120MB (89% smaller)
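
The size figure is the model's estimate, so it is worth measuring once both images are built. This is a rough sketch, assuming the Docker CLI is installed and both images exist locally. The tags `myapp:before` and `myapp:after` in the usage note are placeholders. `docker image inspect --format '{{.Size}}'` prints an image's size in bytes.

```python
import subprocess


def image_size_bytes(tag: str) -> int:
    """Ask the local Docker daemon for an image's size in bytes."""
    out = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Size}}", tag],
        check=True, capture_output=True, text=True,
    )
    return int(out.stdout.strip())


def percent_reduction(before: int, after: int) -> float:
    """Percentage by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# The estimate above, taking 1.1GB as 1100MB:
print(round(percent_reduction(1100, 120)))  # prints: 89
```

With real images, the call would look like `percent_reduction(image_size_bytes("myapp:before"), image_size_bytes("myapp:after"))`.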


Add a Web Interface (Optional)

python
# app.py — simple Flask web UI
from flask import Flask, request, jsonify, render_template_string
from dockerfile_optimizer import optimize_dockerfile
 
app = Flask(__name__)
 
HTML = """
<!DOCTYPE html>
<html>
<head><title>Dockerfile Optimizer</title></head>
<body>
  <h1>AI Dockerfile Optimizer</h1>
  <textarea id="input" rows="20" cols="80" placeholder="Paste your Dockerfile here..."></textarea>
  <br>
  <button onclick="optimize()">Optimize</button>
  <pre id="output"></pre>
  <script>
    async function optimize() {
      const dockerfile = document.getElementById('input').value;
      document.getElementById('output').textContent = 'Analyzing...';
      const res = await fetch('/optimize', {
        method: 'POST',
        headers: {'Content-Type': 'application/json'},
        body: JSON.stringify({dockerfile})
      });
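      const data = await res.json();
      // Show the analysis, or an error message if the server returned one
      document.getElementById('output').textContent = data.analysis || data.error || 'No analysis returned.';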
    }
  </script>
</body>
</html>
"""
 
@app.route('/')
def index():
    return render_template_string(HTML)
 
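@app.route('/optimize', methods=['POST'])
def api_optimize():
    # Reject requests without a JSON object or a non-empty "dockerfile" string
    data = request.get_json(silent=True)
    dockerfile = data.get('dockerfile') if isinstance(data, dict) else None
    if not isinstance(dockerfile, str) or not dockerfile.strip():
        return jsonify({"error": "Request must include a non-empty 'dockerfile' field."}), 400
    result = optimize_dockerfile(dockerfile)
    return jsonify(result)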
 
if __name__ == '__main__':
    app.run(debug=False, port=5000)

Add It to Your CI Pipeline

yaml
# .github/workflows/dockerfile-check.yml
name: Dockerfile Optimization Check
 
on:
  pull_request:
    paths:
      - '**/Dockerfile*'
 
jobs:
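  optimize:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write   # required by the PR comment step
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history, so origin/main...HEAD can be diffed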
      
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      
      - run: pip install anthropic
      
      - name: Analyze changed Dockerfiles
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
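        run: |
          # Start with an empty report, then append one analysis per file.
          # --diff-filter=d skips Dockerfiles that the PR deleted.
          : > analysis_output.txt
          git diff --name-only --diff-filter=d origin/main...HEAD | grep -E 'Dockerfile' | while read -r f; do
            echo "Analyzing: $f"
            python dockerfile_optimizer.py "$f" >> analysis_output.txt 2>&1
          done
          cat analysis_output.txt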
      
      - name: Comment on PR
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const output = fs.readFileSync('analysis_output.txt', 'utf8');
            // Post the analysis as Markdown; it already contains fenced code
            // blocks, so wrapping it in another fence would break rendering.
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Dockerfile Analysis\n\n${output}`
            });
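
The `grep -E 'Dockerfile'` filter in the workflow matches any path containing that word, including documentation files and the `Dockerfile.optimized` files the script writes. A stricter filter could look like the sketch below. The naming convention it assumes (`Dockerfile` or `Dockerfile.<suffix>`) and the helper name `is_dockerfile` are assumptions, so adjust them to your repository.

```python
from pathlib import PurePosixPath


def is_dockerfile(path: str) -> bool:
    """True for files named "Dockerfile" or "Dockerfile.<suffix>"."""
    name = PurePosixPath(path).name
    if name.endswith(".optimized"):
        return False  # skip files written by the optimizer itself
    return name == "Dockerfile" or name.startswith("Dockerfile.")


changed = [
    "Dockerfile",
    "api/Dockerfile.dev",
    "docs/Dockerfile-guide.md",
    "api/Dockerfile.optimized",
    "src/app.py",
]
print([p for p in changed if is_dockerfile(p)])
# prints: ['Dockerfile', 'api/Dockerfile.dev']
```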

This tool is useful in code reviews: it flags the subtle caching mistakes and security issues that reviewers tend to miss in a quick pass over a PR.

For Docker and containerization hands-on labs, KodeKloud has courses from Docker basics to multi-stage builds and container security.
