Kubernetes CronJob Running Duplicate or Concurrent Jobs: How to Fix It

Is your Kubernetes CronJob running the same job multiple times, or running jobs concurrently when it shouldn't?

Duplicate CronJob executions are among the more insidious Kubernetes problems: your job runs twice, data gets processed twice, and errors cascade. Here's how to diagnose and fix them.

Why CronJobs Duplicate

Reason 1: `concurrencyPolicy` Allows It (Default Behavior)

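concurrencyPolicy defaults to Allow, which means that if a job hasn't finished when the next scheduled time arrives, a second job starts and both run simultaneously.

Suppose your daily backup is scheduled with 0 2 * * * (2 AM) and normally takes 2 hours, so it finishes at 4 AM. That leaves plenty of headroom. But if one run hangs or slows down enough to still be going at 2 AM the next day, the new job starts anyway and two backups run side by side. The shorter the schedule interval relative to the run time, the more likely this becomes.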

Reason 2: Controller Restarts

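The CronJob controller runs inside kube-controller-manager. When the controller-manager restarts around a scheduled time, it re-evaluates the schedule from the CronJob's recorded status and can, in rare cases, create a job for a schedule time that was already handled. The Kubernetes documentation only promises that a job is created approximately once per scheduled time: occasionally two jobs, or none, may be created.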

Reason 3: Multiple Scheduler Instances

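The "scheduler" here is the CronJob controller in kube-controller-manager, not kube-scheduler. HA clusters run several controller-manager instances, but leader election keeps only one of them active at a time. During a leadership handover (for example after the leader crashes or loses network connectivity), there is a short window in which race conditions can cause duplicate job creation.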

Fix 1: Set `concurrencyPolicy: Forbid`

This prevents a new job from starting if the previous one hasn't finished:

yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid    # ← key fix
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: backup-tool:latest

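With Forbid, if the 2 AM job is still running at 2 AM the next day, the new run is skipped rather than queued. One nuance: the skipped run counts as missed, so if startingDeadlineSeconds is unset or large, the controller may still start it once the previous job finishes.

If you would rather have the new run take over than be skipped, use Replace: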

yaml

  concurrencyPolicy: Replace  # kills the running job and starts a new one

The three options:

Allow (default) — multiple concurrent executions allowed
Forbid — skip new execution if previous is still running
Replace — cancel the running job and start a new one
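
To make the difference between the three options concrete, here is a small toy simulation. It is not how the CronJob controller is implemented: the names simulate and Run are made up for illustration, time is counted in plain minutes, every run is assumed to take the same duration, and the interaction between Forbid and startingDeadlineSeconds is ignored.

```python
from dataclasses import dataclass


@dataclass
class Run:
    start: int              # minute at which the run started
    end: int                # minute at which it finished or was replaced
    replaced: bool = False  # True if a later trigger cut this run short


def simulate(policy: str, triggers: list[int], duration: int) -> list[Run]:
    """Toy model of concurrencyPolicy; each run would take `duration` minutes."""
    runs: list[Run] = []
    for t in sorted(triggers):
        still_running = [r for r in runs if r.end > t]
        if still_running and policy == "Forbid":
            continue  # this trigger is skipped
        if still_running and policy == "Replace":
            for r in still_running:
                r.end, r.replaced = t, True  # the old run is cut short
        runs.append(Run(start=t, end=t + duration))
    return runs


if __name__ == "__main__":
    # Triggers every 60 minutes, but each run needs 90 minutes.
    for policy in ("Allow", "Forbid", "Replace"):
        print(policy, simulate(policy, [0, 60, 120], duration=90))
```

With these numbers, Allow produces three overlapping runs, Forbid drops the trigger at minute 60, and Replace starts all three runs but lets only the last one finish.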

Fix 2: Add Idempotency to Your Job

Even with Forbid, controller restarts can cause double execution. Make your job idempotent so running it twice has the same effect as once.

python

# Python example: check if already processed before doing work
import os
from datetime import datetime

import redis

redis_client = redis.Redis(host="redis")


def run_backup(job_date):
    """Placeholder: replace with the real backup logic."""
    raise NotImplementedError


def main():
    # Fall back to today's date when JOB_DATE is not set
    job_date = os.environ.get("JOB_DATE", datetime.now().strftime("%Y-%m-%d"))
    lock_key = f"daily-backup:lock:{job_date}"

    # Atomically create the key only if it does not exist (expires in 24 hours)
    acquired = redis_client.set(lock_key, "running", nx=True, ex=86400)

    if not acquired:
        print(f"Job for {job_date} already running or completed. Skipping.")
        return

    try:
        run_backup(job_date)
        redis_client.set(lock_key, "completed", ex=86400)
    except Exception:
        redis_client.delete(lock_key)  # allow retry on failure
        raise


if __name__ == "__main__":
    main()

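Kubernetes does not run shell commands inside env values, so a value such as "$(date +%Y-%m-%d)" would reach the container as a literal string. Compute the date in the container's command instead, or leave JOB_DATE unset and let the script fall back to the current date:

yaml

jobTemplate:
  spec:
    template:
      spec:
        containers:
          - name: backup
            image: backup-tool:latest
            # Assumption: the image ships /bin/sh and the script is started with
            # "python /app/backup.py"; adjust to your image's real entrypoint.
            command: ["/bin/sh", "-c"]
            args:
              - 'JOB_DATE="$(date -u +%Y-%m-%d)" exec python /app/backup.py'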

Fix 3: Check for Stuck/Zombie Jobs

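Sometimes jobs appear "running" but the pods are gone. The CronJob still lists them as active, and with Forbid that blocks every later run:

bash

# List all Jobs created by this CronJob
# (assumes the jobTemplate metadata carries the label app=daily-backup)
kubectl get jobs -l app=daily-backup -n production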
 
# Check if pods are actually running
kubectl get pods -l job-name=daily-backup-12345 -n production
 
# Delete a stuck job manually (won't affect the CronJob schedule)
kubectl delete job daily-backup-12345 -n production
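
When there are many Jobs, the same check can be scripted. The sketch below assumes you have saved the output of kubectl get jobs -n production -o json and kubectl get pods -n production -o json and loaded both with json.load. The function names are made up for illustration. It reports unfinished Jobs that have no pod carrying their job-name label. Treat the result as a list of candidates to inspect, not something to delete automatically: a suspended Job, for instance, would also show up.

```python
def is_finished(job: dict) -> bool:
    """A Job is finished once it has a Complete or Failed condition set to True."""
    conditions = job.get("status", {}).get("conditions") or []
    return any(
        c.get("type") in ("Complete", "Failed") and c.get("status") == "True"
        for c in conditions
    )


def jobs_without_pods(jobs: dict, pods: dict) -> list[str]:
    """Names of unfinished Jobs that no pod references via the job-name label."""
    jobs_with_pods = {
        p["metadata"].get("labels", {}).get("job-name")
        for p in pods.get("items", [])
    }
    return [
        j["metadata"]["name"]
        for j in jobs.get("items", [])
        if not is_finished(j) and j["metadata"]["name"] not in jobs_with_pods
    ]


if __name__ == "__main__":
    # Hand-written sample data in the shape of `kubectl get ... -o json`.
    sample_jobs = {"items": [
        {"metadata": {"name": "daily-backup-12345"}, "status": {}},
        {"metadata": {"name": "daily-backup-12346"},
         "status": {"conditions": [{"type": "Complete", "status": "True"}]}},
    ]}
    sample_pods = {"items": []}
    print(jobs_without_pods(sample_jobs, sample_pods))
```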

Fix 4: `startingDeadlineSeconds`

If your cluster was down or the controller restarted and missed a scheduled time, Kubernetes can start the missed run late once it recovers. Control how late is acceptable with startingDeadlineSeconds:

yaml

spec:
  schedule: "0 2 * * *"
  startingDeadlineSeconds: 3600   # only allow starting within 1 hour of schedule time
  concurrencyPolicy: Forbid

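Without startingDeadlineSeconds there is no cutoff. If your cluster was down for 3 days and comes back at noon, the controller can immediately start a catch-up run for the most recent missed schedule, 10 hours late and possibly colliding with work you started by hand in the meantime. With startingDeadlineSeconds: 3600, a missed job is skipped if more than 1 hour has passed since its scheduled time.

Important: Do not set startingDeadlineSeconds below 10 seconds. The CronJob controller checks schedules roughly every 10 seconds, so with a smaller deadline jobs may never be started. For most cases, pick a value well above that but shorter than the schedule interval, such as the 1 hour used here for a daily job.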
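
The deadline rule can be written down as a small function. This is a simplified sketch of the decision, not the controller's actual code, and can_still_start is a made-up name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def can_still_start(scheduled: datetime, now: datetime,
                    starting_deadline_seconds: Optional[int]) -> bool:
    """True if a run scheduled at `scheduled` may still be started at `now`."""
    if now < scheduled:
        return False  # not due yet
    if starting_deadline_seconds is None:
        return True   # no deadline: a late start is always allowed
    return now - scheduled <= timedelta(seconds=starting_deadline_seconds)


if __name__ == "__main__":
    scheduled = datetime(2024, 5, 1, 2, 0, tzinfo=timezone.utc)
    back_up_at = datetime(2024, 5, 1, 3, 30, tzinfo=timezone.utc)  # 90 minutes late
    print(can_still_start(scheduled, back_up_at, 3600))  # deadline of 1 hour
    print(can_still_start(scheduled, back_up_at, None))  # no deadline set
```

In the example the controller comes back 90 minutes after the scheduled time, so the run is skipped with a 1 hour deadline and started late when no deadline is set.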

Fix 5: Add Job-Level Uniqueness with Labels

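If you're seeing two identical jobs from the same CronJob trigger (rare but possible in some cluster configurations), label the jobs so duplicates are easy to find with a selector, and cap their run time with activeDeadlineSeconds so a stray duplicate cannot run indefinitely: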

yaml

jobTemplate:
  metadata:
    labels:
      cronjob-name: daily-backup
  spec:
    # activeDeadlineSeconds kills the job if it runs too long
    activeDeadlineSeconds: 7200  # 2 hours max
    template:
      spec:
        restartPolicy: OnFailure
        containers:
          - name: backup
            image: backup-tool:latest

Debugging Current State

bash

# See recent job history
kubectl get jobs -l app=daily-backup --sort-by=.metadata.creationTimestamp
 
# Check if CronJob is firing on schedule
kubectl describe cronjob daily-backup | grep -A 20 "Events"
 
# See the last schedule time and the schedule expression
# (the next run time is not stored in the CronJob status)
kubectl get cronjob daily-backup -o jsonpath='{.status.lastScheduleTime}'
kubectl get cronjob daily-backup -o jsonpath='{.spec.schedule}'
 
# Check for currently active jobs
kubectl get cronjob daily-backup -o jsonpath='{.status.active}'
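
To confirm that runs really overlapped, you can compare start and end times from the job history. The sketch below assumes the output of kubectl get jobs -o json has been loaded into a dict. The function names are made up for illustration. It relies on the Job status fields startTime and completionTime, and on the lastTransitionTime of a Failed condition for failed Jobs. Jobs that have not finished are treated as running until now.

```python
from datetime import datetime, timezone
from itertools import combinations
from typing import Optional


def parse_time(value: Optional[str]) -> Optional[datetime]:
    """Parse an RFC 3339 timestamp such as 2024-05-01T02:00:03Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00")) if value else None


def end_time(status: dict, now: datetime) -> datetime:
    """When the Job stopped running, or `now` if it has not finished."""
    done = parse_time(status.get("completionTime"))
    if done:
        return done
    for c in status.get("conditions") or []:
        if c.get("type") == "Failed" and c.get("status") == "True":
            return parse_time(c.get("lastTransitionTime")) or now
    return now


def overlapping_jobs(jobs: dict, now: datetime) -> list[tuple[str, str]]:
    """Pairs of Job names whose run intervals overlap."""
    intervals = []
    for job in jobs.get("items", []):
        status = job.get("status", {})
        start = parse_time(status.get("startTime"))
        if start is None:
            continue  # never started, nothing to compare
        intervals.append((job["metadata"]["name"], start, end_time(status, now)))
    return [
        (a[0], b[0])
        for a, b in combinations(intervals, 2)
        if a[1] < b[2] and b[1] < a[2]
    ]


if __name__ == "__main__":
    # Hand-written sample data in the shape of `kubectl get jobs -o json`.
    sample = {"items": [
        {"metadata": {"name": "daily-backup-1"},
         "status": {"startTime": "2024-05-01T02:00:03Z",
                    "completionTime": "2024-05-01T04:10:00Z"}},
        {"metadata": {"name": "daily-backup-2"},
         "status": {"startTime": "2024-05-01T02:00:05Z",
                    "completionTime": "2024-05-01T03:55:00Z"}},
    ]}
    print(overlapping_jobs(sample, datetime.now(timezone.utc)))
```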

Complete Fixed CronJob Spec

yaml

apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-backup
  namespace: production
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid        # prevent concurrent runs
  startingDeadlineSeconds: 3600    # skip if can't start within 1 hour
  successfulJobsHistoryLimit: 7    # keep 1 week of successful job history
  failedJobsHistoryLimit: 3        # keep 3 failed job records
  jobTemplate:
    spec:
      activeDeadlineSeconds: 7200  # kill job if it runs > 2 hours (stuck protection)
      backoffLimit: 2              # retry failed pods up to 2 times
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: backup-tool:v2.1.0
              resources:
                requests:
                  memory: "256Mi"
                  cpu: "100m"
                limits:
                  memory: "512Mi"
                  cpu: "500m"

Monitoring CronJobs

Add Prometheus alerts for CronJob health:

yaml

- alert: CronJobNotRunning
  expr: time() - kube_cronjob_status_last_schedule_time{cronjob="daily-backup"} > 90000
  annotations:
    summary: "CronJob daily-backup hasn't run in 25 hours"
 
- alert: CronJobFailed
  expr: kube_job_status_failed > 0
  for: 5m
  annotations:
    summary: "Job {{ $labels.job_name }} has failed pods"

Key takeaway: concurrencyPolicy: Forbid + startingDeadlineSeconds + idempotent job logic = reliable CronJob execution.
