
GitLab CI Pipeline Keeps Failing? Here's How to Debug and Fix It

GitLab CI pipelines fail for dozens of reasons. This guide walks through the most common errors — from Docker-in-Docker issues to missing variables — and shows you exactly how to fix them.

DevOpsBoys · Mar 14, 2026 · 5 min read

Your GitLab CI pipeline is red. The job failed. The logs are cryptic. Your MR is blocked.

Sound familiar?

GitLab CI is powerful, but it fails in surprisingly consistent ways. Once you've debugged a few pipelines, you start recognizing the same patterns over and over. This guide collects the most common failure modes — and gives you a clear fix for each one.


Why GitLab CI Fails (The Real Reasons)

Pipelines don't just fail randomly. They fail because:

  • Your .gitlab-ci.yml has a syntax or logic error
  • The runner doesn't have what your job needs (Docker, env vars, tools)
  • Your scripts assume a state that doesn't exist in a clean runner
  • Network or registry access is broken
  • Resource limits (memory, CPU) are hit silently

Let's go through each one.


1. YAML Syntax Error

The most frustrating failure — because it blocks everything before a single job even starts.

yaml
# Wrong — indentation is off
build:
  stage: build
script:        # <-- should be indented under build
    - echo "building"

The fix: Use GitLab's built-in CI Lint tool at https://gitlab.com/your-group/your-repo/-/ci/lint. Paste your config and it validates it instantly.

Or run locally:

bash
# Install gitlab-ci-local for local testing
npm install -g gitlab-ci-local
 
# Run a specific job
gitlab-ci-local --job build

Common YAML gotchas:

  • Tabs vs spaces (YAML requires spaces only)
  • Wrong indentation on script:, rules:, or needs:
  • Missing stages: block when referencing custom stage names
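Once the indentation is fixed, the earlier example parses cleanly:

```yaml
# Correct: script is indented under build
build:
  stage: build
  script:
    - echo "building"
```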

2. Job Never Triggers (Wrong Rules)

You push code and the job simply doesn't appear. No error, no log — just absent.

yaml
# This job only runs on main branch
deploy:
  stage: deploy
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
  script:
    - echo "deploying"

If you push to a feature branch, this job won't run — and GitLab won't tell you why. It just won't appear.

The fix: Check your rules: or only:/except: blocks carefully.

yaml
# Better: explicit visibility
deploy:
  stage: deploy
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: on_success
    - when: never   # explicitly skip all other cases
  script:
    - ./scripts/deploy.sh

Use workflow:rules to control when pipelines create at all:

yaml
workflow:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

3. Docker-in-Docker Failing

If your pipeline builds Docker images, you've probably hit this:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock

Or:

error during connect: Post "http://docker:2375/v1.24/auth": dial tcp: ...

This happens when your runner doesn't have Docker-in-Docker (DinD) configured properly.

The fix:

yaml
build-image:
  stage: build
  image: docker:24.0
  services:
    - docker:24.0-dind    # <-- this is required
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
    DOCKER_HOST: tcp://docker:2376
    DOCKER_TLS_VERIFY: 1
    DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
  script:
    - docker info
    - docker build -t my-app:$CI_COMMIT_SHORT_SHA .
    - docker push my-app:$CI_COMMIT_SHORT_SHA

Make sure your runner's config.toml has privileged = true:

toml
[[runners]]
  [runners.docker]
    privileged = true
    volumes = ["/certs/client", "/cache"]
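If you register runners from the command line, the same settings can be passed as flags at registration time. A sketch, assuming the Docker executor and a gitlab.com registration token:

```shell
gitlab-runner register \
  --non-interactive \
  --url "https://gitlab.com/" \
  --registration-token "<your_token>" \
  --executor docker \
  --docker-image docker:24.0 \
  --docker-privileged \
  --docker-volumes "/certs/client" \
  --docker-volumes "/cache"
```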

4. Missing Environment Variables

Jobs fail silently when an expected variable isn't set:

Error: DOCKER_PASSWORD is not set

Or worse — the job runs but uses an empty string and corrupts something downstream.

The fix: Define variables at the right scope.

In GitLab UI → Settings → CI/CD → Variables:

  • Set DOCKER_USERNAME, DOCKER_PASSWORD, KUBECONFIG, etc.
  • Mark secrets as Masked and Protected (protected vars only available on protected branches)

Then reference them safely in your pipeline:

yaml
variables:
  REGISTRY: "registry.gitlab.com/mygroup/myapp"
 
push:
  stage: push
  script:
    - echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin registry.gitlab.com
    - docker push $REGISTRY:$CI_COMMIT_SHORT_SHA

Pro tip: Test variable presence in your script:

bash
#!/bin/bash
set -e
 
: "${KUBECONFIG:?KUBECONFIG is required but not set}"
: "${DOCKER_PASSWORD:?DOCKER_PASSWORD is required but not set}"
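If you keep a guard script like this in your repo (the path below is illustrative), run it before anything that depends on those variables, so the job fails fast with a clear message instead of deep inside a deploy:

```yaml
deploy:
  stage: deploy
  before_script:
    - ./scripts/check-env.sh   # fails immediately if a required variable is unset
  script:
    - ./scripts/deploy.sh
```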

5. Artifacts Not Found in Later Stages

No files or directories matching the artifact pattern were found

A job in stage 2 can't find files produced in stage 1.

yaml
# Wrong — artifacts not defined
build:
  stage: build
  script:
    - go build -o bin/app .
 
deploy:
  stage: deploy
  script:
    - ./bin/app    # This file doesn't exist here!

The fix: Declare artifacts explicitly:

yaml
build:
  stage: build
  script:
    - go build -o bin/app .
  artifacts:
    paths:
      - bin/app
    expire_in: 1 hour
 
deploy:
  stage: deploy
  needs:
    - job: build
      artifacts: true    # explicitly pull artifacts
  script:
    - chmod +x bin/app
    - ./bin/app

Use needs: instead of dependencies: for better DAG pipelines.
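With needs:, a job starts as soon as the jobs it depends on finish, instead of waiting for the entire previous stage. A minimal sketch:

```yaml
stages: [build, test]

build-api:
  stage: build
  script: [make api]

build-ui:
  stage: build
  script: [make ui]

test-api:
  stage: test
  needs: [build-api]       # starts when build-api finishes,
  script: [make test-api]  # even if build-ui is still running
```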


6. Runner Exits With Code 1 (Script Error)

Job failed: exit code 1

This is the most generic error. Your script returned a non-zero exit code.

The fix: Add set -e (exit on error) and set -x (print commands) to your scripts for better visibility:

yaml
test:
  stage: test
  script:
    - set -ex
    - npm ci
    - npm test
    - echo "Tests passed"

Common culprits:

  • npm test fails but error is swallowed
  • A command returns non-zero but you expected it to succeed
  • || true masking a real failure you care about
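To see how || true swallows a failure, and how to capture an exit code explicitly instead, here is a minimal shell sketch:

```shell
# "|| true" masks the failure: the job keeps going and the pipeline stays green
false || true
echo "after masked failure: exit code is $?"   # prints 0; the real failure is gone

# better: capture the exit code and decide what to do with it
set +e
false
status=$?
set -e
echo "captured exit code: $status"
```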

7. Cache Not Working (Slow Pipelines)

Your pipeline re-downloads 500MB of node_modules every single run.

yaml
# Wrong — no cache key, stale cache
cache:
  paths:
    - node_modules/

The fix: Use a proper cache key tied to your dependency file:

yaml
cache:
  key:
    files:
      - package-lock.json
  paths:
    - .npm/
  policy: pull-push
 
install:
  stage: install
  script:
    - npm ci --cache .npm --prefer-offline

For Python:

yaml
cache:
  key:
    files:
      - requirements.txt
  paths:
    - .pip-cache/
 
install:
  script:
    - pip install --cache-dir=.pip-cache -r requirements.txt

Quick Debugging Checklist

When a pipeline fails, go through this in order:

  1. Check the job log — read the full output, not just the last line
  2. Validate YAML — use GitLab's CI Lint tool
  3. Check variables — are all required env vars set for that branch/environment?
  4. Check runner tags — is the job assigned to a runner that exists?
  5. Check rules: — is the job supposed to run for this trigger?
  6. Check artifacts — did the previous stage produce what this stage expects?
  7. Add set -x — run with verbose shell output to trace the exact failure
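When set -x inside your script isn't enough, GitLab also supports a job-level debug trace via the CI_DEBUG_TRACE variable. Use it sparingly: it prints every command the runner executes and can expose variable values, including secrets, in the job log.

```yaml
test:
  stage: test
  variables:
    CI_DEBUG_TRACE: "true"   # full shell trace of everything the runner runs
  script:
    - npm test
```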

Useful GitLab CI Debug Commands

bash
# Validate your .gitlab-ci.yml without pushing
# (the instance-level /ci/lint endpoint was removed in GitLab 16 — use the project-scoped one)
jq -Rs '{content: .}' < .gitlab-ci.yml | curl \
  --header "PRIVATE-TOKEN: <your_token>" \
  --header "Content-Type: application/json" \
  --data @- \
  "https://gitlab.com/api/v4/projects/<project_id>/ci/lint"
 
# Trigger a pipeline manually via API
curl -X POST \
  --form "token=<trigger_token>" \
  --form "ref=main" \
  "https://gitlab.com/api/v4/projects/123/trigger/pipeline"

Learn GitLab CI in Depth

If you want to master GitLab CI/CD from basics to advanced — including runners, environments, and production deployments — KodeKloud's DevOps courses are the best structured path available. Hands-on labs, real pipelines, no fluff.


Summary

| Problem | Fix |
| --- | --- |
| YAML error | Use GitLab's CI Lint tool |
| Job not triggering | Check rules: conditions |
| Docker-in-Docker fails | Add docker:dind service + privileged = true |
| Missing env vars | Set in GitLab UI, use : "${VAR:?}" guard |
| Artifacts missing | Declare artifacts.paths + use needs.artifacts: true |
| Exit code 1 | Add set -ex to debug |
| Slow cache | Use cache.key.files with lockfile |

GitLab CI failures are almost always one of these seven things. Know the pattern, fix it fast, and get back to shipping.
