GitLab CI Pipeline Keeps Failing? Here's How to Debug and Fix It
GitLab CI pipelines fail for dozens of reasons. This guide walks through the most common errors — from Docker-in-Docker issues to missing variables — and shows you exactly how to fix them.
Your GitLab CI pipeline is red. The job failed. The logs are cryptic. Your MR is blocked.
Sound familiar?
GitLab CI is powerful, but it fails in surprisingly consistent ways. Once you've debugged a few pipelines, you start recognizing the same patterns over and over. This guide collects the most common failure modes — and gives you a clear fix for each one.
Why GitLab CI Fails (The Real Reasons)
Pipelines don't just fail randomly. They fail because:
- Your .gitlab-ci.yml has a syntax or logic error
- The runner doesn't have what your job needs (Docker, env vars, tools)
- Your scripts assume a state that doesn't exist in a clean runner
- Network or registry access is broken
- Resource limits (memory, CPU) are hit silently
Let's go through each one.
1. YAML Syntax Error
The most frustrating failure — because it blocks everything before a single job even starts.
# Wrong: indentation is off
build:
  stage: build
script: # <-- should be indented under build
  - echo "building"
The fix: Use GitLab's built-in CI Lint tool at https://gitlab.com/your-group/your-repo/-/ci/lint. Paste your config and it validates it instantly.
Or run locally:
# Install gitlab-ci-local for local testing
npm install -g gitlab-ci-local
# Run a specific job (job names are passed as positional arguments)
gitlab-ci-local build
Common YAML gotchas:
- Tabs vs spaces (YAML requires spaces only)
- Wrong indentation on script:, rules:, or needs:
- Missing stages: block when referencing custom stage names
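Tabs are hard to spot by eye, so a small check can surface them before you push. The following is a minimal sketch, not a GitLab tool: find_tabs is a hypothetical helper name, and it assumes only that grep and printf are available on the machine.

```shell
#!/bin/sh
# find_tabs FILE
# Prints "line_number:line" for every line of FILE that contains a tab.
# Returns 1 if at least one tab was found, 0 if the file is clean.
find_tabs() {
  tab=$(printf '\t')          # a literal tab character, built portably
  if grep -n "$tab" "$1"; then
    return 1
  fi
  return 0
}
```

A possible use is `find_tabs .gitlab-ci.yml || echo "replace the tabs listed above with spaces"`.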
2. Job Never Triggers (Wrong Rules)
You push code and the job simply doesn't appear. No error, no log — just absent.
# This job only runs on main branch
deploy:
  stage: deploy
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
  script:
    - echo "deploying"
If you push to a feature branch, this job won't run, and GitLab won't tell you why. It just won't appear.
The fix: Check your rules: or only:/except: blocks carefully.
# Better: explicit visibility
deploy:
  stage: deploy
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: on_success
    - when: never # explicitly skip all other cases
  script:
    - ./scripts/deploy.sh
Use workflow:rules to control whether a pipeline is created at all:
workflow:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'
3. Docker-in-Docker Failing
If your pipeline builds Docker images, you've probably hit this:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock
Or:
error during connect: Post "http://docker:2375/v1.24/auth": dial tcp: ...
This happens when your runner doesn't have Docker-in-Docker (DinD) configured properly.
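Before touching the runner, it can help to check that the job's own Docker variables agree with each other. The sketch below is a hypothetical helper (check_dind_env is not a GitLab or Docker command). It assumes the convention of the docker:dind image: the daemon listens on port 2376 with TLS when DOCKER_TLS_CERTDIR is non-empty, and on port 2375 without TLS when it is empty.

```shell
#!/bin/sh
# check_dind_env
# Compares the port in DOCKER_HOST with the TLS setting in DOCKER_TLS_CERTDIR.
# Returns 1 and prints a hint when the two disagree, 0 otherwise.
check_dind_env() {
  case "${DOCKER_HOST:-}" in
    "")
      # Without DOCKER_HOST the client falls back to its default socket.
      echo "DOCKER_HOST is not set; nothing to compare"
      return 0
      ;;
    *:2376)
      if [ -z "${DOCKER_TLS_CERTDIR:-}" ]; then
        echo "port 2376 expects TLS, but DOCKER_TLS_CERTDIR is empty"
        return 1
      fi
      ;;
    *:2375)
      if [ -n "${DOCKER_TLS_CERTDIR:-}" ]; then
        echo "port 2375 is plain TCP, but DOCKER_TLS_CERTDIR is set"
        return 1
      fi
      ;;
  esac
  echo "DOCKER_HOST and TLS settings look consistent"
}
```

Calling check_dind_env as the first script line of the failing job shows quickly whether the second error message above comes from a port and TLS mismatch.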
The fix:
build-image:
  stage: build
  image: docker:24.0
  services:
    - docker:24.0-dind # <-- this is required
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
    DOCKER_HOST: tcp://docker:2376
    DOCKER_TLS_VERIFY: 1
    DOCKER_CERT_PATH: "$DOCKER_TLS_CERTDIR/client"
  script:
    - docker info
    - docker build -t my-app:$CI_COMMIT_SHORT_SHA .
    - docker push my-app:$CI_COMMIT_SHORT_SHA
Make sure your runner's config.toml has privileged = true:
[[runners]]
  [runners.docker]
    privileged = true
    volumes = ["/certs/client", "/cache"]
4. Missing Environment Variables
Jobs fail silently when an expected variable isn't set:
Error: DOCKER_PASSWORD is not set
Or worse — the job runs but uses an empty string and corrupts something downstream.
The fix: Define variables at the right scope.
In GitLab UI → Settings → CI/CD → Variables:
- Set DOCKER_USERNAME, DOCKER_PASSWORD, KUBECONFIG, etc.
- Mark secrets as Masked and Protected (protected variables are only available on protected branches and tags)
Then reference them safely in your pipeline:
variables:
  REGISTRY: "registry.gitlab.com/mygroup/myapp"
push:
  stage: push
  script:
    - echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin registry.gitlab.com
    - docker push $REGISTRY:$CI_COMMIT_SHORT_SHA
Pro tip: Test variable presence in your script:
#!/bin/bash
set -e
: "${KUBECONFIG:?KUBECONFIG is required but not set}"
: "${DOCKER_PASSWORD:?DOCKER_PASSWORD is required but not set}"
5. Artifacts Not Found in Later Stages
No files or directories matching the artifact pattern were found
A job in stage 2 can't find files produced in stage 1.
# Wrong: artifacts not defined
build:
  stage: build
  script:
    - go build -o bin/app .
deploy:
  stage: deploy
  script:
    - ./bin/app # This file doesn't exist here!
The fix: Declare artifacts explicitly:
build:
  stage: build
  script:
    - go build -o bin/app .
  artifacts:
    paths:
      - bin/app
    expire_in: 1 hour
deploy:
  stage: deploy
  needs:
    - job: build
      artifacts: true # explicitly pull artifacts
  script:
    - chmod +x bin/app
    - ./bin/app
Prefer needs: over dependencies: when you want a DAG pipeline: the job then starts as soon as the jobs it needs have finished, instead of waiting for the whole previous stage.
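A guard at the top of the consuming job can turn a confusing "file not found" into an explicit message about the missing artifact. This is a minimal sketch under stated assumptions: require_artifact is a hypothetical helper name, and bin/app is the path from the example above.

```shell
#!/bin/sh
# require_artifact PATH
# Fails with a clear message when an expected artifact is absent,
# so the job stops before any later command trips over the missing file.
require_artifact() {
  if [ ! -e "$1" ]; then
    echo "expected artifact '$1' is missing; check artifacts:paths in the producing job" >&2
    return 1
  fi
  echo "found artifact: $1"
}
```

In the deploy job, `require_artifact bin/app` would run before `chmod +x bin/app`.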
6. Runner Exits With Code 1 (Script Error)
Job failed: exit code 1
This is the most generic error. Your script returned a non-zero exit code.
The fix: Add set -e (exit on error) and set -x (print commands) to your scripts for better visibility:
test:
  stage: test
  script:
    - set -ex
    - npm ci
    - npm test
    - echo "Tests passed"
Common culprits:
- npm test fails but error is swallowed
- A command returns non-zero but you expected it to succeed
- || true masking a real failure you care about
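The last culprit is easy to reproduce in isolation. The sketch below uses a hypothetical stand-in command, flaky_step, that always exits with code 3; the function names are illustrative and not part of GitLab or npm.

```shell
#!/bin/sh
# Hypothetical stand-in for a real command (tests, linter, migration) that fails.
flaky_step() {
  return 3
}

# Anti-pattern: "|| true" turns every failure into success,
# so the job stays green even though flaky_step broke.
run_masked() {
  flaky_step || true
  echo "exit=$?"
}

# Alternative: record the exit code, report it, then pass it on,
# so the failure shows up in the log and still fails the job.
run_reported() {
  status=0
  flaky_step || status=$?
  if [ "$status" -ne 0 ]; then
    echo "flaky_step failed with exit code $status" >&2
  fi
  return "$status"
}
```

Here run_masked prints exit=0 although the step failed, while run_reported returns 3 and leaves a line in the job log explaining why.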
7. Cache Not Working (Slow Pipelines)
Your pipeline re-downloads 500MB of node_modules every single run.
# Wrong: no cache key, stale cache
cache:
  paths:
    - node_modules/
The fix: Use a proper cache key tied to your dependency file:
cache:
  key:
    files:
      - package-lock.json
  paths:
    - .npm/
  policy: pull-push
install:
  stage: install
  script:
    - npm ci --cache .npm --prefer-offline
For Python:
cache:
  key:
    files:
      - requirements.txt
  paths:
    - .pip-cache/
install:
  script:
    - pip install --cache-dir=.pip-cache -r requirements.txt
Quick Debugging Checklist
When a pipeline fails, go through this in order:
- Check the job log: read the full output, not just the last line
- Validate YAML: use GitLab's CI Lint tool
- Check variables: are all required env vars set for that branch/environment?
- Check runner tags: is the job assigned to a runner that exists?
- Check rules: is the job supposed to run for this trigger?
- Check artifacts: did the previous stage produce what this stage expects?
- Add set -x: run with verbose shell output to trace the exact failure
Useful GitLab CI Debug Commands
# Validate your .gitlab-ci.yml without pushing (project-scoped lint endpoint; 123 is a placeholder project ID)
curl --header "PRIVATE-TOKEN: <your_token>" \
  --header "Content-Type: application/json" \
  --data @- \
  "https://gitlab.com/api/v4/projects/123/ci/lint" < <(jq -Rs '{content: .}' < .gitlab-ci.yml)
# Trigger a pipeline manually via API
curl -X POST \
  --form "token=<trigger_token>" \
  --form "ref=main" \
  "https://gitlab.com/api/v4/projects/123/trigger/pipeline"
Learn GitLab CI in Depth
If you want to master GitLab CI/CD from basics to advanced — including runners, environments, and production deployments — KodeKloud's DevOps courses are the best structured path available. Hands-on labs, real pipelines, no fluff.
Summary
| Problem | Fix |
|---|---|
| YAML error | Use GitLab CI Lint tool |
| Job not triggering | Check rules: conditions |
| Docker-in-Docker fails | Add docker:dind service + privileged = true |
| Missing env vars | Set in GitLab UI, use : "${VAR:?}" guard |
| Artifacts missing | Declare artifacts.paths + use needs.artifacts: true |
| Exit code 1 | Add set -ex to debug |
| Slow cache | Use cache.key.files with lockfile |
GitLab CI failures are almost always one of these seven things. Know the pattern, fix it fast, and get back to shipping.