CI/CD for Agent-Written Code.

Agents ship fast. CI is how you ship without shipping slopsquatted packages, leaked keys, and tests that only pass because the agent wrote both sides of the assertion.

On this page
  1. Minimum viable workflow
  2. Secret & dependency gates
  3. Lint / type / test matrix
  4. Preview environments
  5. Agent PR review checklist
  6. When to block merge
  7. Live: workflow builder
  8. Pitfalls & bad advice
CH 01

Minimum viable workflow for AI-era repos.

Pre-agent CI assumed a human wrote the diff, chose the dependencies, and ran the tests locally before opening a PR. That assumption is gone. Your pipeline now has three jobs: catch what agents get wrong automatically, show humans what changed in a runnable preview, and give reviewers a short checklist for what automation cannot judge.

The minimum viable shape for a web repo in June 2026:

  1. PR opened — security job runs first (lockfile install, audit, gitleaks, dependency review).
  2. Same PR — lint, typecheck, unit/integration tests, and build run in parallel once deps are verified.
  3. Same PR — preview deploy posts a URL (Vercel, Netlify, or host-native preview).
  4. Human review — checklist pass on dependency diffs, auth boundaries, and test contracts.
  5. Merge to main — production deploy; no new gates you did not already run on the PR.

Branch protection should list the security and quality jobs as required status checks, not optional ones that agents can dismiss. Pair this with the deploy discipline in Guide 01 and the test strategy in Guide 09. CI catches regressions; it does not replace either guide.

Stage | Runs on | Blocks merge?
Security gates | Every PR | Yes
Lint / type / test / build | Every PR | Yes
Preview deploy | Every PR (non-draft) | No, but required before human approval
E2E smoke (Playwright) | PRs touching UI routes | Yes for user-facing changes
Production deploy | Push to main | N/A (gated by PR checks above)
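
The ordering in the list and table above can be sketched as a workflow skeleton. This is a minimal sketch rather than the full pipeline: the workflow name `ci`, the concurrency group name, and the placeholder step in `quality` are assumptions, and the real steps for each job are covered in the chapters that follow.

```yaml
# .github/workflows/ci.yml (skeleton; job bodies trimmed to show ordering only)
name: ci

on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]

# Cancel superseded runs when the agent pushes again to the same PR.
concurrency:
  group: ci-${{ github.ref }}
  cancel-in-progress: true

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci --ignore-scripts

  quality:
    needs: security   # starts only after the security job succeeds
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "lint, typecheck, test, and build steps go here"
```

The `needs:` key is what puts security first. Branch protection still has to name both jobs as required checks, otherwise a red run does not stop the merge.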
CH 02

Secret & dependency gates.

These five steps are the floor from Guide 12. Run them in a dedicated security job before anything else touches node_modules. Agents add packages and paste secrets into config files with equal confidence.

.github/workflows/ci.yml — security job
jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history, so gitleaks can scan every commit in the PR

      # 1. Lockfile-strict install. No new packages. No postinstall scripts.
      - run: npm ci --ignore-scripts
      # pnpm equivalent: pnpm install --frozen-lockfile --ignore-scripts

      # 2. Known-vuln scan (npm audit / pnpm audit / yarn npm audit).
      - run: npm audit --audit-level=high
      # pnpm: pnpm audit --audit-level=high

      # 3. Secret scan on the whole diff + history reachable from the PR.
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      # 4. Dependency review: metadata on added/changed packages.
      - uses: actions/dependency-review-action@v4
        with:
          fail-on-severity: high

      # 5. Optional SAST: covers the bug classes agents produce (XSS, SQLi, hardcoded creds).
      # Uploading results requires the security-events: write permission on this job.
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript-typescript
      - uses: github/codeql-action/analyze@v3

Why --ignore-scripts matters: slopsquatting payloads typically execute in install-time lifecycle scripts such as postinstall. A lockfile-strict install refuses any package that is not pinned in the committed lockfile; ignoring scripts stops hooks on packages that are already in the lockfile from running during the install step.

GitHub's platform-side secret scanning for AI coding agents catches credentials before commit when agents use the GitHub MCP server. That is complementary to gitleaks in CI — MCP scanning is pre-commit; gitleaks is the enforced gate on every PR. You want both layers.

Do not skip dependency review on agent PRs. External agents (Copilot coding agent, Claude, Codex) can open PRs with new dependencies. dependency-review-action is often the first signal that a hallucinated package name actually resolved to a registry entry with no provenance.
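
One way to make that signal harder to miss is to have the action post its summary on the PR. The sketch below assumes the job runs on `pull_request` events and that the repository lets the workflow token write PR comments; the job name `dependency-review` is illustrative.

```yaml
jobs:
  dependency-review:
    # The action compares the PR's base and head, so skip it on other events.
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write   # needed for the summary comment
    steps:
      - uses: actions/checkout@v4
      - uses: actions/dependency-review-action@v4
        with:
          fail-on-severity: high
          comment-summary-in-pr: always
```

The comment lists added and changed packages with any known advisories. It does not judge whether a brand-new package is trustworthy, so the human check on name, maintainer, and age still applies.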

CH 03

Lint / type / test matrix.

One quality job per stack shape. Agents produce code that is syntactically valid but fails typecheck, tests that skip edge cases, and builds that only pass locally because pnpm build never ran under CI conditions. Run the same commands in CI that your AGENTS.md tells the agent to run before opening a PR.

Stack | Lint | Type | Test | Build
Next.js App Router | pnpm lint | pnpm exec tsc --noEmit | pnpm test | pnpm build
Astro | pnpm lint (if configured) | pnpm exec astro check | pnpm test | pnpm build
Static HTML / Vite SPA | pnpm exec eslint . | none (JS only) or tsc | pnpm test | pnpm build

quality job pattern
  quality:
    runs-on: ubuntu-latest
    needs: security
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
        with:
          version: 9
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: pnpm
      - run: pnpm install --frozen-lockfile --ignore-scripts
      - run: pnpm run lint
      - run: pnpm exec tsc --noEmit
      - run: pnpm test -- --run
      - run: pnpm build

Split e2e into a separate job when Playwright is in the stack. Give it the preview URL as an env var so tests hit the deployed artifact, not localhost. See Guide 09 for which tests agents should write vs which you should specify first.
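
A separate e2e workflow might look like the sketch below. It assumes the preview host creates GitHub deployments and reports a `deployment_status` event (check your host's integration), that the payload's `environment_url` field carries the preview URL, and that your Playwright config reads a variable named `PREVIEW_URL` as its base URL. The file name and variable name are hypothetical.

```yaml
# .github/workflows/e2e.yml (hypothetical file name)
name: e2e

on:
  deployment_status:

jobs:
  e2e:
    # Run only once the host reports a successful deploy.
    if: github.event.deployment_status.state == 'success'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
        with:
          version: 9
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: pnpm
      - run: pnpm install --frozen-lockfile --ignore-scripts
      - run: pnpm exec playwright install --with-deps chromium
      - run: pnpm exec playwright test
        env:
          # Assumed variable name; read it as baseURL in playwright.config.
          PREVIEW_URL: ${{ github.event.deployment_status.environment_url }}
```

Production deploys emit the same event, so you may want an extra condition on the deployment environment. The exact environment names depend on the host.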

Running npm install in CI

Agents open PRs that add a dependency to package.json without updating the lockfile. npm install in CI silently resolves new versions and hides the problem. Always use lockfile-strict installs (npm ci or pnpm install --frozen-lockfile); both exit non-zero when the lockfile is out of sync with package.json.

CH 04

Preview environments.

Green CI on a diff you cannot click is worthless for web work. Preview deploys turn every PR into something a human (and a reviewer agent) can actually use. Vercel and Netlify both wire this up from GitHub with minimal config.

Host | Trigger | PR comment | Best for
Vercel | GitHub App on PR | Automatic preview URL | Next.js, full-stack React
Netlify | GitHub App on PR | Deploy preview link | Astro, static, Jamstack
Cloudflare Pages | GitHub integration | Preview alias per branch | Static + Workers

Preview envs should use staging credentials only — never production API keys. Mirror the secret discipline from Guide 01: separate Vercel/Netlify env scopes for Preview vs Production, and block agent access to production env vars in the host dashboard.

Post the preview URL in the PR description template. Reviewers should load it before approving. If the agent changed auth, forms, or client-side routing, the preview is where broken redirects show up that unit tests miss.

CH 05

Agent PR review checklist.

CI catches mechanical failures. Humans (and targeted agent assists) catch intent failures. This is not "ask the agent if the PR looks good" — generic agent review is optimistic and misses supply-chain context. Use tool-specific security features where they exist, then apply this checklist yourself.

GitHub Copilot CLI has a dedicated /security-review slash command for on-demand vulnerability scans in the terminal — see our Copilot CLI security review coverage. That is a pre-push habit, not a merge gate. Claude Code, Codex CLI, and Cursor have no equivalent built-in security mode; run a read-only audit via prompt or the Security Review skill, then verify findings yourself.

Check | Why automation misses it | Human verifies
Lockfile matches package.json | CI fails if out of sync, but a reviewer must still read what was added | Every new dependency: name, maintainer, download count, age
No secrets in diff | gitleaks catches patterns, not context ("is this a real key?") | Rotate anything that looks live; check git history
Auth boundary unchanged or intentionally updated | SAST does not know your threat model | Middleware, RLS policies, session cookies
Tests assert behavior, not implementation | Tests pass either way | Read new tests before trusting green CI
Preview matches acceptance criteria | Build success ≠ correct UI | Click through the preview URL
.github/workflows changes | Agents edit CI to skip gates | Two human approvals; never agent-only

For agent-opened PRs from GitHub Copilot, Claude, or Codex, treat the PR body as untrusted. Read the diff, not the summary. The GitHub MCP secret scanning work helps agents avoid committing keys — it does not absolve you from reading dependency diffs.
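
The checklist rows on lockfile and .github/workflows changes can be backed by a small workflow that flags such PRs so reviewers do not overlook them. This is a sketch under assumptions: the workflow name, job name, and message text are illustrative, and it only raises a warning annotation. The two-approval rule itself still has to be enforced through branch protection.

```yaml
# .github/workflows/sensitive-paths.yml (hypothetical file name)
name: sensitive-paths

on:
  pull_request:
    paths:
      - ".github/**"
      - "package-lock.json"
      - "pnpm-lock.yaml"

jobs:
  flag:
    runs-on: ubuntu-latest
    steps:
      # Emits a warning annotation on the run; it does not fail the job.
      - run: |
          echo "::warning::This PR changes CI config or a lockfile. Two human approvals are required by policy."
```

Because the workflow only runs when those paths change, it should not be listed as a required status check; a path-filtered check that never starts would leave other PRs waiting on it.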

CH 06

When to block merge.

Branch protection exists so nobody merges around a red check at 5pm on a Friday. In agent-heavy repos, the rules need to be explicit — agents do not feel guilt about force-merging.

Condition | Action
gitleaks or secret scan hit | Block: rotate credential, rewrite history if needed
High-severity npm audit / pnpm audit | Block: patch or pin before merge
dependency-review-action flags new high-risk package | Block: human approves with documented reason
Lint, typecheck, test, or build failure | Block: no exceptions for "agent will fix follow-up"
Lockfile or workflow change without two human reviewers | Block by policy
Medium-severity audit, triaged CodeQL alert | Warn or ticket: block only if exploitable path exists
Preview deploy failed but CI green | Block human approval until preview works or change is docs-only

Configure GitHub branch protection to require status checks from security and quality (and e2e if present). Disable admin bypass for agents' service accounts. The goal is not zero red builds ever; it is zero merges that skip a gate, unless a human consciously disabled that gate.
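
The split between blocking and warn-only conditions in the table above can be expressed inside one job. A minimal sketch, assuming npm as the package manager; the job and step names are illustrative.

```yaml
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci --ignore-scripts

      # Blocking gate: a high or critical advisory fails the job.
      - name: Audit (high, blocking)
        run: npm audit --audit-level=high

      # Advisory only: moderate findings show up in the log but do not fail the job.
      - name: Audit (moderate, non-blocking)
        run: npm audit --audit-level=moderate
        continue-on-error: true
```

With `continue-on-error: true` the step is marked as failed in the log while the job still passes, so someone has to read the run or turn the finding into a ticket.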

DEMO · INTERACTIVE

Live: workflow builder.

Pick your stack and options. Copy the generated .github/workflows/ci.yml into your repo. Runs entirely in your browser.

PITFALLS

Pitfalls & bad advice.

"Copilot /security-review passed, ship it"

/security-review is Copilot CLI's on-demand terminal scan — useful, not enforced, and not a substitute for gitleaks, dependency review, or human review of lockfile diffs. Other agents have no equivalent built-in mode.

Letting the agent edit branch protection or workflows

An agent told to "fix CI" will disable the check that failed. Deny writes to .github/ in agent permissions; require two human reviewers on workflow changes.

Skipping preview deploys for "small" UI changes

Agents rename props, break responsive layouts, and drop aria labels in three-line diffs. If the PR touches JSX or CSS, load the preview.

Trusting agent-written test coverage numbers

100% line coverage on tests the agent wrote against its own code is decorative. Read the assertions. Block merge if they pin implementation details.

What to read next.

  • Guide: How to Actually Deploy Your Vibe-Coded App. Preview vs production envs, secrets, and the host matrix this pipeline deploys through.
  • Guide: Testing AI-Written Code. Contract tests, the AI-era pyramid, and which tests belong in CI.
  • Guide: Securing AI-Generated Code. Slopsquatting, prompt injection, and the security gates this CI pipeline enforces.
Changelog
  • 2026-06-15: Initial publish.