Claude Code and Playwright E2E Testing: Practical Production Guide

Asking Claude Code to “add Playwright tests” is not enough for a production site. You may get tests that pass once but depend on brittle CSS selectors, log in through the UI every time, never check a mobile viewport, and leave no trace when CI fails.

This guide treats Claude Code as a test-design partner, not just a code generator. The goal is a small Playwright suite that protects revenue paths, authenticated flows, mobile layout, code block rendering, and CI diagnostics without becoming a flaky maintenance burden.

Use the official Claude Code overview, Claude Code common workflows, and Playwright docs for locators, authentication, screenshots, Trace Viewer, retries, and CI as the baseline. For adjacent ClaudeCodeLab guides, pair this with testing strategy, CI/CD setup, and responsive design.

Pick The Flows Worth Protecting

E2E tests are slower than unit tests, so they should protect behavior that only a real browser can prove. Start with three concrete use cases:

Use case	What it protects	Playwright proof
Article to product page	Readers can move from content to `/products/`	CTA link, URL, mobile tap target
Signed-in dashboard	Authenticated users can reach protected pages	`storageState`, redirects, role-specific access
Code article layout	Code blocks and tables do not break mobile pages	Mobile screenshot, no horizontal overflow, trace

This pattern works for ClaudeCodeLab, SaaS dashboards, and ecommerce apps. If a CTA is hidden on mobile, a code block stretches the article, or an authenticated purchase page is never tested in CI, the bug is small technically but expensive commercially.

flowchart LR
  A["Revenue or signup path"] --> B["Playwright E2E"]
  C["Mobile layout risk"] --> B
  D["Pure validation"] --> E["Unit tests"]
  F["API or component boundary"] --> G["Integration tests"]
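
The routing in the flowchart can be written down as a small decision function, which is handy when reviewing a list of checks Claude Code proposes. This is a minimal sketch: the `Check` shape, its two flags, and `chooseLayer` are hypothetical names for illustration, not part of Playwright or Claude Code.

```typescript
// Hypothetical helper: decide which test layer should own a proposed check.
type Layer = 'e2e' | 'integration' | 'unit';

interface Check {
  title: string;
  needsRealBrowser: boolean;   // layout, navigation, cookies, redirects
  crossesApiBoundary: boolean; // API or component contract
}

function chooseLayer(check: Check): Layer {
  // Browser-only behavior wins: nothing cheaper can prove it.
  if (check.needsRealBrowser) return 'e2e';
  if (check.crossesApiBoundary) return 'integration';
  return 'unit';
}

const checks: Check[] = [
  { title: 'article CTA reaches /products/', needsRealBrowser: true, crossesApiBoundary: false },
  { title: 'code block fits a 390px viewport', needsRealBrowser: true, crossesApiBoundary: false },
  { title: 'signup form sends a valid payload', needsRealBrowser: false, crossesApiBoundary: true },
  { title: 'email validation rejects empty input', needsRealBrowser: false, crossesApiBoundary: false },
];

const plan = checks.map((check) => `${chooseLayer(check)}: ${check.title}`);
console.log(plan.join('\n'));
```

If most of a proposed suite lands in the `e2e` bucket, that is usually a sign the flags were answered too generously.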

Prompt Claude Code With Boundaries

Give Claude Code the target routes, selector rules, validation command, and allowed files. That reduces broad rewrites and makes the diff reviewable.

Read the existing Astro site and add Playwright E2E tests.

Goals:
- Verify that `/en/blog/claude-code-playwright-testing/` links to `/products/` and `/training/`
- Check a 390px mobile viewport for article, table, and code block overflow
- Use `storageState` for authenticated tests instead of logging in through the UI every time
- Use 2 retries on CI and `trace: "on-first-retry"`

Constraints:
- Do not use `page.waitForTimeout()`
- Prefer role, label, text, or test id locators over CSS class chains
- Only change `playwright.config.ts` and `tests/e2e/**`
- Run `npx playwright test` and explain failures with Trace Viewer evidence

The useful review question is not “did Claude Code create files?” It is “will this fail for a real user problem, and will the failure be explainable?”
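
The “allowed files” constraint is easy to verify mechanically before reading the diff. The sketch below assumes you feed it paths relative to the site directory (for example from `git diff --name-only --relative`); `isAllowedPath` and `outOfScope` are hypothetical helper names.

```typescript
// Hypothetical review helper: flag changed files outside the scope given in the prompt.
const ALLOWED_EXACT = ['playwright.config.ts'];
const ALLOWED_PREFIXES = ['tests/e2e/'];

function isAllowedPath(file: string): boolean {
  // Normalize Windows separators and a leading "./" before comparing.
  const normalized = file.replace(/\\/g, '/').replace(/^\.\//, '');
  return (
    ALLOWED_EXACT.indexOf(normalized) !== -1 ||
    ALLOWED_PREFIXES.some((prefix) => normalized.indexOf(prefix) === 0)
  );
}

function outOfScope(changedFiles: string[]): string[] {
  return changedFiles.filter((file) => !isAllowedPath(file));
}

// Example input: one file per line of the diff listing.
const changedFiles = [
  'playwright.config.ts',
  'tests/e2e/article-quality.spec.ts',
  'src/layouts/Base.astro',
];

console.log(outOfScope(changedFiles)); // logs [ 'src/layouts/Base.astro' ]
```

An out-of-scope file is not automatically wrong, but it should be a deliberate decision rather than a side effect of a broad prompt.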

Copy-Paste Setup

The following config is intentionally complete. Change BASE_URL and the preview command for your framework.

cd site
npm i -D @playwright/test
npx playwright install
mkdir -p tests/e2e

// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

const baseURL = process.env.BASE_URL ?? 'http://127.0.0.1:4321';
const hasAuth = Boolean(process.env.TEST_EMAIL && process.env.TEST_PASSWORD);
const authFile = 'playwright/.auth/user.json';

export default defineConfig({
  testDir: './tests/e2e',
  timeout: 30_000,
  expect: { timeout: 5_000 },
  fullyParallel: true,
  forbidOnly: Boolean(process.env.CI),
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 2 : undefined,
  reporter: process.env.CI ? [['html'], ['github']] : 'html',
  use: {
    baseURL,
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  ...(process.env.PLAYWRIGHT_WEB_SERVER === '1'
    ? {
        webServer: {
          command: 'npm run preview -- --host 127.0.0.1 --port 4321',
          url: baseURL,
          reuseExistingServer: !process.env.CI,
          timeout: 120_000,
        },
      }
    : {}),
  projects: [
    ...(hasAuth
      ? [
          {
            name: 'setup',
            testMatch: /.*\.setup\.ts/,
          },
        ]
      : []),
    {
      name: 'desktop-chrome',
      use: {
        ...devices['Desktop Chrome'],
        storageState: hasAuth ? authFile : undefined,
      },
      dependencies: hasAuth ? ['setup'] : [],
    },
    {
      name: 'mobile-safari',
      use: {
        ...devices['iPhone 13'],
        storageState: hasAuth ? authFile : undefined,
      },
      dependencies: hasAuth ? ['setup'] : [],
    },
  ],
});

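Local retries are off so failures stay visible during development. CI retries are on because shared runners can be noisy. What matters is that a retry never passes silently: the first retry records a trace, failed attempts keep a screenshot and video, and the HTML report marks a test that passed only on retry as flaky.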

Save Authentication State

Authenticated E2E tests become slow and flaky when every test logs in through the UI. Use a setup project to save browser state once, then reuse it. Never commit playwright/.auth; the saved file can contain cookies and local storage tokens that let anyone impersonate the test user, so add the directory to .gitignore.

// tests/e2e/auth.setup.ts
import { test as setup, expect } from '@playwright/test';
import fs from 'node:fs';
import path from 'node:path';

const authFile = path.resolve('playwright/.auth/user.json');
const email = process.env.TEST_EMAIL;
const password = process.env.TEST_PASSWORD;

setup('save signed-in browser state', async ({ page }) => {
  setup.skip(!email || !password, 'Set TEST_EMAIL and TEST_PASSWORD to record auth state.');

  await page.goto('/login');
  await page.getByLabel(/email|メール|e-mail/i).fill(email!);
  await page.getByLabel(/password|パスワード/i).fill(password!);
  await page.getByRole('button', { name: /log in|sign in|ログイン/i }).click();

  await expect(page).toHaveURL(/dashboard|account|admin/);
  await expect(page.locator('body')).toBeVisible();

  fs.mkdirSync(path.dirname(authFile), { recursive: true });
  await page.context().storageState({ path: authFile });
});

Ask Claude Code to separate “login behavior” tests from “authenticated feature” tests. That prevents one login selector change from breaking the whole E2E suite.
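
The saved state is plain JSON with a `cookies` array and an `origins` array, so a stale file can be detected before the suite starts failing with confusing redirects. The sketch below is an assumption-heavy illustration: the cookie name `session` and the helper `hasLiveSession` are hypothetical, and only the fields the check reads are typed.

```typescript
// Subset of the JSON written by storageState(); only the fields used here.
interface SavedCookie {
  name: string;
  expires: number; // Unix time in seconds; -1 marks a session cookie
}

interface SavedState {
  cookies: SavedCookie[];
}

// Hypothetical helper: true when the named cookie exists and has not expired.
function hasLiveSession(state: SavedState, cookieName: string, nowSeconds: number): boolean {
  const cookie = state.cookies.find((candidate) => candidate.name === cookieName);
  if (!cookie) return false;
  return cookie.expires === -1 || cookie.expires > nowSeconds;
}

// Example data; in practice this would be parsed from playwright/.auth/user.json.
const savedState: SavedState = {
  cookies: [
    { name: 'theme', expires: -1 },
    { name: 'session', expires: 1_900_000_000 }, // assumed cookie name
  ],
};

console.log(hasLiveSession(savedState, 'session', 1_800_000_000)); // logs true
console.log(hasLiveSession(savedState, 'session', 1_950_000_000)); // logs false
```

A check like this belongs in the setup step, not in feature tests: if the session is dead, re-record the state once instead of letting every authenticated test fail separately.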

Test Mobile Screenshots And Code Blocks

Technical articles fail in boring ways: long code lines widen the page, tables escape the viewport, and CTA links become hard to tap. This test checks the article route, captures a mobile screenshot, and fails on horizontal overflow.

// tests/e2e/article-quality.spec.ts
import { test, expect } from '@playwright/test';

const articlePath = process.env.ARTICLE_PATH ?? '/en/blog/claude-code-playwright-testing/';

test.describe('article quality checks', () => {
  test('article has monetization CTAs', async ({ page }) => {
    await page.goto(articlePath);

    await expect(page.getByRole('heading', { level: 1 })).toContainText(/Playwright|E2E|Claude Code/i);
    await expect(page.locator('a[href="/products/"], a[href="/products"]').first()).toBeVisible();
    await expect(page.locator('a[href="/training/"], a[href="/training"]').first()).toBeVisible();
  });

  test('mobile layout has no horizontal overflow', async ({ page }, testInfo) => {
    await page.setViewportSize({ width: 390, height: 844 });
    await page.goto(articlePath);
    await expect(page.locator('main, article').first()).toBeVisible();

    const overflow = await page.evaluate(() => ({
      viewport: window.innerWidth,
      documentWidth: document.documentElement.scrollWidth,
      offenders: Array.from(document.querySelectorAll('pre, table, img, iframe, .prose'))
        .filter((node) => {
          const rect = node.getBoundingClientRect();
          return rect.left < -1 || rect.right > window.innerWidth + 1;
        })
        .map((node) => {
          const rect = node.getBoundingClientRect();
          return `${node.tagName.toLowerCase()} ${Math.round(rect.left)}-${Math.round(rect.right)}`;
        }),
    }));

    // Capture the full-page screenshot first so reviewers have it even when an assertion below fails.
    await page.screenshot({ path: testInfo.outputPath('article-mobile.png'), fullPage: true });
    expect(overflow.documentWidth, JSON.stringify(overflow)).toBeLessThanOrEqual(overflow.viewport + 2);
    expect(overflow.offenders).toEqual([]);
  });

  test('code examples are present', async ({ page }) => {
    await page.goto(articlePath);

    const blocks = page.locator('pre code');
    await expect(blocks.first()).toBeVisible();
    expect(await blocks.count()).toBeGreaterThanOrEqual(3);
    await expect(blocks.nth(0)).toContainText(/playwright|defineConfig|test/i);
  });
});

The screenshot helps reviewers, while the numeric overflow assertion makes CI useful. Do not rely on visual snapshots alone unless the page is deterministic.
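
The overflow rule inside `page.evaluate()` is easier to reason about when pulled out as a pure function over rectangles. This sketch mirrors the same 1px tolerance; the `Box` type and `findOffenders` name are hypothetical, and the sample numbers are made up for illustration.

```typescript
// A rectangle as reported by getBoundingClientRect(), reduced to what the rule needs.
interface Box {
  tag: string;
  left: number;
  right: number;
}

// Same rule as the browser-side check: allow 1px of rounding slack on each edge.
function findOffenders(boxes: Box[], viewportWidth: number, tolerance = 1): string[] {
  return boxes
    .filter((box) => box.left < -tolerance || box.right > viewportWidth + tolerance)
    .map((box) => `${box.tag} ${Math.round(box.left)}-${Math.round(box.right)}`);
}

const sampleBoxes: Box[] = [
  { tag: 'pre', left: 16, right: 374 },     // fits inside a 390px viewport
  { tag: 'table', left: 16, right: 612.4 }, // spills past the right edge
  { tag: 'img', left: -0.5, right: 390.5 }, // sub-pixel rounding, tolerated
];

console.log(findOffenders(sampleBoxes, 390)); // logs [ 'table 16-612' ]
```

Because the rule measures each element's own box, a `pre` that scrolls internally with `overflow-x: auto` passes, while a table that pushes the page wider is reported with its pixel range.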

Protect Revenue And Learning CTAs

For a monetized content site, the article is not the end of the journey. Readers should be able to continue to templates, products, training, or consultation.

// tests/e2e/revenue-flows.spec.ts
import { test, expect } from '@playwright/test';

const articlePath = process.env.ARTICLE_PATH ?? '/en/blog/claude-code-playwright-testing/';

test.describe('revenue and learning flows', () => {
  test('reader can move from article to products', async ({ page }) => {
    await page.goto(articlePath);

    await page.locator('a[href="/products/"], a[href="/products"]').first().click();
    await expect(page).toHaveURL(/\/products\/?$/);
    await expect(page.locator('main').first()).toBeVisible();
  });

  test('training CTA is reachable on mobile', async ({ page }) => {
    await page.setViewportSize({ width: 390, height: 844 });
    await page.goto(articlePath);

    await page.locator('a[href="/training/"], a[href="/training"]').first().click();
    await expect(page).toHaveURL(/\/training\/?$/);
    await expect(page.locator('main').first()).toBeVisible();
  });

  test('main navigation can open the blog index', async ({ page }) => {
    await page.goto('/');

    await expect(page.getByRole('navigation').first()).toBeVisible();
    await page.getByRole('link', { name: /blog|記事|articles/i }).first().click();
    await expect(page).toHaveURL(/blog/);
  });
});

When Claude Code revises these tests, ask it where a data-testid is justified. Stable IDs are useful for checkout, logout, drag-and-drop, and translated CTAs. They are not a replacement for accessible names everywhere.

Avoid Flaky Selectors

Prefer locators that match user intent:

Priority	Selector	Example	Why
High	Role and accessible name	`page.getByRole('button', { name: /save/i })`	Matches what assistive tech sees
High	Label	`page.getByLabel(/email/i)`	Verifies form semantics
Medium	Text	`page.getByText(/Start trial/)`	Clear but affected by copy changes
Medium	Test id	`page.getByTestId('checkout-submit')`	Good for stable business actions
Low	CSS structure	`.card:nth-child(3)`	Breaks when layout changes

Also avoid page.waitForTimeout(). Use web-first assertions such as toBeVisible(), toHaveURL(), and toContainText(). If the test needs data, create the data through an API or fixture instead of waiting longer.
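
A rough text scan can catch the two most common offenders before review. The sketch below is deliberately simple and regex-based, so treat hits as prompts for a closer look rather than verdicts; the rule names and `findFlakyPatterns` are hypothetical.

```typescript
interface Finding {
  line: number;
  rule: string;
  text: string;
}

// Patterns have no global flag, so RegExp.test() keeps no state between lines.
const RULES: { rule: string; pattern: RegExp }[] = [
  { rule: 'fixed wait', pattern: /\.waitForTimeout\s*\(/ },
  { rule: 'positional CSS', pattern: /:nth-child\(|:nth-of-type\(/ },
];

function findFlakyPatterns(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { rule, pattern } of RULES) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, rule, text: text.trim() });
      }
    }
  });
  return findings;
}

// Example spec source; in practice, read each file under tests/e2e.
const sampleSpec = [
  "await page.getByRole('button', { name: /save/i }).click();",
  'await page.waitForTimeout(3000);',
  "await page.locator('.card:nth-child(3) a').click();",
].join('\n');

console.log(findFlakyPatterns(sampleSpec).map((f) => `${f.line}: ${f.rule}`));
// logs [ '2: fixed wait', '3: positional CSS' ]
```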

Debug With Trace Viewer

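Trace Viewer is the difference between “CI is red” and “the mobile menu covered the product CTA after hydration.” With trace: 'on-first-retry', CI records a trace only when a failed test is retried, so the evidence exists exactly when it is needed. To reproduce a failure locally, force tracing with --trace on, then open the HTML report or a specific trace file: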

npx playwright test --trace on
npx playwright show-report
npx playwright show-trace test-results/path-to-trace/trace.zip

When giving a failure back to Claude Code, include the failing test name, the visible state from the trace, and the expected user behavior. That steers the fix toward better selectors, missing data setup, or a real UI bug instead of a blind timeout increase.
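
Those three pieces of evidence fit a fixed template, which keeps failure handoffs consistent across a team. This is only a sketch: the `FailureReport` shape, the `toPrompt` helper, and the wording of the constraint line are assumptions, not a Claude Code format.

```typescript
interface FailureReport {
  testName: string;
  visibleState: string;     // what the trace actually shows at the failing step
  expectedBehavior: string; // what a real user should be able to do
}

// Hypothetical helper: turn trace evidence into a bounded prompt.
function toPrompt(report: FailureReport): string {
  return [
    `Failing test: ${report.testName}`,
    `Trace shows: ${report.visibleState}`,
    `Expected: ${report.expectedBehavior}`,
    'Constraints: explain the root cause first; do not raise timeouts or add page.waitForTimeout().',
  ].join('\n');
}

const prompt = toPrompt({
  testName: 'training CTA is reachable on mobile',
  visibleState: 'the mobile menu overlays the /training/ link after hydration',
  expectedBehavior: 'tapping the CTA navigates to /training/',
});

console.log(prompt);
```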

Run It In CI

# .github/workflows/playwright.yml
name: Playwright E2E

on:
  pull_request:
  push:
    branches: [main]

jobs:
  e2e:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    defaults:
      run:
        working-directory: site
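    env:
      BASE_URL: http://127.0.0.1:4321
      PLAYWRIGHT_WEB_SERVER: "1"
      # Optional: when these repository secrets are unset, both values are empty
      # and the auth setup project in playwright.config.ts is skipped.
      TEST_EMAIL: ${{ secrets.TEST_EMAIL }}
      TEST_PASSWORD: ${{ secrets.TEST_PASSWORD }}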
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
          cache-dependency-path: site/package-lock.json
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npm run build
      - run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: playwright-report
          path: site/playwright-report
          retention-days: 7

Retries do not make a flaky test healthy. They classify it. Review the HTML report, trace zip, and screenshot artifacts until the root cause is clear.
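
The classification itself is simple enough to state as code. The sketch below works on a hypothetical, simplified list of attempt statuses per test rather than on any real Playwright report format; `classify` and the `Verdict` labels are illustrative names.

```typescript
type AttemptStatus = 'passed' | 'failed' | 'timedOut';
type Verdict = 'healthy' | 'flaky' | 'broken';

// attempts[0] is the first run; later entries are retries.
function classify(attempts: AttemptStatus[]): Verdict {
  if (attempts.length === 0) throw new Error('no attempts recorded');
  const last = attempts[attempts.length - 1];
  if (last !== 'passed') return 'broken';
  // The final attempt passed; any earlier non-pass means it needed a retry.
  const neededRetry = attempts.slice(0, -1).some((status) => status !== 'passed');
  return neededRetry ? 'flaky' : 'healthy';
}

console.log(classify(['passed']));                       // logs healthy
console.log(classify(['timedOut', 'passed']));           // logs flaky
console.log(classify(['failed', 'failed', 'failed']));   // logs broken
```

A `flaky` verdict still produces a green build, which is exactly why it deserves a ticket: the trace from the failed first attempt is the evidence to start from.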

Practical Pitfalls

The first pitfall is testing everything through E2E. Keep calculation, validation, and permission edge cases in unit or integration tests. Use Playwright for paths where browser behavior matters.

The second pitfall is hiding instability with retries. A test that passes only on retry still needs investigation.

The third pitfall is using production accounts or committing auth state. Use dedicated test users with limited privileges and keep playwright/.auth out of git.

The fourth pitfall is asking Claude Code for broad fixes. A good workflow is: add the failing test, run it, inspect the trace, then make the smallest product change that passes.

For self-serve implementation, start with the ClaudeCodeLab products templates. For teams that need shared review rules, CI policy, and onboarding, the training path is the safer investment.

I tested this workflow against a local ClaudeCodeLab-style article page by checking a 390px screenshot, code block overflow, /products/ and /training/ CTA navigation, and CI retry configuration. The first useful failures were not in Playwright itself; they were long code lines and ambiguous link names. Once those were fixed, traces became a practical handoff artifact for the next Claude Code prompt.
