Advanced (Updated: 6/3/2026)

Claude Code and Playwright E2E Testing: Practical Production Guide

Use Claude Code and Playwright for E2E, mobile screenshots, auth state, traces, selectors, and CI retries.

Asking Claude Code to “add Playwright tests” is not enough for a production site. You may get tests that pass once, but depend on brittle CSS selectors, log in through the UI every time, ignore mobile screenshots, and leave no trace when CI fails.

This guide treats Claude Code as a test-design partner, not just a code generator. The goal is a small Playwright suite that protects revenue paths, authenticated flows, mobile layout, code block rendering, and CI diagnostics without becoming a flaky maintenance burden.

Use the official Claude Code overview, Claude Code common workflows, and Playwright docs for locators, authentication, screenshots, Trace Viewer, retries, and CI as the baseline. For adjacent ClaudeCodeLab guides, pair this with testing strategy, CI/CD setup, and responsive design.

Pick The Flows Worth Protecting

E2E tests are slower than unit tests, so they should protect behavior that only a real browser can prove. Start with three concrete use cases:

| Use case | What it protects | Playwright proof |
| --- | --- | --- |
| Article to product page | Readers can move from content to /products/ | CTA link, URL, mobile tap target |
| Signed-in dashboard | Authenticated users can reach protected pages | storageState, redirects, role-specific access |
| Code article layout | Code blocks and tables do not break mobile pages | Mobile screenshot, no horizontal overflow, trace |

This pattern works for ClaudeCodeLab, SaaS dashboards, and ecommerce apps. If a CTA is hidden on mobile, a code block stretches the article, or an authenticated purchase page is never tested in CI, the bug is small technically but expensive commercially.

flowchart LR
  A["Revenue or signup path"] --> B["Playwright E2E"]
  C["Mobile layout risk"] --> B
  D["Pure validation"] --> E["Unit tests"]
  F["API or component boundary"] --> G["Integration tests"]

Prompt Claude Code With Boundaries

Give Claude Code the target routes, selector rules, validation command, and allowed files. That reduces broad rewrites and makes the diff reviewable.

Read the existing Astro site and add Playwright E2E tests.

Goals:
- Verify that `/en/blog/claude-code-playwright-testing/` links to `/products/` and `/training/`
- Check a 390px mobile viewport for article, table, and code block overflow
- Use `storageState` for authenticated tests instead of logging in through the UI every time
- Use 2 retries on CI and `trace: "on-first-retry"`

Constraints:
- Do not use `page.waitForTimeout()`
- Prefer role, label, text, or test id locators over CSS class chains
- Only change `playwright.config.ts` and `tests/e2e/**`
- Run `npx playwright test` and explain failures with Trace Viewer evidence

The useful review question is not “did Claude Code create files?” It is “will this fail for a real user problem, and will the failure be explainable?”
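
One way to check the "allowed files" constraint before reading the diff is to compare the changed file list against the allowlist from the prompt. The sketch below is an illustration, not part of the original setup: `findOutOfScopeFiles` is a hypothetical helper, the matching is a plain prefix check rather than full glob support, and the file list is assumed to come from something like `git diff --name-only`.

```typescript
// Hypothetical review helper: flags changed files outside the paths
// that the prompt allowed Claude Code to touch.
const allowedExact = ['playwright.config.ts'];
const allowedPrefixes = ['tests/e2e/'];

function findOutOfScopeFiles(changedFiles: string[]): string[] {
  return changedFiles.filter((file) => {
    // Normalize Windows separators and a leading "./" so comparisons are stable.
    const normalized = file.replace(/\\/g, '/').replace(/^\.\//, '');
    const isExact = allowedExact.includes(normalized);
    const isUnderPrefix = allowedPrefixes.some((prefix) => normalized.startsWith(prefix));
    return !isExact && !isUnderPrefix;
  });
}

// Example input, e.g. the output of `git diff --name-only` split by line.
const changed = [
  'playwright.config.ts',
  'tests/e2e/article-quality.spec.ts',
  'src/components/Header.astro',
];

console.log(findOutOfScopeFiles(changed));
```

An empty result means the diff stayed inside the boundaries. Anything else is a prompt to ask why a product file changed in a test-only task.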

Copy-Paste Setup

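The following config is intentionally complete. Change BASE_URL and the preview command for your framework. The webServer block is only included when PLAYWRIGHT_WEB_SERVER=1, so local runs can target a server you already started, and the auth setup project is only added when TEST_EMAIL and TEST_PASSWORD are both set.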

cd site
npm i -D @playwright/test
npx playwright install
mkdir -p tests/e2e

// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

const baseURL = process.env.BASE_URL ?? 'http://127.0.0.1:4321';
const hasAuth = Boolean(process.env.TEST_EMAIL && process.env.TEST_PASSWORD);
const authFile = 'playwright/.auth/user.json';

export default defineConfig({
  testDir: './tests/e2e',
  timeout: 30_000,
  expect: { timeout: 5_000 },
  fullyParallel: true,
  forbidOnly: Boolean(process.env.CI),
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 2 : undefined,
  reporter: process.env.CI ? [['html'], ['github']] : 'html',
  use: {
    baseURL,
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  ...(process.env.PLAYWRIGHT_WEB_SERVER === '1'
    ? {
        webServer: {
          command: 'npm run preview -- --host 127.0.0.1 --port 4321',
          url: baseURL,
          reuseExistingServer: !process.env.CI,
          timeout: 120_000,
        },
      }
    : {}),
  projects: [
    ...(hasAuth
      ? [
          {
            name: 'setup',
            testMatch: /.*\.setup\.ts/,
          },
        ]
      : []),
    {
      name: 'desktop-chrome',
      use: {
        ...devices['Desktop Chrome'],
        storageState: hasAuth ? authFile : undefined,
      },
      dependencies: hasAuth ? ['setup'] : [],
    },
    {
      name: 'mobile-safari',
      use: {
        ...devices['iPhone 13'],
        storageState: hasAuth ? authFile : undefined,
      },
      dependencies: hasAuth ? ['setup'] : [],
    },
  ],
});

Local retries are off so failures stay visible during development. CI retries are on because shared runners can be noisy. The important part is that retry data is paired with traces, screenshots, and reports.

Save Authentication State

Authenticated E2E tests become slow and flaky when every test logs in through the UI. Use a setup project to save browser state once, then reuse it. Never commit playwright/.auth; it can contain cookies and headers that impersonate a test user.

// tests/e2e/auth.setup.ts
import { test as setup, expect } from '@playwright/test';
import fs from 'node:fs';
import path from 'node:path';

const authFile = path.resolve('playwright/.auth/user.json');
const email = process.env.TEST_EMAIL;
const password = process.env.TEST_PASSWORD;

setup('save signed-in browser state', async ({ page }) => {
  setup.skip(!email || !password, 'Set TEST_EMAIL and TEST_PASSWORD to record auth state.');

  await page.goto('/login');
  await page.getByLabel(/email|メール|e-mail/i).fill(email!);
  await page.getByLabel(/password|パスワード/i).fill(password!);
  await page.getByRole('button', { name: /log in|sign in|ログイン/i }).click();

  await expect(page).toHaveURL(/dashboard|account|admin/);
  await expect(page.locator('body')).toBeVisible();

  fs.mkdirSync(path.dirname(authFile), { recursive: true });
  await page.context().storageState({ path: authFile });
});
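
A saved state file can outlive the session it holds. As a rough guard, you can inspect the cookies in the saved JSON before reusing it. This is a minimal sketch under assumptions: `hasLiveCookie` is a hypothetical helper, the cookie name `session` is a placeholder for whatever your app sets, and only the `name` and `expires` fields of the saved cookies are modeled.

```typescript
// Minimal shape of the JSON that storageState({ path }) writes.
// Only the fields used below are declared.
interface SavedCookie {
  name: string;
  expires: number; // Unix time in seconds; -1 marks a session cookie
}

interface SavedState {
  cookies: SavedCookie[];
}

// Hypothetical helper: true when the named cookie exists and is not expired.
function hasLiveCookie(state: SavedState, cookieName: string, nowSeconds: number): boolean {
  return state.cookies.some(
    (cookie) =>
      cookie.name === cookieName &&
      (cookie.expires === -1 || cookie.expires > nowSeconds),
  );
}

// Example with an inline state object; in a real suite this would be
// JSON.parse(fs.readFileSync('playwright/.auth/user.json', 'utf8')).
const sampleState: SavedState = {
  cookies: [
    { name: 'session', expires: 1_900_000_000 },
    { name: 'old_token', expires: 1_000_000_000 },
  ],
};

const nowSeconds = 1_800_000_000;
console.log(hasLiveCookie(sampleState, 'session', nowSeconds)); // true
console.log(hasLiveCookie(sampleState, 'old_token', nowSeconds)); // false
```

A session cookie counts as live here even though the server may have invalidated it, so treat a `true` result as "worth trying", not as proof of a valid login.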

Ask Claude Code to separate “login behavior” tests from “authenticated feature” tests. That prevents one login selector change from breaking the whole E2E suite.

Test Mobile Screenshots And Code Blocks

Technical articles fail in boring ways: long code lines widen the page, tables escape the viewport, and CTA links become hard to tap. This test checks the article route, captures a mobile screenshot, and fails on horizontal overflow.

// tests/e2e/article-quality.spec.ts
import { test, expect } from '@playwright/test';

const articlePath = process.env.ARTICLE_PATH ?? '/en/blog/claude-code-playwright-testing/';

test.describe('article quality checks', () => {
  test('article has monetization CTAs', async ({ page }) => {
    await page.goto(articlePath);

    await expect(page.getByRole('heading', { level: 1 })).toContainText(/Playwright|E2E|Claude Code/i);
    await expect(page.locator('a[href="/products/"], a[href="/products"]').first()).toBeVisible();
    await expect(page.locator('a[href="/training/"], a[href="/training"]').first()).toBeVisible();
  });

  test('mobile layout has no horizontal overflow', async ({ page }, testInfo) => {
    await page.setViewportSize({ width: 390, height: 844 });
    await page.goto(articlePath);
    await expect(page.locator('main, article').first()).toBeVisible();

    const overflow = await page.evaluate(() => ({
      viewport: window.innerWidth,
      documentWidth: document.documentElement.scrollWidth,
      offenders: Array.from(document.querySelectorAll('pre, table, img, iframe, .prose'))
        .filter((node) => {
          const rect = node.getBoundingClientRect();
          return rect.left < -1 || rect.right > window.innerWidth + 1;
        })
        .map((node) => {
          const rect = node.getBoundingClientRect();
          return `${node.tagName.toLowerCase()} ${Math.round(rect.left)}-${Math.round(rect.right)}`;
        }),
    }));

    // Capture the screenshot first so reviewers get it even when an assertion below fails.
    await page.screenshot({ path: testInfo.outputPath('article-mobile.png'), fullPage: true });
    expect(overflow.documentWidth, JSON.stringify(overflow)).toBeLessThanOrEqual(overflow.viewport + 2);
    expect(overflow.offenders).toEqual([]);
  });

  test('code examples are present and copyable', async ({ page }) => {
    await page.goto(articlePath);

    const blocks = page.locator('pre code');
    await expect(blocks.first()).toBeVisible();
    expect(await blocks.count()).toBeGreaterThanOrEqual(3);
    await expect(blocks.nth(0)).toContainText(/playwright|defineConfig|test/i);
  });
});

The screenshot helps reviewers, while the numeric overflow assertion makes CI useful. Do not rely on visual snapshots alone unless the page is deterministic.
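
The overflow rule inside `page.evaluate` can also be written as a pure function, which makes the tolerance easy to unit test without a browser. This is a sketch, assuming you collect each element's bounding box first; `HorizontalBox` and `findOverflowing` are hypothetical names, not Playwright APIs.

```typescript
// Horizontal extent of one element, as returned by getBoundingClientRect().
interface HorizontalBox {
  label: string;
  left: number;
  right: number;
}

// Returns the labels of boxes that stick out of the viewport by more
// than `tolerance` pixels on either side.
function findOverflowing(boxes: HorizontalBox[], viewportWidth: number, tolerance = 1): string[] {
  return boxes
    .filter((box) => box.left < -tolerance || box.right > viewportWidth + tolerance)
    .map((box) => `${box.label} ${Math.round(box.left)}-${Math.round(box.right)}`);
}

const sampleBoxes: HorizontalBox[] = [
  { label: 'pre', left: 16, right: 374 },
  { label: 'table', left: 16, right: 520.4 },
  { label: 'img', left: -0.5, right: 390.5 },
];

console.log(findOverflowing(sampleBoxes, 390)); // [ 'table 16-520' ]
```

The one-pixel tolerance absorbs sub-pixel rounding, which is why the image at -0.5 to 390.5 passes while the table fails.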

Protect Revenue And Learning CTAs

For a monetized content site, the article is not the end of the journey. Readers should be able to continue to templates, products, training, or consultation.

// tests/e2e/revenue-flows.spec.ts
import { test, expect } from '@playwright/test';

const articlePath = process.env.ARTICLE_PATH ?? '/en/blog/claude-code-playwright-testing/';

test.describe('revenue and learning flows', () => {
  test('reader can move from article to products', async ({ page }) => {
    await page.goto(articlePath);

    await page.locator('a[href="/products/"], a[href="/products"]').first().click();
    await expect(page).toHaveURL(/\/products\/?$/);
    await expect(page.locator('main').first()).toBeVisible();
  });

  test('training CTA is reachable on mobile', async ({ page }) => {
    await page.setViewportSize({ width: 390, height: 844 });
    await page.goto(articlePath);

    await page.locator('a[href="/training/"], a[href="/training"]').first().click();
    await expect(page).toHaveURL(/\/training\/?$/);
    await expect(page.locator('main').first()).toBeVisible();
  });

  test('main navigation can open the blog index', async ({ page }) => {
    await page.goto('/');

    await expect(page.getByRole('navigation').first()).toBeVisible();
    await page.getByRole('link', { name: /blog|記事|articles/i }).first().click();
    await expect(page).toHaveURL(/blog/);
  });
});

When Claude Code revises these tests, ask it where a data-testid is justified. Stable IDs are useful for checkout, logout, drag-and-drop, and translated CTAs. They are not a replacement for accessible names everywhere.

Avoid Flaky Selectors

Prefer locators that match user intent:

| Priority | Selector | Example | Why |
| --- | --- | --- | --- |
| High | Role and accessible name | `page.getByRole('button', { name: /save/i })` | Matches what assistive tech sees |
| High | Label | `page.getByLabel(/email/i)` | Verifies form semantics |
| Medium | Text | `page.getByText(/Start trial/)` | Clear but affected by copy changes |
| Medium | Test id | `page.getByTestId('checkout-submit')` | Good for stable business actions |
| Low | CSS structure | `.card:nth-child(3)` | Breaks when layout changes |

Also avoid page.waitForTimeout(). Use web-first assertions such as toBeVisible(), toHaveURL(), and toContainText(). If the test needs data, create the data through an API or fixture instead of waiting longer.
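
These rules are easier to keep when a script flags violations before review. The following is a rough sketch, not a replacement for a real linter: `scanTestSource` and the two rule names are hypothetical, and the regular expressions only catch the most literal forms of each pattern.

```typescript
interface Finding {
  line: number; // 1-based line number in the scanned source
  rule: string;
  text: string;
}

// Hypothetical rules; extend or relax them for your own suite.
const rules: { rule: string; pattern: RegExp }[] = [
  { rule: 'no-wait-for-timeout', pattern: /\.waitForTimeout\s*\(/ },
  { rule: 'no-nth-child-selector', pattern: /:nth-child\(/ },
];

function scanTestSource(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { rule, pattern } of rules) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, rule, text: text.trim() });
      }
    }
  });
  return findings;
}

const sampleSpec = [
  "await page.goto('/');",
  'await page.waitForTimeout(3000);',
  "await page.locator('.card:nth-child(3)').click();",
  "await expect(page.getByRole('button', { name: /save/i })).toBeVisible();",
].join('\n');

for (const finding of scanTestSource(sampleSpec)) {
  console.log(`${finding.line}: ${finding.rule}`);
}
```

Running it over the files Claude Code changed gives a short list to discuss, which is usually more productive than rejecting the whole diff.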

Debug With Trace Viewer

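Trace Viewer is the difference between “CI is red” and “the mobile menu covered the product CTA after hydration.” Configure trace: 'on-first-retry' so CI records a trace only when a failed test is retried. Local retries are set to 0 in the config above, so pass --trace on when you need a trace on your own machine.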

npx playwright test --trace on
npx playwright show-report
npx playwright show-trace test-results/path-to-trace/trace.zip

When giving a failure back to Claude Code, include the failing test name, the visible state from the trace, and the expected user behavior. That steers the fix toward better selectors, missing data setup, or a real UI bug instead of a blind timeout increase.
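
If several people hand failures back to Claude Code, a small template keeps those three pieces of evidence consistent. This is only a sketch: `FailureReport` and `buildFailurePrompt` are hypothetical names, and the closing instruction is one possible wording.

```typescript
interface FailureReport {
  testName: string;
  observed: string; // what the trace shows at the failing step
  expected: string; // the user-facing behavior the test protects
  tracePath?: string;
}

function buildFailurePrompt(report: FailureReport): string {
  const lines = [
    `Failing test: ${report.testName}`,
    `Observed in trace: ${report.observed}`,
    `Expected user behavior: ${report.expected}`,
  ];
  if (report.tracePath) {
    lines.push(`Trace file: ${report.tracePath}`);
  }
  lines.push('Do not increase timeouts. Propose the smallest fix and explain the root cause.');
  return lines.join('\n');
}

console.log(
  buildFailurePrompt({
    testName: 'training CTA is reachable on mobile',
    observed: 'the mobile menu overlay covers the /training/ link after hydration',
    expected: 'tapping the training CTA at 390px navigates to /training/',
  }),
);
```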

Run It In CI

# .github/workflows/playwright.yml
name: Playwright E2E

on:
  pull_request:
  push:
    branches: [main]

jobs:
  e2e:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    defaults:
      run:
        working-directory: site
    env:
      BASE_URL: http://127.0.0.1:4321
      PLAYWRIGHT_WEB_SERVER: "1"
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
          cache-dependency-path: site/package-lock.json
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npm run build
      - run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: playwright-report
          path: |
            site/playwright-report
            site/test-results
          retention-days: 7

Retries do not make a flaky test healthy. They classify it: a test that fails and then passes on retry is reported as flaky, not as passed. Review the HTML report, trace zip, and screenshot artifacts until the root cause is clear.
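
To make that classification concrete, the sketch below models how a sequence of attempts maps to an outcome. It is a standalone illustration with hypothetical names (`classifyAttempts`, `Outcome`), not Playwright's implementation. In a custom reporter you would read the result from Playwright's own API instead.

```typescript
type AttemptStatus = 'passed' | 'failed';
type Outcome = 'passed' | 'flaky' | 'failed';

// Attempts are in run order: the first run, then each retry.
function classifyAttempts(attempts: AttemptStatus[]): Outcome {
  if (attempts.length === 0) {
    throw new Error('Expected at least one attempt');
  }
  const last = attempts[attempts.length - 1];
  if (last === 'failed') return 'failed';
  // The final attempt passed; an earlier failure means it only passed on retry.
  const failedBefore = attempts.slice(0, -1).includes('failed');
  return failedBefore ? 'flaky' : 'passed';
}

console.log(classifyAttempts(['passed'])); // passed
console.log(classifyAttempts(['failed', 'passed'])); // flaky
console.log(classifyAttempts(['failed', 'failed', 'failed'])); // failed
```

With `retries: 2`, the third case is a test that used every attempt. The second case is the one that tends to be ignored because the run ends green.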

Practical Pitfalls

The first pitfall is testing everything through E2E. Keep calculation, validation, and permission edge cases in unit or integration tests. Use Playwright for paths where browser behavior matters.

The second pitfall is hiding instability with retries. A test that passes only on retry still needs investigation.

The third pitfall is using production accounts or committing auth state. Use dedicated test users with limited privileges and keep playwright/.auth out of git.

The fourth pitfall is asking Claude Code for broad fixes. A good workflow is: add the failing test, run it, inspect the trace, then make the smallest product change that passes.

For self-serve implementation, start with the ClaudeCodeLab product templates. For teams that need shared review rules, CI policy, and onboarding, the training path is the safer investment.

I tested this workflow against a local ClaudeCodeLab-style article page by checking a 390px screenshot, code block overflow, /products/ and /training/ CTA navigation, and CI retry configuration. The first useful failures were not in Playwright itself; they were long code lines and ambiguous link names. Once those were fixed, traces became a practical handoff artifact for the next Claude Code prompt.

About the Author

Masa

Engineer focused on practical Claude Code workflows. Runs claudecode-lab.com, a 10-language technical media site.