Fixing Flaky Tests in Playwright: A Step-by-Step Guide with Examples
Playwright has quickly become the go-to browser automation framework for modern web testing. Its architecture -- with auto-waiting, browser contexts for isolation, and built-in network interception -- was specifically designed to reduce test flakiness. Yet flaky tests in Playwright remain one of the most common complaints in QA communities.
The truth is that Playwright gives you excellent tools for writing reliable tests, but it does not force you to use them correctly. This guide walks you through the most common causes of flaky Playwright tests, shows you exactly how to fix each one, and provides production-ready patterns that you can adopt immediately.
Why Playwright Tests Become Flaky
Despite Playwright's anti-flakiness design, tests still become flaky for several reasons.
The Framework Helps, But It Cannot Think for You
Playwright's auto-waiting mechanism waits for elements to be visible, enabled, and stable before interacting with them. This eliminates many common flakiness causes that plague Selenium tests. However, auto-waiting has limits. It waits for the element itself, not for the application state that the element represents.
For example, Playwright will wait for a button to become clickable, but it will not wait for the API call that the button triggers to complete. If your test clicks the button and then immediately asserts the result of the API call, you have a race condition that auto-waiting cannot prevent.
The Testing Pyramid Still Applies
Playwright tests are end-to-end tests by nature. They exercise the full stack -- browser, front-end application, API layer, and database. Every layer introduces potential variability. A test that depends on all layers being fast and available will occasionally fail when any single layer is slow or unavailable.
CI Environments Are Different from Your Laptop
Playwright tests that pass locally may fail in CI due to differences in available resources, network configurations, or display settings. CI runners typically have fewer CPU cores, less memory, and no GPU acceleration, all of which affect browser rendering speed and test reliability.
Common Causes and Fixes
Cause 1: Fragile Selectors
The most common cause of flaky Playwright tests is using selectors that are not stable across renders or code changes.
Problem: CSS class selectors that change with builds
// FLAKY: CSS class names may change with build tools (CSS modules, Tailwind)
await page.click('.btn-primary-2xl-variant-a');
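Class names generated by CSS modules or styled-components are typically hashed and can change between builds, and utility classes such as Tailwind's change whenever the styling is adjusted. A selector that works today may break tomorrow without any intentional change to the behavior under test.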
Problem: XPath selectors that depend on DOM structure
// FLAKY: Breaks if any element is added/removed in the DOM hierarchy
await page.click('/html/body/div[2]/main/div[1]/form/button[3]');
Absolute XPath selectors are brittle because any change to the DOM structure -- even adding a wrapper div for styling -- breaks the selector.
Fix: Use resilient locator strategies
Playwright provides several locator strategies designed for stability. Use them in this order of preference. Locators are lazy descriptions of how to find an element, so they are created without await; only the actions and assertions performed on them are awaited.
// BEST: Role-based locators (resilient to implementation changes)
page.getByRole('button', { name: 'Submit Order' });
// GOOD: Test ID locators (explicitly stable)
page.getByTestId('submit-order-button');
// GOOD: Text-based locators (tied to user-visible content)
page.getByText('Submit Order');
// GOOD: Label-based locators (for form inputs)
page.getByLabel('Email Address');
// GOOD: Placeholder-based locators
page.getByPlaceholder('Enter your email');
// OK: CSS selectors with data attributes
page.locator('[data-testid="submit-order"]');
// AVOID: Generated CSS classes
// AVOID: Absolute XPath
// AVOID: Positional selectors (nth-child, etc.)
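Best practice: Add data-testid attributes to your components
// Your component (markup; the button text is illustrative)
<button data-testid="submit-order">Submit Order</button>
// Your test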
await page.getByTestId('submit-order').click();
This creates a stable contract between your test and your component that is unaffected by styling changes, content changes (for non-text selectors), or DOM restructuring.
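To audit an existing suite against this preference order, a small heuristic can flag selector strings that fall into the AVOID categories. The sketch below is an illustration only: `selectorRisk` and its category names are hypothetical, not part of Playwright, and the regular expressions are deliberately coarse (every CSS class is flagged, not only generated ones).

```typescript
// Hypothetical helper for auditing selector strings; not a Playwright API.
type SelectorRisk = 'absolute-xpath' | 'positional' | 'css-class' | 'ok';

function selectorRisk(selector: string): SelectorRisk {
  const s = selector.trim();
  // Absolute XPath: starts at the document root.
  if (s.startsWith('/html') || s.startsWith('xpath=/html')) return 'absolute-xpath';
  // Positional: CSS nth-child/nth-of-type, or an XPath numeric index like div[2].
  if (/:nth-(child|of-type)\(/.test(s) || /\[\d+\]/.test(s)) return 'positional';
  // Drop attribute selectors first so values such as [href="/a.b"]
  // are not mistaken for class names.
  const withoutAttributes = s.replace(/\[[^\]]*\]/g, '');
  if (/\.[A-Za-z_-]/.test(withoutAttributes)) return 'css-class';
  return 'ok';
}
```

Running a helper like this over the selector strings in a test directory gives a rough list of candidates to migrate to role-based or test-id locators; anything it reports still needs a human look.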
Cause 2: Not Waiting for Application State
Playwright's auto-waiting handles element-level waits, but application-level state changes require explicit waiting.
Problem: Asserting before data loads
// FLAKY: Navigation completes before the API response arrives
await page.goto('https://app.example.com/dashboard');
// The dashboard is rendered but the data hasn't loaded yet
const revenue = await page.textContent('#total-revenue');
expect(revenue).toBe('$42,500');
Fix: Wait for the specific condition you are asserting
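// STABLE: Wait for the network request to complete
// Option 1: Wait for the API response
// Start waiting BEFORE navigating; otherwise a fast response can arrive
// before waitForResponse is called and the wait times out
const revenueResponse = page.waitForResponse(
  response => response.url().includes('/api/revenue') && response.status() === 200
);
await page.goto('https://app.example.com/dashboard');
await revenueResponse;
// The DOM can still lag slightly behind the response
const revenue = await page.textContent('#total-revenue');
expect(revenue).toBe('$42,500');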
// Option 2: Use Playwright's built-in assertion retries
await page.goto('https://app.example.com/dashboard');
await expect(page.locator('#total-revenue')).toHaveText('$42,500');
The second approach is preferred because expect(locator).toHaveText() is a retrying assertion. Playwright will keep checking the element's text content until it matches or the timeout expires. This is more resilient than a one-time check.
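Conceptually, a retrying assertion is a polling loop: re-read the value until it matches or a deadline passes. The sketch below models that idea in plain TypeScript. `retryUntil` and its option names are hypothetical, and this is not how Playwright implements `expect`; it is only meant to show why retrying tolerates late-arriving data where a single read does not.

```typescript
// Hypothetical model of a retrying assertion; not Playwright's implementation.
interface RetryOptions {
  timeoutMs: number;  // give up after this long
  intervalMs: number; // pause between reads
}

const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));

async function retryUntil<T>(
  read: () => T | Promise<T>,
  matches: (value: T) => boolean,
  { timeoutMs, intervalMs }: RetryOptions,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await read();
    if (matches(value)) return value; // condition met: stop polling
    if (Date.now() >= deadline) {
      throw new Error(
        `Condition not met within ${timeoutMs}ms (last value: ${String(value)})`,
      );
    }
    await sleep(intervalMs);
  }
}
```

A one-time check corresponds to calling `read()` exactly once and comparing the result, which fails whenever the data arrives after that single read.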
Cause 3: Improper Navigation Handling
Navigation-related flakiness is extremely common, especially in single-page applications where "navigation" may or may not involve an actual page load.
Problem: Clicking a link that triggers client-side routing
// FLAKY: page.waitForNavigation may or may not fire for SPA navigation
await Promise.all([
page.waitForNavigation(),
page.click('a[href="/settings"]'),
]);
In a single-page application, clicking a link may not trigger a traditional navigation event. The waitForNavigation call may hang until timeout or may resolve immediately without the page content having actually changed.
Fix: Wait for the destination content
// STABLE: Wait for the content that should appear after navigation
await page.click('a[href="/settings"]');
await expect(page.getByRole('heading', { name: 'Settings' })).toBeVisible();
This approach works regardless of whether the navigation is a full page load or a client-side route change.
Fix for actual page navigation:
// STABLE: Use waitForURL for real navigations
await page.click('a[href="/settings"]');
await page.waitForURL('**/settings');
Cause 4: Animation and Transition Interference
Modern web applications use animations extensively. Animations can interfere with Playwright's ability to click elements, read text, or take screenshots.
Problem: Clicking an element during animation
// FLAKY: The modal is animating into view, click may miss
await page.click('#modal-confirm-button');
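Playwright's actionability checks wait for the element to be visible and for its bounding box to stay the same across consecutive animation frames, but that is not the same as "finished animating." An effect that does not move the element, such as an opacity fade, can still be in progress when the click lands, and the application may not yet treat the element as fully interactive.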
Fix: Wait for animations to complete
// Option 1: Wait for the element to be stable (no ongoing animations)
// force: false is the default, so the actionability checks stay enabled
await page.locator('#modal-confirm-button').click({ force: false });
// Option 2: Disable animations entirely in tests
// The style tag applies to the current document only; re-add it after a full page load
await page.addStyleTag({
  content: `
    *, *::before, *::after {
      animation-duration: 0s !important;
      animation-delay: 0s !important;
      transition-duration: 0s !important;
      transition-delay: 0s !important;
    }
  `,
});
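// Option 3: Configure in playwright.config.ts
// playwright.config.ts
import { defineConfig } from '@playwright/test';
export default defineConfig({
  use: {
    // Reduce motion to avoid animation flakiness.
    // reducedMotion is a browser context option, so it is passed via contextOptions.
    contextOptions: { reducedMotion: 'reduce' },
  },
});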
Disabling animations in tests is a widely recommended practice. It eliminates an entire category of flakiness with minimal impact on test coverage.
Cause 5: Viewport and Layout Sensitivity
Tests that depend on specific viewport sizes or responsive behavior are vulnerable to flakiness when the viewport varies between environments.
Problem: Element hidden on smaller viewports
// FLAKY: The sidebar navigation might be collapsed on the CI runner's viewport
await page.click('#sidebar-menu-item-settings');
Fix: Set explicit viewport sizes
// playwright.config.ts
export default defineConfig({
use: {
viewport: { width: 1280, height: 720 },
},
});
// Or per-test:
test('desktop navigation', async ({ page }) => {
await page.setViewportSize({ width: 1280, height: 720 });
// Now the sidebar is guaranteed to be visible
await page.click('#sidebar-menu-item-settings');
});
Cause 6: Network Timing Variability
Tests that depend on real network requests are inherently flaky because network timing varies.
Problem: Test depends on real API responses
// FLAKY: API response time varies, may exceed default timeout
test('loads user profile', async ({ page }) => {
await page.goto('/profile');
await expect(page.getByText('John Doe')).toBeVisible();
});
Fix: Mock network requests for deterministic behavior
// STABLE: Mock API responses for consistent behavior
test('loads user profile', async ({ page }) => {
// Intercept the API call and return a mock response
await page.route('**/api/profile', async route => {
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({
name: 'John Doe',
email: 'john@example.com',
role: 'Admin'
}),
});
});
await page.goto('/profile');
await expect(page.getByText('John Doe')).toBeVisible();
});
Network mocking makes tests faster and more reliable. The trade-off is that you are not testing the real API integration -- but that should be covered by separate API-level tests, not by every UI test.
When you do need real network requests:
// Use increased timeouts and proper waiting
test('loads user profile from real API', async ({ page }) => {
  // Start waiting before navigating so a fast response is not missed
  const responsePromise = page.waitForResponse(
    resp => resp.url().includes('/api/profile') && resp.status() === 200,
    { timeout: 15000 }
  );
  await page.goto('/profile');
  // Wait for the specific API call to complete
  await responsePromise;
  // Now assert
  await expect(page.getByText('John Doe')).toBeVisible();
});
Cause 7: File Upload and Download Timing
File operations are a common source of flakiness because they depend on filesystem timing.
Problem: Asserting download before it completes
// FLAKY: Download might not be complete when we check the file
test('downloads report', async ({ page }) => {
await page.click('#download-report');
// File might not be on disk yet!
expect(fs.existsSync('/downloads/report.csv')).toBe(true);
});
Fix: Use Playwright's download handling
// STABLE: Wait for the download event
import * as fs from 'node:fs';
test('downloads report', async ({ page }) => {
const downloadPromise = page.waitForEvent('download');
await page.click('#download-report');
const download = await downloadPromise;
// Wait for download to complete
const path = await download.path();
expect(path).toBeTruthy();
// Verify file content
const content = fs.readFileSync(path, 'utf-8');
expect(content).toContain('Revenue');
});
Cause 8: Popup and Dialog Handling
Dialogs and popups must be handled before they appear. If you set up a dialog handler after the action that triggers the dialog, you have a race condition.
Problem: Dialog handler set up too late
// FLAKY: The dialog fires before the handler is registered
test('confirms deletion', async ({ page }) => {
await page.click('#delete-account');
// Too late! With no listener registered, Playwright already auto-dismissed the dialog
page.on('dialog', dialog => dialog.accept());
});
Fix: Set up the handler before the triggering action
// STABLE: Handler is ready before the dialog appears
test('confirms deletion', async ({ page }) => {
// Set up handler FIRST
page.on('dialog', dialog => dialog.accept());
// Then trigger the action
await page.click('#delete-account');
// Wait for the result
await expect(page.getByText('Account deleted')).toBeVisible();
});
// Alternative: Use once() for a one-time handler
test('confirms deletion (one-time handler)', async ({ page }) => {
page.once('dialog', dialog => dialog.accept());
await page.click('#delete-account');
await expect(page.getByText('Account deleted')).toBeVisible();
});
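The ordering problem is not specific to dialogs: any event emitted before a listener is attached is missed by that listener. The following sketch shows this with a tiny in-memory emitter. `TinyEmitter` and `runScenario` are hypothetical stand-ins written for illustration and do not model Playwright's internals.

```typescript
// Hypothetical minimal emitter used only to demonstrate listener ordering.
type Listener<T> = (payload: T) => void;

class TinyEmitter<T> {
  private listeners: Listener<T>[] = [];

  on(listener: Listener<T>): void {
    this.listeners.push(listener);
  }

  emit(payload: T): void {
    // Only listeners registered before this call see the event.
    for (const listener of this.listeners) listener(payload);
  }
}

// Returns the messages the handler actually received.
function runScenario(registerFirst: boolean): string[] {
  const handled: string[] = [];
  const emitter = new TinyEmitter<string>();
  const handler = (message: string): void => {
    handled.push(message);
  };

  if (registerFirst) emitter.on(handler);
  emitter.emit('Delete account?'); // the "triggering action"
  if (!registerFirst) emitter.on(handler); // too late: the event is gone

  return handled;
}
```

`runScenario(true)` returns the one emitted message, while `runScenario(false)` returns an empty array, which mirrors the difference between the stable and flaky tests in this section.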
Playwright Configuration for Maximum Reliability
Your playwright.config.ts plays a crucial role in test reliability. Here is a configuration optimized for stability.
import { defineConfig, devices } from '@playwright/test';
export default defineConfig({
// Retry failed tests automatically
retries: process.env.CI ? 2 : 0,
// Run tests in parallel, but limit workers in CI
workers: process.env.CI ? 2 : undefined,
fullyParallel: true,
// Fail the build if any test.only() is left in the code
forbidOnly: !!process.env.CI,
// Global timeout for each test
timeout: 30_000,
// Assertion timeout (for expect() retrying assertions)
expect: {
timeout: 10_000,
},
// Reporter configuration
reporter: process.env.CI
? [['html'], ['junit', { outputFile: 'test-results.xml' }]]
: [['html']],
use: {
// Base URL for all tests
baseURL: process.env.BASE_URL || 'http://localhost:3000',
// Consistent viewport
viewport: { width: 1280, height: 720 },
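// Reduce motion to eliminate animation flakiness
// (reducedMotion is a browser context option, passed via contextOptions)
contextOptions: { reducedMotion: 'reduce' },
// Record a trace when a failed test is retried, for debugging
trace: 'on-first-retry',
// Capture screenshot on failure
screenshot: 'only-on-failure',
// Record video when a failed test is retried
video: 'on-first-retry',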
// Navigation timeout
navigationTimeout: 15_000,
// Action timeout (click, fill, etc.)
actionTimeout: 10_000,
},
projects: [
{
name: 'chromium',
use: { ...devices['Desktop Chrome'] },
},
{
name: 'firefox',
use: { ...devices['Desktop Firefox'] },
},
{
name: 'webkit',
use: { ...devices['Desktop Safari'] },
},
],
// Start the dev server before running tests
webServer: {
command: 'npm run start',
url: 'http://localhost:3000',
reuseExistingServer: !process.env.CI,
timeout: 120_000,
},
});
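With retries enabled, a test that fails and then passes on a retry is reported by Playwright as flaky rather than simply passed. That classification can be sketched as follows; `classifyOutcome` is a hypothetical helper written for illustration, not a Playwright API.

```typescript
type Outcome = 'passed' | 'flaky' | 'failed';

// attempts[i] is true when attempt i passed; attempts[0] is the first run.
function classifyOutcome(attempts: boolean[]): Outcome {
  if (attempts.length === 0) {
    throw new Error('At least one attempt is required');
  }
  if (attempts[0]) return 'passed'; // passed on the first run
  // Failed first; any later pass means the result depended on the retry.
  return attempts.some(passed => passed) ? 'flaky' : 'failed';
}
```

Keeping the flaky category visible matters: retries hide the red build, but the count of flaky results is the signal that a test still needs one of the fixes described earlier.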
Key Configuration Decisions Explained
- retries: 2 in CI: Automatically retries failed tests up to 2 times. This provides resilience against environmental flakiness while still surfacing consistently broken tests.
- workers: 2 in CI: Limits parallelism in CI to reduce resource contention. Too many parallel browsers on a limited CI runner causes out-of-memory errors and timeouts.
- trace: 'on-first-retry': Captures a trace only when a test fails and is being retried. This provides debugging information without the performance overhead of tracing every test.
- reducedMotion: 'reduce': Tells the browser to report a preference for reduced motion. Applications that honor the prefers-reduced-motion media query disable or shorten their animations, which removes that source of flakiness; applications that ignore the preference are unaffected.
Advanced Anti-Flakiness Patterns
Pattern 1: Page Object Model with Built-In Waits
Encapsulate page interactions in Page Objects that include appropriate waits.
// pages/checkout-page.ts
import { Page, Locator, expect } from '@playwright/test';
export class CheckoutPage {
private page: Page;
private couponInput: Locator;
private applyCouponButton: Locator;
private discountLabel: Locator;
private orderTotal: Locator;
private submitButton: Locator;
private confirmationHeading: Locator;
constructor(page: Page) {
this.page = page;
this.couponInput = page.getByLabel('Coupon Code');
this.applyCouponButton = page.getByRole('button', { name: 'Apply Coupon' });
this.discountLabel = page.getByTestId('discount-amount');
this.orderTotal = page.getByTestId('order-total');
this.submitButton = page.getByRole('button', { name: 'Place Order' });
this.confirmationHeading = page.getByRole('heading', { name: 'Order Confirmed' });
}
async goto() {
await this.page.goto('/checkout');
// Wait for the page to be fully loaded (not just navigated)
await expect(this.orderTotal).toBeVisible();
}
async applyCoupon(code: string) {
await this.couponInput.fill(code);
await this.applyCouponButton.click();
// Wait for the discount to be applied (API call completes)
await expect(this.discountLabel).not.toHaveText('$0.00');
}
async getOrderTotal(): Promise<string> {
return (await this.orderTotal.textContent()) ?? '';
}
async placeOrder() {
await this.submitButton.click();
// Wait for order confirmation (full round-trip to server)
await expect(this.confirmationHeading).toBeVisible({ timeout: 15000 });
}
}
// tests/checkout.spec.ts
import { test, expect } from '@playwright/test';
import { CheckoutPage } from '../pages/checkout-page';
test('apply coupon reduces total', async ({ page }) => {
const checkout = new CheckoutPage(page);
await checkout.goto();
await checkout.applyCoupon('SAVE20');
const total = await checkout.getOrderTotal();
expect(parseFloat(total.replace('$', ''))).toBeLessThan(100);
});
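The test above strips only the dollar sign before calling `parseFloat`, which stops at the first character it cannot parse, so a total such as `$1,234.50` would be read as `1`. A slightly more defensive parser is sketched below, assuming US-style formatting (comma as thousands separator, dot as decimal point). `parseCurrency` is a hypothetical helper name.

```typescript
// Hypothetical helper; assumes US-style amounts such as "$1,234.50".
function parseCurrency(text: string): number {
  // Keep digits, the decimal point, and a minus sign; drop everything else.
  const cleaned = text.replace(/[^0-9.-]/g, '');
  const value = Number(cleaned);
  if (cleaned === '' || Number.isNaN(value)) {
    throw new Error(`Cannot parse currency value from "${text}"`);
  }
  return value;
}
```

In the checkout test this would replace the `parseFloat(total.replace('$', ''))` expression, and a malformed total then fails loudly instead of comparing against a silently truncated number.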
Pattern 2: API State Setup
Instead of using the UI to set up test state (slow and flaky), use API calls directly.
// helpers/api.ts
import { APIRequestContext } from '@playwright/test';
export async function createTestUser(request: APIRequestContext) {
const response = await request.post('/api/users', {
data: {
name: 'Test User',
email: `test-${Date.now()}@example.com`,
password: 'SecurePass123!',
},
});
return response.json();
}
export async function seedProductCatalog(request: APIRequestContext) {
await request.post('/api/admin/seed', {
data: { catalog: 'test-products' },
});
}
// tests/shopping.spec.ts
import { test, expect } from '@playwright/test';
import { createTestUser, seedProductCatalog } from '../helpers/api';
test.beforeEach(async ({ request }) => {
await seedProductCatalog(request);
});
test('user can add product to cart', async ({ page, request }) => {
const user = await createTestUser(request);
// Login via API (fast) instead of UI (slow and flaky)
await page.goto('/');
await page.evaluate((token) => {
localStorage.setItem('auth_token', token);
}, user.token);
await page.goto('/products');
await page.getByRole('button', { name: 'Add to Cart' }).first().click();
await expect(page.getByTestId('cart-count')).toHaveText('1');
});
Pattern 3: Network Interception for Slow Endpoints
For endpoints that are slow or unreliable, intercept and mock them while letting other requests pass through.
test('dashboard loads with mixed real and mocked data', async ({ page }) => {
// Mock the slow analytics endpoint
await page.route('**/api/analytics/**', async route => {
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({
visitors: 1500,
pageViews: 4200,
bounceRate: 0.35,
}),
});
});
// Let all other API calls go through to the real server
// (no route set up = passes through)
await page.goto('/dashboard');
await expect(page.getByText('1,500 visitors')).toBeVisible();
});
Pattern 4: Retry Logic for Known Unstable Operations
For operations that are inherently unstable (e.g., third-party widget loading), use explicit retry logic.
import { test, expect, Page } from '@playwright/test';
async function waitForThirdPartyWidget(page: Page, maxRetries = 3) {
for (let attempt = 1; attempt <= maxRetries; attempt++) {
try {
await page.waitForSelector('#third-party-widget iframe', {
state: 'attached',
timeout: 5000,
});
return; // Success
} catch (error) {
if (attempt === maxRetries) throw error;
console.log(`Widget load attempt ${attempt} failed, retrying...`);
await page.reload();
}
}
}
test('interacts with third-party payment widget', async ({ page }) => {
await page.goto('/checkout/payment');
await waitForThirdPartyWidget(page);
const widgetFrame = page.frameLocator('#third-party-widget iframe');
await widgetFrame.getByLabel('Card Number').fill('4242424242424242');
});
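The same retry-then-recover shape can be factored into a generic helper so that the recovery step (here, a page reload) is passed in rather than hard-coded. This is a sketch under assumptions: `withRetries` and its parameters are hypothetical names, and the helper has no dependency on Playwright.

```typescript
// Hypothetical generic retry helper; `recover` runs between failed attempts.
async function withRetries<T>(
  operation: () => Promise<T>,
  recover: (attempt: number, error: unknown) => Promise<void>,
  maxAttempts = 3,
): Promise<T> {
  if (maxAttempts < 1) throw new RangeError('maxAttempts must be at least 1');
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Only recover when another attempt will follow.
      if (attempt < maxAttempts) await recover(attempt, error);
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

In a Playwright test the operation would be the wait for the widget iframe and the recovery step an async function that reloads the page. Reserve this for operations outside your control; wrapping ordinary assertions in retries tends to hide real bugs.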
Debugging Flaky Playwright Tests
When you encounter a flaky test, Playwright provides excellent debugging tools.
Using Traces
Traces capture a complete record of what happened during a test run, including screenshots, DOM snapshots, network requests, and console logs.
# Run tests with trace enabled
npx playwright test --trace on
# View the trace
npx playwright show-trace test-results/my-test/trace.zip
The trace viewer shows a timeline of every action, assertion, and network request. You can step through the test and see exactly what the page looked like at each point, which is invaluable for understanding why a flaky test failed.
Using the Playwright Inspector
For interactive debugging, use the Playwright Inspector.
# Run tests with the inspector
npx playwright test --debug
# Or set PWDEBUG environment variable
PWDEBUG=1 npx playwright test
Analyzing Failure Screenshots
Configure Playwright to capture screenshots on failure (which the recommended config above does). Compare failure screenshots across multiple flaky failures to identify visual patterns.
Integrating DeFlaky with Playwright
DeFlaky integrates with Playwright's JUnit reporter to track test reliability over time. This helps you identify which Playwright tests are flaky and prioritize fixes.
// playwright.config.ts
export default defineConfig({
reporter: [
['html'],
['junit', { outputFile: 'playwright-results.xml' }],
],
// ... other config
});
# After running tests, analyze results with DeFlaky
npx playwright test
deflaky analyze --input playwright-results.xml --format junit
# View flakiness trends on the dashboard
deflaky dashboard --open
DeFlaky tracks each test's pass/fail rate across runs and surfaces tests whose flakiness rate exceeds your configured threshold. For Playwright tests specifically, it can correlate flakiness with browser type, helping you identify tests that are only flaky in specific browsers.
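The general idea of a flakiness rate can be illustrated independently of any tool. The sketch below is not DeFlaky's algorithm or data model, which this guide does not specify; `RunRecord`, `flakyTests`, and the threshold semantics are assumptions made for illustration. It treats a test as flaky when it has both passed and failed across runs, and reports those whose failure rate meets a threshold.

```typescript
// Hypothetical data model: one record per test per run.
interface RunRecord { test: string; passed: boolean; }
interface FlakyTest { test: string; failureRate: number; }

function flakyTests(runs: RunRecord[], threshold: number): FlakyTest[] {
  const stats = new Map<string, { passes: number; failures: number }>();
  for (const run of runs) {
    const entry = stats.get(run.test) ?? { passes: 0, failures: 0 };
    if (run.passed) entry.passes += 1;
    else entry.failures += 1;
    stats.set(run.test, entry);
  }

  const result: FlakyTest[] = [];
  stats.forEach(({ passes, failures }, test) => {
    // Always-passing and always-failing tests are consistent, not flaky.
    if (passes === 0 || failures === 0) return;
    const failureRate = failures / (passes + failures);
    if (failureRate >= threshold) result.push({ test, failureRate });
  });

  // Worst offenders first.
  return result.sort((a, b) => b.failureRate - a.failureRate);
}
```

Excluding tests that always fail is a deliberate choice in this sketch: those are broken rather than flaky and call for a different kind of fix.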
A Reliability Checklist for Playwright Tests
Use this checklist when writing or reviewing Playwright tests.
Selectors
- [ ] Use role-based or test-id selectors instead of CSS classes
- [ ] Avoid absolute XPath selectors
- [ ] Avoid positional selectors (nth-child, nth-of-type)
- [ ] Add data-testid attributes to interactive elements
Waiting
- [ ] Use retrying assertions (expect(locator).toHaveText()) instead of one-time checks
- [ ] Wait for API responses before asserting data-dependent content
- [ ] Wait for destination content instead of navigation events
- [ ] Set appropriate timeouts for slow operations
Isolation
- [ ] Each test creates its own test data
- [ ] Tests do not depend on execution order
- [ ] Use test.describe blocks for logical grouping, not for shared state
- [ ] Clean up test data in afterEach hooks
Network
- [ ] Mock external API calls when testing UI behavior
- [ ] Use real API calls only when testing integration
- [ ] Set appropriate timeouts for network-dependent operations
- [ ] Handle network errors gracefully in tests
Configuration
- [ ] Set explicit viewport size
- [ ] Use reducedMotion to disable animations
- [ ] Configure retries for CI (2 retries recommended)
- [ ] Enable trace capture on first retry
- [ ] Use JUnit reporter for test result tracking
CI/CD
- [ ] Limit parallel workers based on CI runner resources
- [ ] Use a consistent browser version (pin with Playwright's browser management)
- [ ] Start the application server as part of the test configuration
- [ ] Upload traces and screenshots as CI artifacts for debugging
Common Playwright Flakiness Patterns and Their Fixes
Here is a quick reference table of the most common patterns.
| Symptom | Root Cause | Fix |
|---------|-----------|-----|
| "Element not found" intermittently | Fragile selector | Use role/testid selectors |
| "Timeout waiting for element" | Content loads after assertion | Use retrying assertions |
| Test passes locally, fails in CI | Resource constraints | Limit workers, increase timeouts |
| Different results across browsers | Browser-specific rendering | Test per-browser, use cross-browser locators |
| Clicks have no effect | Element mid-animation | Disable animations with reducedMotion |
| "Navigation timeout" | SPA routing does not trigger page load | Wait for content, not navigation |
| Intermittent assertion failures on text | Dynamic content (timestamps, counters) | Mock dynamic data or use regex matchers |
| "Target closed" errors | Browser context closed prematurely | Check for uncaught errors causing page crashes |
| Screenshot mismatches | Font rendering differences | Use threshold in snapshot comparison |
| File download assertions fail | Download not complete | Use waitForEvent('download') |
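For the dynamic-content row, a regular expression can assert the stable shape of a string while ignoring the part that changes. The pattern and sample text below are made up for illustration; in a Playwright test a `RegExp` like this can be passed to `toHaveText` in place of an exact string.

```typescript
// Matches text such as "Last updated: 14:03:27" without pinning the exact time.
// The label and time format are hypothetical examples.
const lastUpdatedPattern = /^Last updated: ([01]\d|2[0-3]):[0-5]\d:[0-5]\d$/;

function matchesLastUpdated(text: string): boolean {
  return lastUpdatedPattern.test(text);
}
```

Anchoring the pattern with `^` and `$` keeps the check strict about everything except the time itself, so unrelated text changes are still caught.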
Conclusion
Flaky Playwright tests are not inevitable. By following the patterns and practices outlined in this guide -- using resilient selectors, proper waiting strategies, network mocking, and optimized configuration -- you can build a Playwright test suite that your team trusts and relies on.
Start by auditing your current tests against the reliability checklist. Fix the most impactful issues first: fragile selectors and missing waits account for the majority of Playwright flakiness. Then progressively adopt the advanced patterns like API state setup and Page Object Models with built-in waits.
Use tools like DeFlaky to track your progress. Measuring your test suite's reliability before and after applying these fixes proves the value of the investment and helps you identify remaining problem areas.
A reliable Playwright test suite is not just less annoying -- it is a competitive advantage. It means faster feedback, more confident deployments, and more time spent building features instead of investigating false failures.