Stop guessing.
DeFlaky your tests.
Detect flaky tests in seconds with AI-powered root cause analysis. BYOK — bring your own AI key. Push results to the dashboard to track trends, get alerts, and fix reliability issues before they tank your pipeline.
$ npx deflaky-cli --command "npx playwright test" --runs 5
Running test suite... (5 iterations)
42 tests stable (100% pass rate)
3 tests FLAKY
login.spec.ts > "should redirect" — 3/5 passed
cart.spec.ts > "update quantity" — 4/5 passed
search.spec.ts > "show suggestions" — 2/5 passed
1 test consistently failing
FlakeScore: 93.5 / 100
$ _
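The sample output does not say how FlakeScore is calculated. One reading that reproduces the 93.5 shown above is the share of tests that are not flaky: 42 stable plus 1 consistently failing, out of 46 total. The sketch below works through that arithmetic. The formula and the function name `flakeScore` are assumptions for illustration, not DeFlaky's documented definition.

```typescript
// Assumed formula (not confirmed by DeFlaky): percentage of tests that are not flaky.
// A consistently failing test is deterministic, so it counts as non-flaky here.
function flakeScore(stable: number, flaky: number, failing: number): number {
  const total = stable + flaky + failing;
  if (total === 0) return 100; // an empty suite has nothing flaky
  const score = ((total - flaky) / total) * 100;
  return Math.round(score * 10) / 10; // keep one decimal place
}

// Counts taken from the sample terminal output: 42 stable, 3 flaky, 1 failing.
const sampleFlakeScore = flakeScore(42, 3, 1);
console.log(sampleFlakeScore); // 93.5
```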
See exactly what's flaky
DeFlaky generates a detailed report after every run — FlakeScore, per-test pass rates, and AI-powered root cause analysis.
DeFlaky Test Report
Run ID: d7f42a · Apr 8, 2026, 2:15 PM · Playwright
Summary counters: FlakeScore, Total Duration, Total Tests, Stable, Flaky
| Status | Test Name | Results | Pass Rate |
|---|---|---|---|
| flaky | should complete checkout flow | 3✓ 2✗ | 60% |
| flaky | should handle OAuth redirect | 4✓ 1✗ | 80% |
| flaky | should load dashboard charts | 4✓ 1✗ | 80% |
| passed | should render login page | 5✓ 0✗ | 100% |
| passed | should create new project | 5✓ 0✗ | 100% |
AI Root Cause Analysis
The checkout flow is flaky because of a race condition in payment confirmation: Stripe webhook response time varies from 200 ms to 3 s. Suggested fix: add an explicit wait with a 5 s timeout.
Works with every test framework
Everything you need to kill flaky tests
From detection to resolution, DeFlaky gives you full visibility into your test suite's reliability.
Instant Detection
Run your suite N times with a single command. DeFlaky identifies flaky tests by comparing results across runs — no config required.
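As a rough illustration of what comparing results across runs can mean, the sketch below labels a test stable if it passed every run, failing if it passed none, and flaky if the outcomes were mixed. The `summarize` function, the `TestSummary` shape, and the input format are hypothetical, not DeFlaky's internals.

```typescript
type Verdict = "stable" | "flaky" | "failing";

interface TestSummary {
  name: string;
  passed: number;
  runs: number;
  passRate: number; // whole-number percent
  verdict: Verdict;
}

// results maps a test name to its outcome in each run (true = passed).
function summarize(results: { [name: string]: boolean[] }): TestSummary[] {
  return Object.keys(results).map((name) => {
    const outcomes = results[name];
    const runs = outcomes.length;
    if (runs === 0) throw new Error(`no runs recorded for "${name}"`);
    const passed = outcomes.filter((ok) => ok).length;
    // Mixed outcomes across identical runs are the signature of a flaky test.
    const verdict: Verdict =
      passed === runs ? "stable" : passed === 0 ? "failing" : "flaky";
    const passRate = Math.round((passed / runs) * 100);
    return { name, passed, runs, passRate, verdict };
  });
}

const demoSummaries = summarize({
  "should redirect": [true, false, true, true, false],
  "update quantity": [true, true, true, false, true],
  "render login page": [true, true, true, true, true],
});
for (const s of demoSummaries) {
  console.log(`${s.name}: ${s.passed}/${s.runs} passed (${s.passRate}%) -> ${s.verdict}`);
}
```

More runs give a more trustworthy verdict: a test that fails once in twenty runs will often look stable after only five.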
FlakeScore Dashboard
Track your test suite's reliability score over time. See which tests are getting worse and which are improving.
Slack & Email Alerts
Get notified the moment a new flaky test appears. Stop finding out about flaky tests from angry developers.
Framework Agnostic
Works with any test runner that outputs JUnit XML or JSON reports. Playwright, Selenium, Cypress, Jest, Pytest — all supported.
CI/CD Ready
Add DeFlaky to your GitHub Actions, Jenkins, or GitLab CI pipeline. Fail builds when flakiness exceeds your threshold.
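A threshold gate of this kind can be sketched in a few lines. The example below is hypothetical: `checkFlakinessGate` and its parameters are illustrative names, and it does not rely on any specific DeFlaky flag or report format. It only shows the decision a CI step would make once it knows how many tests were flaky.

```typescript
interface GateResult {
  ok: boolean;
  exitCode: number; // 0 lets the build continue, 1 fails it
  message: string;
}

// Fails the gate when the share of flaky tests exceeds maxFlakyPercent.
function checkFlakinessGate(
  flakyTests: number,
  totalTests: number,
  maxFlakyPercent: number
): GateResult {
  const flakyPercent = totalTests === 0 ? 0 : (flakyTests / totalTests) * 100;
  const ok = flakyPercent <= maxFlakyPercent;
  const message =
    `${flakyPercent.toFixed(1)}% of tests are flaky ` +
    `(limit ${maxFlakyPercent}%): ${ok ? "pass" : "fail"}`;
  return { ok, exitCode: ok ? 0 : 1, message };
}

// 3 flaky tests out of 46, with an assumed limit of 5%.
const gate = checkFlakinessGate(3, 46, 5);
console.log(gate.message); // 6.5% of tests are flaky (limit 5%): fail
```

In a pipeline script, the returned `exitCode` would typically be passed to the process exit so the CI system marks the step as failed.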
Team Collaboration
Share flakiness reports with your team. Assign owners to flaky tests. Track who fixed what and when.
AI Root Cause Analysis
AI analyzes your stack traces and tells you why a test failed: an infrastructure issue, an app bug, a problem in the test code, or genuine flakiness.
AI Failure Categorization
Automatically classify every failure. No more manual triage. Works with any LLM provider you choose.
AI-Powered Analysis
DeFlaky uses AI to analyze your test failures, identify root causes, and suggest fixes — automatically.
Root Cause:
Test relies on Math.random() producing non-deterministic values. Each run generates a different number, causing intermittent assertion failures.
Suggested Fix:
jest.spyOn(Math, 'random')
.mockReturnValue(0.75);
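For suites that do not use Jest, the same idea can be sketched in plain TypeScript: pin `Math.random` to a fixed value while the code under test runs, then restore the original. `withFixedRandom` and `rollDiscount` are hypothetical names used only for this illustration.

```typescript
// Runs fn with Math.random pinned to a fixed value, then restores the original.
function withFixedRandom<T>(value: number, fn: () => T): T {
  const original = Math.random;
  Math.random = () => value;
  try {
    return fn();
  } finally {
    Math.random = original; // always restore, even if fn throws
  }
}

// Hypothetical code under test: picks a discount tier from a random draw.
function rollDiscount(): number {
  return Math.floor(Math.random() * 4) * 5; // 0, 5, 10, or 15
}

// With the draw pinned to 0.75 the result is the same on every run.
const pinnedDiscount = withFixedRandom(0.75, rollDiscount);
console.log(pinnedDiscount); // 15
```

Restoring in `finally` matters: a stub that leaks into later tests can create new flakiness of its own.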
Bring Your Own Key
Use any LLM provider you prefer. Your API key never leaves your browser. We never store or log it.
Anthropic (Claude)
Best for code analysis
OpenAI (GPT-4o)
Great all-around
Groq
Ultra-fast, free tier
OpenRouter
100+ models, one API
Ollama
Run locally, 100% private
Three steps to reliable tests
Install the CLI
$ npm install -g deflaky-cli
Run detection
$ deflaky --command "npx playwright test" --runs 5
Push to dashboard
$ deflaky --command "npx playwright test" --runs 5 --push --token YOUR_TOKEN
Simple, transparent pricing
100% free during launch. The CLI is open source forever.
CLI
- ✓ Unlimited local runs
- ✓ Terminal reports
- ✓ JUnit XML & JSON support
- ✓ All frameworks supported
- ✓ Open source (MIT)
Dashboard
- ✓ Everything in CLI +
- ✓ Unlimited projects
- ✓ 90-day history
- ✓ FlakeScore trends
- ✓ Email & Slack alerts
Pro
- ✓ Everything in Dashboard +
- ✓ AI Root Cause Analysis
- ✓ AI Failure Categorization
- ✓ Unlimited history
- ✓ Priority support
Your tests aren't flaky.
They're unmonitored.
Join QA engineers who ship with confidence.