The Deliverable Test: How to Evaluate Any AI Tool
Bottley on the deliverable test: the simplest AI tool evaluation method. No benchmarks. Five outputs, 20 minutes. AI Made Effortless, May 2026.
The deliverable test is the simplest and most reliable method for evaluating an AI tool. Here is how it works.
Step one: define a specific deliverable. Not "help me write better," but a concrete output with measurable specs: length, tone, topic, format.
Step two: use the tool to produce it in one shot.
Step three: evaluate the output against the specs.
Step four: time how long it takes to edit the output to publishable quality.
Step five: repeat across three to five similar deliverables.
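If you want to keep the results in one place, the five steps can be logged with a few lines of Python. This is only a rough sketch: the Spec and Trial classes and their field names are made up for illustration, and the only spec it checks automatically is length. Tone, topic, and format still need a human read.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Spec:
    """Step one: measurable requirements for one deliverable (hypothetical fields)."""
    topic: str
    tone: str
    fmt: str
    min_words: int
    max_words: int


@dataclass
class Trial:
    """Steps two and four: the one-shot output and the hand-timed editing minutes."""
    spec: Spec
    output: str
    editing_minutes: float


def meets_length(trial: Trial) -> bool:
    """Step three, reduced to the one spec that is easy to check mechanically."""
    words = len(trial.output.split())
    return trial.spec.min_words <= words <= trial.spec.max_words


def summarize(trials: list[Trial]) -> dict:
    """Step five: aggregate across the deliverables you tested."""
    return {
        "deliverables": len(trials),
        "within_length": sum(meets_length(t) for t in trials),
        "mean_editing_minutes": mean([t.editing_minutes for t in trials]),
    }
```

Fill in one Trial per deliverable as you go, then call summarize on the list once you have three to five of them.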
Why Editing Time Is the Metric
Editing time captures the real cost of an AI tool. An output that needs 30 minutes of editing is worse for most purposes than one that needs 15, even if the first output's quality ceiling is technically higher. Feature comparisons miss this. The editing time test doesn't.
Claude Pro, on my most frequent deliverable types, requires an average of 8 minutes of editing to publishable quality. ChatGPT Plus on the same deliverables: 12 minutes. Specialized tools at higher cost: varied results. The editing time data, not the feature comparison, drove my primary recommendation.
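The comparison itself is just an average per tool, sorted from least to most editing time. Here is a minimal sketch of that arithmetic. The per-deliverable numbers are placeholders invented so the averages come out to the 8 and 12 minutes quoted above; they are not the actual measurements, and rank_by_editing_time is a hypothetical helper name.

```python
from statistics import mean


def rank_by_editing_time(log: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (tool, average editing minutes) pairs, lowest average first."""
    averages = {tool: mean(minutes) for tool, minutes in log.items()}
    return sorted(averages.items(), key=lambda item: item[1])


# Placeholder data: one editing time per deliverable, in minutes.
# Invented for illustration; replace with your own timings.
placeholder_log = {
    "ChatGPT Plus": [12, 11, 14, 10, 13],
    "Claude Pro": [7, 9, 8, 6, 10],
}

ranking = rank_by_editing_time(placeholder_log)
for tool, avg in ranking:
    print(f"{tool}: {avg:.1f} min average editing time")
```

At those averages the gap is 4 minutes per deliverable, which adds up to 20 minutes across a five-deliverable test.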
Apply This Before Any Subscription
Most AI tools offer a free trial or a one-month cancellation window. Use the deliverable test during the trial. Five deliverables, 20 minutes of evaluation. The result will tell you more than any benchmark or review. The test is free. The data is specific to your actual work. Use it.