This article describes the criteria that distinguish good Generative AI use cases from poor ones, the evaluation framework that operationalises the criteria, and the patterns of selection failure that recur across organisations.
Why Selection Matters
Three factors make use case selection particularly consequential for Generative AI.
First, opportunity cost. Generative AI talent, infrastructure, and management attention are scarce. A use case that consumes these resources without delivering proportional value crowds out better alternatives. Ongoing research by MIT Sloan Management Review and Boston Consulting Group at https://sloanreview.mit.edu/big-ideas/artificial-intelligence-business-strategy/ has documented wide variance in Generative AI program returns, driven primarily by use case selection quality.
Second, failure visibility. Generative AI failures are often visible to customers, employees, regulators, and the press in ways that other AI failures are not. A poorly chosen use case that fails publicly damages the credibility of the organisation’s broader AI program.
Third, technology fit variability. Unlike most enterprise software, Generative AI has a non-uniform suitability map. Some tasks it does brilliantly; others it does poorly; the difference is often not predictable from the surface description of the task.
Selection Criteria
A defensible selection process applies multiple criteria.
Generative-Suitable Task Profile
The task should genuinely benefit from generative capabilities. Tasks that involve drafting unstructured content, transforming between structured and unstructured formats, summarising or extracting from text, generating code, or supporting creative work are well suited. Tasks that require precise numerical computation, deterministic execution, or hard real-time response are usually poorly suited.
A useful diagnostic: would a competent human do this task primarily by drafting and revising text? If yes, Generative AI is likely a candidate. If the task is primarily looking up facts, performing calculations, or executing rules, conventional approaches are usually better.
Tolerance for Probabilistic Output
The use case must tolerate non-deterministic, sometimes imperfect output. A 95-percent-correct first draft that a human reviews is often valuable; a 95-percent-correct payment authorisation is unacceptable.
Verification Pathway
The use case must include a path to verify the AI’s output before consequential action. Verification can be human review, automated checking against ground truth, or downstream consequence reversibility. Use cases without a verification path are usually poor candidates.
Sufficient Volume to Justify Investment
The use case must have sufficient volume to justify the investment in design, deployment, and governance. A low-volume, high-touch use case that produces one-off output may be better served by direct human work assisted by general-purpose Generative AI tools.
Clear Value Hypothesis
The expected value (revenue, cost reduction, customer experience improvement, risk reduction) should be quantifiable in advance. “Improves productivity” is unfalsifiable; “reduces median document drafting time by 40 percent for legal team document templates” is testable.
Acceptable Risk Profile
The use case’s worst-plausible-failure mode should be acceptable to the organisation. A use case where hallucination produces incorrect customer-facing claims, biased decisions, or regulatory exposure may be unacceptable regardless of expected value.
Organisational Readiness
The using organisation should have the capability to integrate the AI: technical infrastructure, change management capacity, governance capability, and the operational maturity to maintain the system over time.
The U.S. National Institute of Standards and Technology AI RMF Generative AI Profile at https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook/GenAI_Profile articulates the risk dimensions that should inform the risk profile criterion.
The Evaluation Framework
A workable evaluation framework operationalises the criteria into a scoring rubric.
A common structure scores each candidate use case across six dimensions on a 1-5 scale:
- Generative suitability (does the task fit the technology?)
- Value (how much business value if it works?)
- Implementation feasibility (can we actually build this?)
- Risk (what is the worst plausible failure?)
- Strategic alignment (does this advance organisational priorities?)
- Organisational readiness (can we operate this once built?)
Each dimension gets a score and supporting evidence. Composite scores rank candidates; the score is input to discussion, not a substitute for it.
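The rubric above can be sketched in code. This is an illustrative implementation, not part of the methodology itself: the dimension keys, equal default weights, and the convention that risk is scored so that 5 means lowest risk are all assumptions made for the example.

```python
# Hypothetical sketch of a six-dimension, 1-5 scoring rubric. Composite
# scores rank candidates for discussion; they are not a decision by themselves.
from dataclasses import dataclass

DIMENSIONS = [
    "generative_suitability",
    "value",
    "feasibility",
    "risk",                 # convention here: 5 = lowest risk
    "strategic_alignment",
    "readiness",
]

@dataclass
class UseCase:
    name: str
    scores: dict  # dimension -> score in 1..5

    def composite(self, weights=None):
        weights = weights or {d: 1.0 for d in DIMENSIONS}
        for d in DIMENSIONS:
            if not 1 <= self.scores[d] <= 5:
                raise ValueError(f"{d} must be scored 1-5")
        total = sum(weights[d] * self.scores[d] for d in DIMENSIONS)
        return total / sum(weights.values())

# Illustrative candidates and scores.
candidates = [
    UseCase("knowledge search", dict(zip(DIMENSIONS, [5, 4, 4, 4, 4, 3]))),
    UseCase("payment authorisation", dict(zip(DIMENSIONS, [2, 4, 2, 1, 3, 2]))),
]
ranked = sorted(candidates, key=lambda u: u.composite(), reverse=True)
for u in ranked:
    print(f"{u.name}: {u.composite():.2f}")
```

Weights can be adjusted per organisation; the point is that scores and weights are explicit and auditable, so the committee debates evidence rather than totals.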
The intake process described in Module 1.25 captures the data needed for this evaluation. The selection process is what consumes the data.
High-Yield Use Case Categories
Several categories have produced consistent value across organisations.
Internal Knowledge Search and Retrieval
Generative AI grounded in retrieval over internal documents, policies, and knowledge bases. Replaces searching across multiple systems with a conversational query interface. Risk is moderate (incorrect answers possible) but mitigable through retrieval grounding.
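The grounding mechanism can be sketched minimally: passages are retrieved deterministically and only those passages feed the answer, which constrains hallucination. The naive keyword-overlap retrieval and the direct citation of the passage (standing in for a model call) are both illustrative assumptions.

```python
# Minimal sketch of retrieval grounding over an internal knowledge base.
DOCS = {
    "hr-policy": "Employees accrue 25 days of annual leave per year.",
    "it-policy": "Laptops must be encrypted with full-disk encryption.",
}

def retrieve(query: str, k: int = 1):
    """Rank documents by naive keyword overlap with the query."""
    def overlap(text: str) -> int:
        return len(set(query.lower().split()) & set(text.lower().split()))
    ranked = sorted(DOCS.items(), key=lambda kv: overlap(kv[1]), reverse=True)
    return ranked[:k]

def grounded_answer(query: str) -> str:
    doc_id, passage = retrieve(query)[0]
    # A real system would pass `passage` to the model as context;
    # here we cite the retrieved source directly.
    return f"[{doc_id}] {passage}"

print(grounded_answer("How many days of annual leave do employees get?"))
```

Production systems replace keyword overlap with embedding-based retrieval, but the shape is the same: the answer is tied to a citable source.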
Document Drafting Assistance
Generative AI for first drafts of common document types: emails, reports, summaries, presentations. The human reviewer retains authority; the AI accelerates the drafting cycle.
Code Assistance
Code generation, completion, review, and explanation. Discussed extensively in Module 1.30.
Content Summarisation
Summarising long documents, meeting transcripts, customer interactions, or research outputs into structured briefs. Strong value-to-risk ratio when human review is part of the workflow.
Translation and Localisation
Document and content translation, often with human review. Quality has improved substantially with modern Generative AI; cost has dropped materially.
Customer Service Tier 1
Initial customer interactions with clear escalation paths to human agents (per Module 1.29).
Data Extraction from Unstructured Sources
Pulling structured data from unstructured documents (contracts, invoices, applications). Often combines Generative AI with traditional document processing.
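One reason this category performs well is that the combination is easy to gate: the model drafts structured fields, and deterministic validation decides whether the result enters downstream systems or a human review queue. The sketch below stubs the model call; the field names and validation rules are assumptions for illustration.

```python
# Illustrative extraction pattern: model drafts JSON, deterministic
# validation gates it before consequential use.
import json

REQUIRED_FIELDS = {"vendor", "invoice_number", "total"}

def fake_model_extract(document_text: str) -> str:
    # Stand-in for a real model call that returns JSON.
    return '{"vendor": "Acme Ltd", "invoice_number": "INV-0042", "total": "1250.00"}'

def extract_invoice(document_text: str):
    raw = fake_model_extract(document_text)
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None, "unparseable output - route to human review"
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        return None, f"missing fields {sorted(missing)} - route to human review"
    try:
        record["total"] = float(record["total"])  # deterministic type check
    except (TypeError, ValueError):
        return None, "non-numeric total - route to human review"
    return record, "ok"

record, status = extract_invoice("Invoice from Acme Ltd ... total 1,250.00")
print(status, record)
```

The validation layer is the verification pathway the selection criteria call for: failures are routed to humans rather than silently propagated.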
Lower-Yield Use Case Categories
Several categories have consistently underperformed expectations.
Replacing Expert Judgement in High-Stakes Decisions
Legal opinions, medical diagnoses, financial advice, hiring decisions. Generative AI can support, but the failure modes when it replaces expert judgement are severe.
Numerical Analysis and Calculation
Generative AI is unreliable for arithmetic, statistical reasoning, and quantitative analysis. Tasks framed as “ask the AI to calculate” usually fail; tasks that have the AI generate code or queries that perform the calculation can succeed.
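The successful variant of this pattern can be sketched as follows: the model translates a question into a query, and a deterministic engine performs the arithmetic. The model output is stubbed here, and the table and column names are assumptions for the example.

```python
# Sketch: model generates the query; the database does the calculation.
import sqlite3

def fake_model_to_sql(question: str) -> str:
    # Stand-in for a model translating a question into SQL.
    return "SELECT AVG(amount) FROM orders WHERE region = 'EU'"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("EU", 100.0), ("EU", 300.0), ("US", 50.0)])

sql = fake_model_to_sql("What is the average EU order value?")
# Guard against model output that would modify data.
assert sql.lstrip().upper().startswith("SELECT"), "reject non-read-only queries"
(result,) = conn.execute(sql).fetchone()
print(result)  # arithmetic is done by the database, not the model
```

The model's failure mode shifts from a wrong number (hard to detect) to a wrong query (inspectable, testable, and guarded before execution).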
Real-Time, Mission-Critical Decision-Making
Latency, reliability, and predictability requirements that Generative AI cannot meet.
Tasks With No Verification Pathway
If the AI’s output cannot be verified before consequential action, the use case is too risky for current technology.
Pure Customer-Facing Personality
Use cases positioned as “AI as company spokesperson” tend to produce embarrassing failures that overshadow any benefit.
Operational Selection Practices
Cross-Functional Selection Committee
A selection committee that combines business, technical, ethical, legal, and operational perspectives. Selection that is purely technical or purely business misses dimensions that matter.
Stage-Gate Review
Selection is not one decision; it is a series: initial concept approval funds discovery, discovery funds a pilot, the pilot funds production, and production deployments are reviewed at scale. Each stage gate re-evaluates against the original criteria with updated evidence.
Portfolio Balance
The portfolio should balance: high-value high-risk against quick-win low-risk; novel capability against proven pattern; centralised platform against distributed business unit experimentation.
Sunset Discipline
Use cases that fail to deliver expected value should be sunset, not allowed to continue indefinitely. The Stanford AI Index annual report at https://hai.stanford.edu/ai-index documents the high abandonment rate of Generative AI projects; a healthy portfolio acknowledges this and sunsets gracefully rather than letting projects linger.
Common Failure Modes
The first is technology-driven selection — picking use cases because Generative AI can address them, not because they matter. Counter with a mandatory value hypothesis and prioritisation against other portfolio investments.
The second is executive-mandated selection — a senior executive demands a specific Generative AI use case regardless of suitability. Counter with selection committee discipline that creates space for honest evaluation.
The third is trend-following selection — adopting use cases because peer organisations are doing them, without local fit analysis. Counter with explicit local context evaluation.
The fourth is under-evaluation of organisational readiness — picking use cases the organisation cannot actually operate after deployment. Counter with explicit readiness assessment.
Looking Forward
The next article in Module 2.21 turns to retrieval-augmented generation architecture — the dominant pattern for grounding Generative AI in organisational data. Once a use case is selected, the architecture choice is the next high-leverage decision.
© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.