14.2 UI option A: CLI tool
Goal: a usable CLI summarizer
The CLI option is the fastest path to a real app because it’s:
- easy to run locally,
- easy to test,
- easy to iterate without front-end complexity.
Your CLI is a “truth machine”: it lets you quickly validate prompts, schemas, and error handling.
Many teams prototype the core pipeline as a CLI, then wrap it in a web UI later; this keeps early loops fast and evidence-based.
CLI contract (inputs, outputs, exit codes)
Define a small CLI interface. A good v1 contract:
Inputs
- Article text: via `--file` or stdin.
- Optional title: `--title`.
- Output format: `--format json|md` (optional; start with JSON only if you want to keep scope tight).
Outputs
- stdout: the summary (JSON or formatted text).
- stderr: error messages and diagnostics (safe to show).
Exit codes
- 0: success
- 2: validation error (empty input, too long input)
- 3: model call failed (timeout/rate_limit/blocked/invalid_output)
- 1: unexpected/internal error
Distinct exit codes turn "it failed" into "it failed in category X," which is exactly what you need for reliable debugging and automation.
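The contract above can be sketched as a single entrypoint that maps error categories to exit codes. This is a minimal illustration, not the real implementation: `ValidationError`, `ModelCallError`, and the placeholder output are assumptions standing in for your app layer.

```python
# Sketch of the exit-code contract. ValidationError/ModelCallError are
# hypothetical names; the real summarize call would replace the placeholder.
import argparse
import sys


class ValidationError(Exception):
    """Empty or oversized input (exit code 2)."""


class ModelCallError(Exception):
    """Timeout, rate limit, blocked, or invalid model output (exit code 3)."""


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(prog="summarizer")
    parser.add_argument("--file", help="article file; omit to read stdin")
    parser.add_argument("--title", help="optional article title")
    args = parser.parse_args(argv)

    try:
        text = open(args.file).read() if args.file else sys.stdin.read()
        if not text.strip():
            raise ValidationError("empty input")
        # summary = summarize_article(text, title=args.title)  # real call goes here
        print('{"summary": "placeholder"}')  # stdout carries only the result
        return 0
    except ValidationError as e:
        print(f"validation error: {e}", file=sys.stderr)  # diagnostics on stderr
        return 2
    except ModelCallError as e:
        print(f"model call failed: {e}", file=sys.stderr)
        return 3
    except Exception as e:
        print(f"internal error: {e}", file=sys.stderr)
        return 1


if __name__ == "__main__":
    sys.exit(main())
```

Keeping results on stdout and diagnostics on stderr means the JSON output stays pipeable even when warnings are printed.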
Repo structure (recommended)
Keep the structure aligned with Section 13:
summarizer/
  README.md
  .gitignore
  .env.example
  src/
    summarizer/
      __init__.py
      __main__.py
      cli.py
      app.py            # domain layer: summarize_article(...)
      llm/
        client.py
      prompts/
        system.md
        summarize_v1.md
      schemas/
        summarize_v1.json
  tests/
    test_cli_smoke.py
    test_schema_validation.py
Even if you’re not using Python, the boundaries are what matter: CLI entrypoint, app logic, LLM adapter, prompts, schemas, tests.
Implementation plan (walking skeleton)
Build this in a sequence that keeps it runnable:
- SP1: CLI reads input and prints a placeholder JSON object (no model call yet).
- SP2: LLM adapter exists and is called; call failures are caught and reported cleanly.
- SP3: schema validation in place + one retry strategy for invalid output.
- SP4: timeouts, retries, and logging implemented.
This ensures you always have a runnable project between changes.
Start with a working pipeline. Then tighten prompts and schemas. You can’t optimize what you can’t run.
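The SP3 behavior (strict validation plus one retry on invalid output) can be sketched as a small adapter-side loop. Everything here is illustrative: `call_model` is a hypothetical injected callable, and `REQUIRED_FIELDS` stands in for whatever your v1 schema actually requires.

```python
# Hedged sketch of validate-then-retry. The real version would validate
# against schemas/summarize_v1.json rather than a hard-coded field set.
import json


class ModelCallError(Exception):
    """Raised when retries are exhausted (maps to exit code 3)."""


REQUIRED_FIELDS = {"summary", "key_points"}  # illustrative field set


def call_with_retry(call_model, prompt: str, *, retries: int = 1, timeout: float = 30.0):
    """call_model(prompt, timeout=...) -> raw string; a hypothetical adapter seam."""
    last_error = None
    for _attempt in range(retries + 1):
        try:
            raw = call_model(prompt, timeout=timeout)
            data = json.loads(raw)  # JSONDecodeError is a ValueError subclass
            missing = REQUIRED_FIELDS - data.keys()
            if missing:
                raise ValueError(f"missing fields: {sorted(missing)}")
            return data
        except (ValueError, TimeoutError) as e:
            last_error = e  # one retry for invalid output, per SP3
    raise ModelCallError(f"gave up after {retries + 1} attempts: {last_error}")
```

Because the model callable is injected, tests can exercise the retry path without any network access.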
Testing the CLI (without flakiness)
For reliable tests, separate:
- unit tests: schema validation, input validation, formatting
- integration tests: CLI wiring (but with a fake LLM client)
Use a fake LLM client in tests
Unit tests should not call the real model API. They should inject a fake client that returns a deterministic result. This gives you:
- fast tests,
- deterministic behavior,
- no costs,
- no rate limits.
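A fake client can be as small as a class returning canned JSON. This sketch assumes a hypothetical `summarize_article(text, client=...)` seam in `app.py`; the real function signature is up to you.

```python
# Illustrative fake-client injection; names are assumptions, not your real API.
import json


class FakeLLMClient:
    """Deterministic stand-in for the real adapter: no network, no cost."""

    def complete(self, prompt: str) -> str:
        return json.dumps({"summary": "A fixed summary.", "key_points": ["one"]})


def summarize_article(text: str, client) -> dict:
    # Stand-in for the app-layer function that would live in app.py.
    raw = client.complete(f"Summarize:\n{text}")
    return json.loads(raw)


def test_summarize_with_fake_client():
    result = summarize_article("Some article.", client=FakeLLMClient())
    assert result["summary"] == "A fixed summary."
    assert result["key_points"] == ["one"]
```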
Golden tests for schema shape
Keep one or two golden JSON examples that represent “correct shape.” Validate that your parser/validator enforces required fields and handles missing fields as errors.
Unit tests confirm that your app can handle both valid and invalid outputs. This is what makes LLM integration safe.
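A golden example plus a shape check might look like the sketch below. The field names are illustrative; your `summarize_v1.json` defines the real contract, and a JSON Schema validator could replace the hand-rolled check.

```python
# Golden-shape check with no external dependencies; fields are assumptions.
GOLDEN = {
    "summary": "Two-sentence overview of the article.",
    "key_points": ["first point", "second point"],
}


def validate_shape(data: dict) -> list[str]:
    """Return a list of problems; an empty list means the shape is valid."""
    problems = []
    if not isinstance(data.get("summary"), str) or not data.get("summary"):
        problems.append("summary: required non-empty string")
    points = data.get("key_points")
    if not isinstance(points, list) or not all(isinstance(p, str) for p in points):
        problems.append("key_points: required list of strings")
    return problems
```

Returning a list of problems (rather than raising on the first one) makes test failures easier to read.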
Prompt sequence to build it
This is a reliable sequence you can reuse in AI Studio:
Prompt A: plan + file tree (no code yet)
We are building Project 1: a CLI article summarizer with structured JSON output.
Constraints:
- Keep it small and runnable
- No secrets in prompts or logs
- Use a single LLM wrapper module
- Prompts and schemas live in files and are versioned
- Diff-only changes once code exists
Task:
1) Propose a file tree for the repo.
2) Propose the JSON schema for v1.
3) Propose acceptance criteria and ship points.
Stop after the plan (no code yet).
Prompt B: scaffold (stubs + README + schema file)
Now scaffold the repo as files.
Requirements:
- Include README with run/test commands
- Include schema file (summarize_v1.json)
- Include CLI stub that reads from --file or stdin
- Do NOT implement real model calls yet (placeholder ok)
Output:
- File tree then file contents
Prompt C: implement walking skeleton (diff-only)
Implement the walking skeleton:
- CLI reads input and returns JSON matching the schema
- Use a fake placeholder summary for now
- Add a minimal smoke test
Output: diff-only changes
Prompt D: add LLM adapter + validation
Add the LLM adapter and strict schema validation.
Constraints:
- All model calls go through the adapter module only
- Add timeouts and basic error categories
- Add tests using a fake LLM client (no real calls in tests)
Output:
1) Short plan
2) Diff-only changes
Ship points for the CLI path
- SP1: CLI runs; prints valid JSON for a sample input.
- SP2: schema validation implemented; invalid output returns exit code 3.
- SP3: timeouts/retries implemented; errors categorized.
- SP4: prompts versioned; logs include prompt version and request id.
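For SP4, one structured log line per model call is enough to make failures traceable. This is a sketch only; the field names are assumptions, and the key property is that no article text or secrets appear in the log.

```python
# Hedged sketch of the SP4 log line: prompt version + request id, no payloads.
import json
import logging
import uuid

logger = logging.getLogger("summarizer")


def log_model_call(prompt_version: str, status: str, latency_ms: int) -> str:
    request_id = str(uuid.uuid4())
    logger.info(json.dumps({
        "event": "model_call",
        "request_id": request_id,
        "prompt_version": prompt_version,  # e.g. "summarize_v1"
        "status": status,                  # success | timeout | invalid_output | ...
        "latency_ms": latency_ms,
    }))
    return request_id
```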