22.2 Building a "document Q&A" assistant prototype

Goal: a grounded Q&A prototype (fast)

You want a “document Q&A” assistant that answers questions about a set of docs. In prototype mode, your goal is:

prove that Q&A is useful for your users with a small, grounded loop—before building infrastructure.

This page focuses on prototyping in AI Studio with manual or lightweight retrieval. Part VIII goes deep on building a real RAG system.

Scope: what “prototype” means here

In prototype mode, you’re optimizing for learning:

  • Inputs: a small set of documents (or a few sections).
  • Retrieval: manual selection of relevant chunks, or a simple keyword search.
  • Outputs: answers with citations (chunk ids + quotes) and explicit uncertainty.
  • Evaluation: a small set of questions that matter.

What you are not doing yet:

  • building an embeddings store,
  • handling massive corpora,
  • solving multi-user access control,
  • building production-grade observability.

Prototype loop in AI Studio

  1. Pick a narrow doc set: one policy, one manual, one API doc—whatever matters.
  2. Create a chunk index: split into 10–30 chunks and label them with stable ids.
  3. Answer questions using only provided chunks: paste a few relevant chunks and ask a question.
  4. Demand citations: require chunk ids and short quotes.
  5. Evaluate: run your small eval questions and record failures.
  6. Iterate: fix chunking, grounding rules, and output schema.

This loop gives you signal on whether you should invest in retrieval infrastructure.
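
If you want to script steps 2 and 3 instead of splitting by hand, here is a minimal Python sketch, assuming paragraph-based chunking and a naive keyword-overlap score; both are placeholders you can swap out.

```python
import re


def build_chunk_index(doc_text: str, doc_name: str, max_chars: int = 1200) -> list[dict]:
    """Split a document into paragraphs and pack them into chunks with stable ids."""
    paragraphs = [p.strip() for p in doc_text.split("\n\n") if p.strip()]
    chunks, buffer = [], ""
    for para in paragraphs:
        if buffer and len(buffer) + len(para) > max_chars:
            chunks.append(buffer)
            buffer = ""
        buffer = f"{buffer}\n\n{para}".strip()
    if buffer:
        chunks.append(buffer)
    return [
        {"chunk_id": f"{doc_name}-{i:03d}", "text": text}
        for i, text in enumerate(chunks, start=1)
    ]


def keyword_search(question: str, index: list[dict], top_k: int = 3) -> list[dict]:
    """Rank chunks by naive keyword overlap with the question."""
    terms = set(re.findall(r"\w+", question.lower()))

    def overlap(chunk: dict) -> int:
        return len(terms & set(re.findall(r"\w+", chunk["text"].lower())))

    return sorted(index, key=overlap, reverse=True)[:top_k]
```

With only 10–30 chunks, exact keyword overlap is usually enough to shortlist what you paste into the prompt; embeddings can wait for Part VIII.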

Grounding rules that prevent confident nonsense

Use rules that make hallucination expensive:

  • Answer only from sources: “If it’s not in the chunks, say ‘not found.’”
  • Quote evidence: require short quotes for each key claim.
  • Keep answers bounded: “max 7 bullets; no extra background.”
  • Ask clarifying questions: “If the question is ambiguous, ask one clarifying question before answering.”
  • Separate answer from speculation: in prototype mode, disallow speculation entirely.

Do not allow the model to “fill in gaps”

Users will trust whatever the assistant returns. If the sources don’t contain the answer, the correct output is “not found” plus a pointer to what to search next.
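
You can also enforce this in code rather than trusting the prompt alone. A minimal sketch, assuming the model returns the JSON shape defined later in this section: if any quote is missing from its cited chunk, the whole answer is downgraded to NOT FOUND.

```python
def verify_citations(response: dict, provided_chunks: list[dict]) -> dict:
    """Downgrade to NOT FOUND if any quoted evidence is absent from its cited chunk."""
    chunk_text = {c["chunk_id"]: c["text"] for c in provided_chunks}
    for citation in response.get("citations", []):
        source = chunk_text.get(citation.get("chunk_id"), "")
        if not citation.get("quote") or citation["quote"] not in source:
            return {
                "answer": "NOT FOUND",
                "citations": [],
                "not_found": True,
                "follow_up_question": response.get("follow_up_question"),
            }
    return response
```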

Export path: from prototype to codebase

Once the prototype is useful:

  1. Save your prompt as a file: treat it as versioned configuration.
  2. Define a schema: structured answer + citations + “not found” behavior.
  3. Build a tiny retrieval layer: start with keyword search, then embeddings.
  4. Wire in evaluation: keep your question set and run it in CI.

This path is the bridge into Part VIII (RAG engineering).
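
For step 4, here is a sketch of a tiny eval harness. `answer_question`, the question-file format, and the example chunk id are assumptions standing in for whatever wraps your prompt and model call; the point is that the question set lives in the repo and CI fails on regressions.

```python
import json


def run_evals(answer_question, questions_path: str = "eval_questions.json") -> list[str]:
    """Run the small question set and collect human-readable failures.

    answer_question(question) is assumed to return the JSON shape from the
    prompt below: answer, citations, not_found, follow_up_question.
    """
    with open(questions_path) as f:
        # e.g. [{"question": "...", "must_cite": ["policy-003"], "expect_not_found": false}, ...]
        cases = json.load(f)

    failures = []
    for case in cases:
        result = answer_question(case["question"])
        if case.get("expect_not_found") and not result["not_found"]:
            failures.append(f"{case['question']!r}: expected NOT FOUND")
        cited = {c["chunk_id"] for c in result["citations"]}
        missing = set(case.get("must_cite", [])) - cited
        if missing:
            failures.append(f"{case['question']!r}: missing citations {sorted(missing)}")
    return failures


# In CI, fail the build on any regression, e.g.:
#   sys.exit(1 if run_evals(answer_question) else 0)
```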

Copy-paste prompts

Prompt: grounded Q&A with citations

You are a document Q&A assistant. You MUST follow these rules:
- Use ONLY the provided document chunks as sources.
- If the answer is not in the chunks, say "NOT FOUND" and list which chunk ids you searched.
- For every key claim, include a citation with chunk_id and a short quote.
- Do not add extra background.

Chunks:
[chunk_id: ...]
```text
...
```

Question: [user question]

Return JSON:
{
  "answer": string,
  "citations": [{ "chunk_id": string, "quote": string }],
  "not_found": boolean,
  "follow_up_question": string|null
}
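
Model output is text, so parse it defensively before trusting the fields. A minimal sketch, assuming you fall back to NOT FOUND whenever the reply is not valid JSON or is missing a key; the fallback policy is a choice, not a requirement.

```python
import json

NOT_FOUND = {
    "answer": "NOT FOUND",
    "citations": [],
    "not_found": True,
    "follow_up_question": None,
}
REQUIRED_KEYS = {"answer", "citations", "not_found", "follow_up_question"}


def parse_response(raw_text: str) -> dict:
    """Parse the model's reply into the schema above, falling back to NOT FOUND."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return dict(NOT_FOUND)
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return dict(NOT_FOUND)
    if not isinstance(data["citations"], list):
        return dict(NOT_FOUND)
    return data
```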

Ship points

  • Ship point 1: answers are consistently grounded (citations point to real text).
  • Ship point 2: “not found” behavior is reliable and not annoying.
  • Ship point 3: you have an eval set and can detect regressions.

Where to go next