
RAG

Retrieval-Augmented Generation (RAG) retrieves relevant external knowledge and passes it to an LLM as context before the model answers.

Decision

Use RAG when the model needs private, frequently updated, source-backed, or domain-specific knowledge.

Use when

Knowledge base Q&A
Customer support over product docs
Internal document assistants
Source-backed answers with citations

Avoid when

Strict calculations
Transactional business workflows
Pure tone or style adaptation
Problems caused mainly by bad prompts

When RAG is the right first move

RAG is usually the first technical answer when the core problem is knowledge access. If the model does not know your private docs, product policies, changelog, support history, or domain corpus, retrieval gives it relevant context at answer time.
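
As a rough illustration of "relevant context at answer time", the sketch below ranks a tiny in-memory corpus against a question and builds a prompt from the best match. Everything here is an assumption for illustration: the DOCS corpus, the function names, and the bag-of-words cosine scoring, which stands in for whatever embedding or search backend a real system would use.

```python
import math
import re
from collections import Counter

# Hypothetical corpus: document id -> text.
DOCS = {
    "refunds": "A refund is available within 30 days of purchase.",
    "shipping": "Standard shipping takes 5 business days.",
    "changelog": "Version 2.1 added export to CSV.",
}

def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
    """Return the ids of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs,
        key=lambda doc_id: cosine(q, Counter(tokenize(docs[doc_id]))),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query: str, docs: dict[str, str], k: int = 2) -> str:
    """Put the retrieved passages, tagged with their ids, ahead of the question."""
    context = "\n".join(f"[{doc_id}] {docs[doc_id]}" for doc_id in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

prompt = build_prompt("What is the refund window after purchase?", DOCS, k=1)
```

The resulting prompt would then be sent to whichever model the system uses; that call is left out because it depends on the provider.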

It is also useful when answers need visible evidence. A support assistant, research assistant, or internal knowledge bot often needs to show where the answer came from, not just sound plausible.
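
One way to make evidence checkable is to verify that an answer only cites passages that were actually retrieved. The sketch below assumes a convention, not stated in this guide, that the model is asked to cite sources as bracketed ids such as [refund-policy]; the function names are hypothetical.

```python
import re

def cited_ids(answer: str) -> set[str]:
    """Extract bracketed source ids such as [refund-policy] from an answer."""
    return set(re.findall(r"\[([A-Za-z0-9_-]+)\]", answer))

def check_citations(answer: str, retrieved_ids: set[str]) -> dict:
    """Report whether the answer cites anything, and which cited ids were never retrieved."""
    cited = cited_ids(answer)
    return {
        "has_citation": bool(cited),
        "unknown": sorted(cited - retrieved_ids),
    }

report = check_citations(
    "A refund is available within 30 days [refund-policy].",
    {"refund-policy", "shipping"},
)
```

An answer with no citation, or with an id that was not in the retrieved set, can be flagged for review instead of being shown as source-backed.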

When RAG is the wrong fix

Do not reach for RAG when the real problem is workflow control, deterministic calculation, account permissions, or output formatting. Retrieval can give the model context, but it does not make the model a rules engine.
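
To illustrate the split between context and rules, a deterministic check can live in ordinary code while retrieval only supplies the policy text used to explain the result. The 30-day window and the function name below are hypothetical values chosen for the example.

```python
from datetime import date

# Hypothetical policy value; in practice this would come from configuration.
REFUND_WINDOW_DAYS = 30

def refund_eligible(purchased: date, today: date, window_days: int = REFUND_WINDOW_DAYS) -> bool:
    """Decide eligibility with plain date arithmetic, not with a model."""
    days_since_purchase = (today - purchased).days
    return 0 <= days_since_purchase <= window_days

eligible = refund_eligible(date(2024, 1, 1), date(2024, 1, 31))
```

The model can still phrase the reply and cite the retrieved policy, but the yes-or-no decision comes from code that gives the same result every time.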

RAG also will not automatically improve bad content. If the documents are outdated, duplicated, or vague, retrieval mostly carries those weaknesses straight into the answers.

Common mistakes

Treating vector search as the whole RAG system.
Chunking documents without testing answer quality.
Skipping citations and evaluation.
Adding more retrieved text when the prompt needs a clearer decision boundary.
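
A small evaluation loop helps with several of these mistakes, since it lets you compare chunking or retrieval settings on the same questions. The sketch below computes a hit rate at k over labeled cases. The toy retriever and the test cases are made up for illustration; a real run would plug in the actual retriever and a larger question set.

```python
from typing import Callable

def hit_rate_at_k(
    cases: list[tuple[str, str]],
    retrieve: Callable[[str, int], list[str]],
    k: int = 3,
) -> float:
    """Fraction of questions whose expected document id appears in the top k results."""
    if not cases:
        return 0.0
    hits = sum(1 for question, expected_id in cases if expected_id in retrieve(question, k))
    return hits / len(cases)

def toy_retrieve(question: str, k: int) -> list[str]:
    """Stand-in retriever with hard-coded rankings, for illustration only."""
    rankings = {
        "refund": ["refunds", "shipping"],
        "export": ["shipping", "changelog"],
    }
    for keyword, ids in rankings.items():
        if keyword in question.lower():
            return ids[:k]
    return []

CASES = [
    ("How do I get a refund?", "refunds"),
    ("Can I export my data?", "changelog"),
]

score_at_1 = hit_rate_at_k(CASES, toy_retrieve, k=1)
score_at_2 = hit_rate_at_k(CASES, toy_retrieve, k=2)
```

Hit rate only measures whether the right passage was retrieved. Answer quality and citation accuracy still need their own checks.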

Next decision

Compare RAG with fine-tuning when the question is whether to add knowledge or change behavior. Compare it with long context when the corpus is small enough to fit directly into the prompt.
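
For the long-context comparison, a quick size estimate is often enough to tell whether the whole corpus could go into the prompt. The sketch below assumes roughly four characters per token, which is only a rule of thumb for English text, and a hypothetical reserve for instructions and the answer. A real check should use the target model's tokenizer and documented context window.

```python
import math

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; chars_per_token is an assumed heuristic, not a tokenizer."""
    return math.ceil(len(text) / chars_per_token)

def fits_in_context(docs: list[str], context_window: int, reserved: int = 2000) -> bool:
    """True if all documents fit while leaving `reserved` tokens for instructions and output."""
    total = sum(estimate_tokens(doc) for doc in docs)
    return total <= context_window - reserved

small_corpus_fits = fits_in_context(["a" * 4000], context_window=8000)
large_corpus_fits = fits_in_context(["a" * 40000], context_window=8000)
```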