RAG: low-complexity MVP

Long Context

Long context means placing a large amount of source material directly into the model's context window, rather than retrieving pieces of it on demand.

Decision

Use long context when the relevant material is small enough, stable enough, and worth sending directly.

Use when

  • Single-document analysis
  • Small research packets
  • Legal or policy review with bounded source material
  • Prototypes before building retrieval

Avoid when

  • Large document collections
  • Frequently repeated high-volume queries
  • Low-latency user flows
  • Corpora that need ranking and filtering

When long context is enough

Long context is the simplest approach when you can put the relevant source material directly into the prompt. It is especially useful for prototypes, internal tools, and single-user workflows where latency and token cost are acceptable.

It avoids indexing, chunking, embeddings, and retrieval infrastructure. That simplicity is valuable when you are still proving the product workflow.
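To make that simplicity concrete, here is a minimal sketch of the long-context approach, assuming a generic chat-completion API. The function name, prompt wording, and sample document are illustrative placeholders, not a specific library's interface.

```python
# Minimal long-context sketch: put the full source document into one
# prompt instead of building retrieval infrastructure. No index,
# chunking, embeddings, or vector store is involved.

def build_prompt(document: str, question: str) -> str:
    """Assemble a single prompt containing the entire source material."""
    return (
        "Answer the question using ONLY the document below.\n\n"
        f"<document>\n{document}\n</document>\n\n"
        f"Question: {question}\n"
        "If the document does not contain the answer, say so."
    )

doc = "Refund policy: purchases may be returned within 30 days of delivery."
prompt = build_prompt(doc, "How long is the return window?")
# `prompt` is then sent as-is to whatever chat-completion endpoint you use.
```

The entire "pipeline" is string assembly, which is why this approach is attractive while you are still proving the product workflow.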

Where it breaks

Long context can become expensive, slow, and noisy at scale. The model may have all the information available yet still miss the most relevant detail buried in a large prompt. For repeated queries over a growing corpus, retrieval usually gives better control over cost, latency, and relevance.
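The cost pressure is easy to see with back-of-envelope arithmetic. The corpus size, retrieved-chunk size, and query volume below are illustrative assumptions, not benchmarks:

```python
# Illustrative input-token comparison: resending a whole corpus on every
# query vs retrieving a few relevant chunks. All numbers are assumptions.

CORPUS_TOKENS = 200_000     # full corpus sent with every long-context query
RETRIEVED_TOKENS = 2_000    # top-k retrieved chunks sent with a RAG query
QUERIES_PER_MONTH = 10_000

long_context_total = QUERIES_PER_MONTH * CORPUS_TOKENS   # 2,000,000,000
rag_total = QUERIES_PER_MONTH * RETRIEVED_TOKENS          # 20,000,000

# Under these assumptions, long context sends 100x more input tokens
# per month before any per-token price is applied.
ratio = long_context_total // rag_total
```

The crossover point depends on your corpus size and query volume, but the shape of the curve is the point: long-context cost scales with corpus size times query count, while retrieval cost scales only with query count.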

Common mistakes

  1. Sending entire documents when only a section matters.
  2. Assuming more context always improves accuracy.
  3. Using long context to avoid building search when search is the product problem.

Next decision

Start with long context when the corpus is bounded and the query volume is low. Move toward RAG when you need ranking, filtering, citations, or repeatable retrieval quality over a growing corpus.