RAG: low-complexity MVP

Long Context

Long context means placing a large amount of source material directly into the model's context window, rather than retrieving pieces of it on demand.

Decision

Use long context when the relevant material is small enough, stable enough, and worth sending directly.

Use when

  • Single-document analysis
  • Small research packets
  • Legal or policy review with bounded source material
  • Prototypes before building retrieval

Avoid when

  • Large document collections
  • Frequently repeated high-volume queries
  • Low-latency user flows
  • Corpora that need ranking and filtering

When long context is enough

Long context is the simplest approach when you can put the relevant source material directly into the prompt. It is especially useful for prototypes, internal tools, and single-user workflows where latency and token cost are acceptable.

It avoids indexing, chunking, embeddings, and retrieval infrastructure. That simplicity is valuable when you are still proving the product workflow.
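To make that simplicity concrete, here is a minimal sketch of the long-context approach, assuming a generic chat-completion API. The function name, prompt wording, and sample document are illustrative placeholders, not a specific library's interface.

```python
# Minimal long-context sketch: put the full source document into one
# prompt instead of building retrieval infrastructure. No index,
# chunking, embeddings, or vector store is involved.

def build_prompt(document: str, question: str) -> str:
    """Assemble a single prompt containing the entire source material."""
    return (
        "Answer the question using ONLY the document below.\n\n"
        f"<document>\n{document}\n</document>\n\n"
        f"Question: {question}\n"
        "If the document does not contain the answer, say so."
    )

doc = "Refund policy: purchases may be returned within 30 days of delivery."
prompt = build_prompt(doc, "How long is the return window?")
# `prompt` is then sent as-is to whatever chat-completion endpoint you use.
```

The entire "pipeline" is string assembly, which is why this approach is attractive while you are still proving the product workflow.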

Where it breaks

Long context can become expensive, slow, and noisy at scale. The model may have all the information available yet still miss the most relevant detail buried in a large prompt. For repeated queries over a growing corpus, retrieval usually gives better control over cost, latency, and relevance.
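The cost pressure is easy to see with back-of-envelope arithmetic. The corpus size, retrieved-chunk size, and query volume below are illustrative assumptions, not benchmarks:

```python
# Illustrative input-token comparison: resending a whole corpus on every
# query vs retrieving a few relevant chunks. All numbers are assumptions.

CORPUS_TOKENS = 200_000     # full corpus sent with every long-context query
RETRIEVED_TOKENS = 2_000    # top-k retrieved chunks sent with a RAG query
QUERIES_PER_MONTH = 10_000

long_context_total = QUERIES_PER_MONTH * CORPUS_TOKENS   # 2,000,000,000
rag_total = QUERIES_PER_MONTH * RETRIEVED_TOKENS          # 20,000,000

# Under these assumptions, long context sends 100x more input tokens
# per month before any per-token price is applied.
ratio = long_context_total // rag_total
```

The crossover point depends on your corpus size and query volume, but the shape of the curve is the point: long-context cost scales with corpus size times query count, while retrieval cost scales only with query count.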

Common mistakes

  1. Sending entire documents when only a section matters.
  2. Assuming more context always improves accuracy.
  3. Using long context to avoid building search when search is the product problem.

Next decision

Start with long context when the corpus is bounded and the query volume is low. Move toward RAG when you need ranking, filtering, citations, or repeatable retrieval quality over a growing corpus.