RAG Systems · Sciveflow

Playbook

RAG Systems

Retrieval that stays reliable as your docs change.

A working RAG system is more than embeddings. Treat ingestion, retrieval, and answer assembly as separate, testable layers.

Ingestion + chunking

Preserve structure and provenance.

Normalize sources (PDF, DOCX, HTML) into a stable schema
Chunk by headings or semantic boundaries, not fixed length
Store metadata for ownership, access, and timestamps
Version documents so you can roll back or compare
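
A minimal sketch of heading-based chunking with provenance, assuming sources have already been normalized to text with Markdown-style `#` headings. The `Chunk` fields, the `chunk_by_headings` name, and the metadata keys are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, field
import re


@dataclass
class Chunk:
    doc_id: str                    # stable identifier of the source document
    version: int                   # document version, for rollback and comparison
    heading_path: tuple[str, ...]  # e.g. ("Guide", "Setup")
    text: str
    metadata: dict = field(default_factory=dict)  # owner, access, timestamps


HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_headings(text, doc_id, version, metadata=None):
    """Split normalized text at headings, keeping each chunk's heading path."""
    chunks = []
    path = []  # stack of (level, title) for the headings currently open
    buf = []   # body lines collected since the last heading

    def flush():
        body = "\n".join(buf).strip()
        if body:
            titles = tuple(title for _, title in path)
            chunks.append(Chunk(doc_id, version, titles, body, dict(metadata or {})))
        buf.clear()

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Close headings at the same or a deeper level before opening this one.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            buf.append(line)
    flush()
    return chunks


doc = "# Guide\nIntro text.\n## Setup\nInstall it.\n## Usage\nRun it.\n"
for chunk in chunk_by_headings(doc, "doc-1", 3, {"owner": "docs-team"}):
    print(chunk.heading_path, "->", chunk.text)
```

Carrying the heading path and version on every chunk is what lets later layers cite a specific section and compare or roll back versions.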

Retrieval strategy

Get the right context before generation.

Hybrid search (BM25 + vector) usually beats pure embedding search, especially on exact terms such as IDs, error codes, and product names
Use metadata filters and access control in retrieval
Rerank top results with a lightweight model
Cache frequent queries, and expire entries after a freshness window so answers track document changes
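
One common way to combine BM25 and vector results is reciprocal rank fusion (RRF); the playbook does not name a fusion method, so treat this as one option under stated assumptions. The sketch assumes each search backend already returns a ranked list of document IDs, and that `acl` maps each document to the groups allowed to read it. All function and parameter names here are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a deterministic order.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hybrid_retrieve(bm25_ranking, vector_ranking, acl, user_groups, top_n=3):
    """Apply access control first, then fuse the two rankings."""

    def allowed(doc_id):
        # A document with no ACL entry is treated as not readable.
        return bool(acl.get(doc_id, set()) & user_groups)

    # Filtering before fusion keeps restricted documents from affecting ranks.
    # In a real system this filter would ideally run inside the index query.
    filtered = [
        [d for d in ranking if allowed(d)]
        for ranking in (bm25_ranking, vector_ranking)
    ]
    return reciprocal_rank_fusion(filtered)[:top_n]


acl = {"a": {"eng"}, "b": {"eng"}, "c": {"hr"}, "d": {"eng"}}
print(hybrid_retrieve(["a", "b", "c"], ["b", "d", "a"], acl, {"eng"}))
```

The fused list is what a lightweight reranker would then reorder; RRF only needs ranks, so the two backends do not need comparable scores.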

Answer assembly

Citations are not optional in production.

Prompt with explicit citation requirements
Refuse when sources are missing or retrieval confidence is low
Use a strict answer schema to avoid drift
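
A sketch of how citation enforcement, refusal, and a strict schema might fit together after generation. It assumes the model is prompted to return JSON with `answer` and `citations` keys and that retrieval yields a relevance score per chunk ID; the key names, the `min_score` threshold, and the refusal reasons are assumptions for illustration.

```python
import json


def refuse(reason):
    """Build a fresh refusal record in the same shape as a normal answer."""
    return {"answer": None, "citations": [], "refused": True, "reason": reason}


def assemble(raw_model_output, retrieved, min_score=0.5):
    """Validate model output against the schema and the retrieved sources.

    retrieved: dict mapping chunk ID -> relevance score.
    """
    # Only chunks at or above the threshold count as usable sources.
    usable = {cid for cid, score in retrieved.items() if score >= min_score}
    if not usable:
        return refuse("no_sources")

    try:
        data = json.loads(raw_model_output)
    except json.JSONDecodeError:
        return refuse("invalid_json")
    if not isinstance(data, dict):
        return refuse("invalid_schema")

    answer = data.get("answer")
    citations = data.get("citations")
    if not isinstance(answer, str) or not answer.strip():
        return refuse("invalid_schema")
    if not isinstance(citations, list) or not citations:
        return refuse("missing_citations")
    # Every citation must point at a chunk that was actually retrieved.
    if not all(isinstance(c, str) and c in usable for c in citations):
        return refuse("unknown_citation")

    return {"answer": answer, "citations": citations, "refused": False}


retrieved = {"c1": 0.9, "c2": 0.2}
print(assemble('{"answer": "Use v2.", "citations": ["c1"]}', retrieved))
print(assemble('{"answer": "Use v2.", "citations": ["c2"]}', retrieved))
```

Checking citations against the retrieved set in code, rather than trusting the prompt alone, catches answers that cite sources the model never saw.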

Failure modes

Stale or missing docs leading to hallucinations
Overfetching irrelevant context
Conflicting sources without disambiguation
No visibility into retrieval quality
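
Several of these failure modes can be flagged before generation with a cheap check on the retrieved context. The sketch below assumes each chunk is a dict with `doc_id`, `version`, and a timezone-aware `updated_at`; the function name, the 180-day age limit, and the chunk cap are hypothetical defaults to tune per corpus.

```python
from datetime import datetime, timedelta, timezone


def audit_context(chunks, now, max_age=timedelta(days=180), max_chunks=8):
    """Return human-readable warnings about the retrieved context."""
    warnings = []
    if not chunks:
        warnings.append("missing: no chunks retrieved")
    if len(chunks) > max_chunks:
        warnings.append(f"overfetch: {len(chunks)} chunks exceeds cap of {max_chunks}")

    versions = {}  # doc_id -> set of versions present in the context
    for chunk in chunks:
        if now - chunk["updated_at"] > max_age:
            warnings.append(f"stale: {chunk['doc_id']} v{chunk['version']}")
        versions.setdefault(chunk["doc_id"], set()).add(chunk["version"])

    # Two versions of the same document in one context may contradict each other.
    for doc_id, seen in sorted(versions.items()):
        if len(seen) > 1:
            warnings.append(f"conflict: {doc_id} appears in versions {sorted(seen)}")
    return warnings


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
context = [
    {"doc_id": "policy", "version": 1, "updated_at": datetime(2023, 1, 1, tzinfo=timezone.utc)},
    {"doc_id": "policy", "version": 2, "updated_at": datetime(2024, 5, 1, tzinfo=timezone.utc)},
]
for warning in audit_context(context, now):
    print(warning)
```

Logging these warnings per query is one low-cost way to get visibility into retrieval quality before building a full dashboard.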

Checklist

Test set that covers top queries and edge cases
Retrieval quality dashboard
Citation enforcement in prompts
Access filters at retrieval time
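
The first two checklist items could start from something like the sketch below, which scores a retriever against a labeled test set using recall@k and mean reciprocal rank. The test-set shape (`query`, `relevant`) and the `retrieve` callable are assumptions; substitute your own retriever and labels.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(test_set, retrieve, k=5):
    """Average recall@k and reciprocal rank over all test queries."""
    if not test_set:
        return {"recall_at_k": 0.0, "mrr": 0.0}
    recalls, ranks = [], []
    for case in test_set:
        results = retrieve(case["query"])
        recalls.append(recall_at_k(results, case["relevant"], k))
        ranks.append(reciprocal_rank(results, case["relevant"]))
    n = len(test_set)
    return {"recall_at_k": sum(recalls) / n, "mrr": sum(ranks) / n}


# A stand-in retriever backed by a fixed lookup, for demonstration only.
fake_index = {
    "reset password": ["kb-1", "kb-2"],
    "refund policy": ["kb-3", "kb-7"],
}
test_set = [
    {"query": "reset password", "relevant": {"kb-1"}},
    {"query": "refund policy", "relevant": {"kb-7", "kb-8"}},
]
print(evaluate(test_set, lambda q: fake_index[q], k=2))
```

Running this on every index or chunking change, and charting the two numbers over time, gives a basic retrieval quality dashboard.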