RAG in 2026: beyond naive retrieval

01 Why long context didn't kill RAG

When million-token context windows arrived, the prediction was always the same: "We'll put everything in the prompt; retrieval is over." It didn't happen. Retrieval is not a workaround; it is a data-access layer. Long context did not remove that layer, it raised the expectations placed on it. Four facts keep it standing:

Cost per query. Feeding the whole corpus to the model on every question makes cost scale with corpus size. Retrieval carries only the relevant chunks, and the difference is paid again on every single query.
Freshness. Knowledge baked into a context is only as fresh as the day you last rebuilt it. Adding a document to an index takes minutes; recompiling a giant prompt is a process.
Access control. Different users are allowed to see different documents. Per-document permissions can't be enforced inside one monolithic context; the filter lives in the retrieval layer.
Attribution. The chunks retrieval returned are the answer to "which document did this come from?". Across an eight-hundred-page context, attribution turns into guesswork.
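
To make the access-control and attribution points concrete, here is a minimal sketch of a retrieval step that filters by permission before ranking and reports which documents the answer can cite. Everything in it is an assumption for illustration: the `Chunk` type, the group names, the sample corpus, and the naive term-overlap scoring that stands in for a real index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this chunk (hypothetical ACL model)


def retrieve(query, chunks, user_groups, k=3):
    """Return up to k chunks the user may see, ranked by naive term overlap."""
    terms = set(query.lower().split())
    # Permission filter runs first, so forbidden text never reaches the prompt.
    visible = [c for c in chunks if c.allowed_groups & user_groups]
    scored = [(len(terms & set(c.text.lower().split())), c) for c in visible]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [c for _, c in ranked[:k]]


def cite(chunks):
    """Attribution: the document ids behind the retrieved chunks."""
    return sorted({c.doc_id for c in chunks})


# Hypothetical sample data.
CORPUS = [
    Chunk("hr-001", "salary bands for engineering roles", frozenset({"hr"})),
    Chunk("kb-014", "engineering onboarding checklist and roles", frozenset({"hr", "staff"})),
    Chunk("kb-020", "vpn setup guide", frozenset({"staff"})),
]
```

Calling `cite(retrieve("engineering roles", CORPUS, {"staff"}))` yields only `kb-014`, while the same query for a member of `hr` also surfaces `hr-001`. The filter and the citation list both fall out of the retrieval step, which is the sense in which they "live" there.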

02 From naive to hybrid, from hybrid to planning

The first-generation pattern was simple: embed the query, take the top-k nearest chunks, put them in the prompt. It still works when a question maps one-to-one onto a passage. The problem is that most real questions don't.
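
The first-generation pattern fits in a few lines. The sketch below is illustrative only: `embed` is a toy bag-of-words vector standing in for a real embedding model, and the function names and prompt layout are our assumptions.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, chunks, k=2):
    """Place the retrieved chunks in the prompt ahead of the question."""
    context = "\n".join(f"- {c}" for c in top_k(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

There is no query rewriting, no filtering and no second pass here: one similarity lookup decides everything the model gets to read.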

The second rung is hybrid search: BM25 + vectors + a reranker. Vector similarity captures meaning but gets lost on codes like "TZ-4810B", proper names, and rare terms; lexical matching has the opposite profile, exact on such tokens and blind to paraphrase. Running both and reranking the merged results is the sane production default. We covered the fundamentals of this layer (chunking, eval sets, observability) in our RAG-in-production post.
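
As a rough sketch of the "run both" step, the block below scores documents with a small BM25 implementation and merges that ranking with a vector ranking using reciprocal rank fusion (RRF). RRF is our choice for the merge, not something the pattern prescribes, and the vector ranking is a hard-coded, hypothetical result rather than the output of a real index. The reranker that would reorder the fused candidates is omitted.

```python
import math
from collections import Counter


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Rank doc ids by BM25 score; docs maps id -> text. Zero-score docs are dropped."""
    tokens = {d: text.lower().split() for d, text in docs.items()}
    avg_len = sum(len(t) for t in tokens.values()) / len(tokens)
    df = Counter(term for t in tokens.values() for term in set(t))  # document frequency
    scores = {}
    for d, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (len(docs) - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        if score > 0:
            scores[d] = score
    return sorted(scores, key=scores.get, reverse=True)


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over every ranking a doc appears in."""
    fused = Counter()
    for ranking in rankings:
        for rank, d in enumerate(ranking, start=1):
            fused[d] += 1.0 / (k + rank)
    return [d for d, _ in sorted(fused.items(), key=lambda kv: kv[1], reverse=True)]


# Hypothetical sample data.
DOCS = {
    "a": "replacement valve tz-4810b for pump series 4",
    "b": "how to replace a worn pump valve",
    "c": "office holiday schedule",
}
lexical = bm25_rank("tz-4810b valve", DOCS)  # exact code match puts "a" first
vector = ["b", "c"]  # hypothetical vector-index result that missed the part code
fused = rrf([lexical, vector])
```

In this toy run the vector side never returns document `a`, yet the fused list still contains it because the lexical side found the part code. Keeping such candidates alive for the reranker is the reason to run both rankers.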

The third rung is query planning. The model rewrites the query, decomposes a multi-constraint question such as "compare X and Y on three criteria" into sub-queries, and merges the results. Multi-hop is the chained version, where the answer to the first retrieval feeds the second: first you find the supplier, then you search for that supplier's certificate.
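
A small sketch of both ideas, under heavy assumptions: in a real system a model would produce the sub-queries and each hop would hit an index, whereas here decomposition is a plain cross product and the stores are hypothetical in-memory dictionaries with made-up data.

```python
# Hypothetical lookup tables standing in for two retrieval calls.
SUPPLIERS = {"gasket-12": "Acme Seals"}  # part -> supplier
CERTIFICATES = {"Acme Seals": "ISO 9001, valid through 2027"}  # supplier -> certificate


def decompose(entities, criteria):
    """Split 'compare X and Y on N criteria' into one sub-query per (entity, criterion)."""
    return [f"{entity} {criterion}" for entity in entities for criterion in criteria]


def hop(store, key):
    """One retrieval step; a real system would query an index here."""
    return store.get(key)


def supplier_certificate(part):
    """Multi-hop: the answer to the first lookup becomes the key of the second."""
    supplier = hop(SUPPLIERS, part)
    if supplier is None:
        return None  # the chain stops when a hop finds nothing
    return hop(CERTIFICATES, supplier)
```

`decompose(["X", "Y"], ["price", "latency", "support"])` produces six sub-queries that can be retrieved independently and merged, while `supplier_certificate("gasket-12")` cannot be parallelized at all, because the second query does not exist until the first one returns.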

Retrieval is no longer a function call. It's a plan.

03 Agentic RAG and graph-backed knowledge

Agentic RAG takes the plan out of a fixed pipeline and hands it to the model: retrieve, read, decide what's missing, issue a new query, stop when enough evidence has been gathered. On open-ended questions with no obvious source ("what do we have on this?"), the loop clearly beats a single shot. The cost is equally clear: every iteration is a model call, so latency and spend multiply with step count. An agentic RAG without a step budget and a stop criterion is not production-ready: while the user waits for an answer, the loop keeps exploring and the bill grows quietly.
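
The loop, with its step budget and stop criterion, might be sketched like this. The three callables (`retrieve`, `next_query`, `is_sufficient`) are placeholders for model and index calls, and the names and the returned dictionary shape are our assumptions.

```python
def agentic_retrieve(question, retrieve, next_query, is_sufficient, max_steps=4):
    """Retrieve, check the evidence, re-query; stop on sufficiency or when the budget runs out."""
    evidence = []
    query = question
    for step in range(1, max_steps + 1):
        evidence.extend(retrieve(query))
        if is_sufficient(question, evidence):  # stop criterion
            return {"evidence": evidence, "steps": step, "stopped": "sufficient"}
        if step < max_steps:
            # Only plan another query if there is budget left to run it.
            query = next_query(question, evidence)
    return {"evidence": evidence, "steps": max_steps, "stopped": "budget"}
```

Returning the reason the loop stopped is a small design choice that pays off in observability: a rising share of `"budget"` exits suggests that either the budget is too tight or the stop criterion is never being met.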

Graph-based approaches fill a different gap. In relationship-heavy corpora (contracts, org charts, dependency networks), vector search can't see "who is connected to what?"; similarity is not a relationship. GraphRAG-style systems extract entities and relations from the document set and build hierarchical summaries on top, so they can answer both multi-hop relationship questions and corpus-wide ones like "what themes run through this whole collection?". The price is build and maintenance: you write an extraction pipeline and keep the graph current as documents change. On a static corpus the investment pays back; in a system where a thousand documents change daily, it's a standing operational cost.
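
To show what "who is connected to what?" looks like once relations have been extracted, here is a minimal sketch of a path search over a tiny graph. The triples are made-up examples, the extraction step that would produce them is out of scope, and this is a plain breadth-first search rather than any particular GraphRAG implementation.

```python
from collections import deque

# Hypothetical relations extracted from documents: subject -> [(relation, object), ...]
EDGES = {
    "Contract-7": [("signed_by", "Acme Seals")],
    "Acme Seals": [("subsidiary_of", "Acme Group")],
    "Acme Group": [("audited_by", "Northwind Audit")],
}


def find_path(start, goal, edges, max_hops=3):
    """Breadth-first search; returns the (subject, relation, object) hops, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) == max_hops:
            continue  # hop budget spent on this branch
        for relation, neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return None
```

`find_path("Contract-7", "Northwind Audit", EDGES)` returns three labeled hops, an answer no similarity score over the contract text would surface on its own. The returned path doubles as the attribution trail for the answer.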

04 Decision guide: which problem, which pattern

Pattern choice is not a matter of taste; it's a matter of problem shape:

karar-rehberi.yaml: pattern choice

# problem shape → pattern
"single-passage factual question":   hybrid + rerank
"SKU, code, proper name":            BM25 layer mandatory
"comparison, multiple constraints":  query decomposition
"one answer feeds the next":         multi-hop
"open-ended research":               agentic  # max_steps + stop criterion
"relationship-heavy corpus":         graph / GraphRAG

The second rule sets the order: start with the simplest pattern that passes your eval set, and climb a rung only when the score plateaus. An agentic loop solves a problem hybrid search already solves, at ten times the latency; a graph over a corpus with no relationships is pure overhead. Complexity is bought: it doesn't come free, and it's hard to return.
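
The ordering rule can be written down as a short selection routine. This is a sketch under assumptions: the order of the ladder, the `evaluate` callable (which would run your eval set against a pattern and return a score), and the target threshold are all placeholders you would define yourself.

```python
def pick_pattern(ladder, evaluate, target):
    """Return the first (simplest) pattern whose eval score meets target, else the best seen."""
    best_name, best_score = None, float("-inf")
    for name in ladder:  # ladder is ordered from simplest to most complex
        score = evaluate(name)
        if score >= target:
            return name, score  # stop climbing as soon as the eval set passes
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

Because the loop returns at the first passing rung, the more expensive patterns are never even evaluated unless the simpler ones fall short.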

Short list. Default to hybrid + rerank ✓ · decomposition for multi-constraint questions ✓ · agentic + step budget for open-ended work ✓ · graph for relationship-heavy corpora ✓ · an eval set at every rung ✓. Switching patterns without an eval is changing course without a compass.
