What is RAG?

RAG (Retrieval-Augmented Generation) retrieves relevant information from external sources such as documents, databases, and websites, then uses an LLM to generate an answer based on the retrieved data.

What problems does RAG solve?

LLMs sometimes generate confident but incorrect answers (hallucinations).

🛠 How RAG helps:

By retrieving real, relevant information from external sources, RAG grounds the LLM's answers in actual facts, reducing hallucinations.

LLMs like GPT-3.5 or GPT-4 were trained on data that stops at a certain point (e.g., 2023). They can’t know things published after that.

🛠 How RAG helps:

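RAG pulls the latest data from external sources such as documents, databases, and websites at query time, so answers are not limited to what the model saw during training.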

LLMs can only “see” a limited amount of text at once (context window, like 8k–128k tokens). If you try to cram too much, important parts get cut off.

🛠 How RAG helps:

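Instead of sending the whole database or document, RAG splits the content into chunks, retrieves only the chunks most relevant to the query, and sends just those to the LLM.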

Common and useful ways to chunk documents in a RAG system:

| Chunking Method | Best For |
| --- | --- |
| Fixed-Length | Quick start, short texts |
| Sliding Window | Better context in chunks |
| Sentence-Based | Articles, readable content |
| Paragraph-Based | Manuals, essays |
| Semantic Chunking | When high accuracy is needed |
| Header-Based | Structured docs, technical content |
| Tokenizer-Based | LLM-ready chunks by default |
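
To make the first two rows of the table concrete, here is a minimal sketch of fixed-length and sliding-window chunking. The function names, the character-based size for fixed-length chunks, and the word-based window are assumptions chosen for illustration; real pipelines often count tokens instead.

```python
def fixed_length_chunks(text: str, size: int) -> list[str]:
    """Split text into consecutive chunks of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [text[i:i + size] for i in range(0, len(text), size)]


def sliding_window_chunks(words: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split a word list into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks


text = "RAG retrieves relevant chunks and passes them to the LLM"
print(fixed_length_chunks(text, 20))
print(sliding_window_chunks(text.split(), size=4, overlap=2))
```

The overlap is what gives sliding-window chunks "better context": a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.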

An embedding is a vector (a list of numbers) that represents the meaning of a word, sentence, or paragraph in a way that a machine can work with.
```
  "happy" → [0.21, -0.11, 0.56, ...]
```
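
The text-to-vector mapping above can be illustrated with a deliberately simple toy. Real embedding models learn dense vectors that capture meaning; the sketch below only counts words from a hypothetical four-word vocabulary, so it shows the shape of the data (text in, fixed-length list of numbers out), not the semantic quality of a real model. `VOCAB` and `toy_embed` are names invented for this example.

```python
import re

# Hypothetical fixed vocabulary; a real embedding model has no such word list.
VOCAB = ["happy", "sad", "cat", "dog"]


def toy_embed(text: str) -> list[float]:
    """Map text to a vector with one word-count dimension per VOCAB entry."""
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(term)) for term in VOCAB]


print(toy_embed("Happy dog, happy cat"))
```

Every input, whatever its length, produces a vector with the same number of dimensions, which is what makes vectors comparable to each other later.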

A vector database stores the vectors of the embedded chunks.
Later, given a query, it finds the most similar chunks based on semantic meaning, not keywords.

The user's query is turned into a vector using an embedding model.
Vector Search: The query vector is compared to the document embeddings stored in a vector database, and the most similar documents (top-k) are retrieved based on cosine similarity.
The smaller the distance or higher the similarity → the more relevant the chunk.
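
As a rough sketch of the retrieval step just described, the snippet below ranks stored chunks by cosine similarity to a query vector and keeps the top-k. The three-dimensional vectors are made up for illustration, and the plain dictionary stands in for a vector database; production systems use a real embedding model and an indexed store.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], store: dict[str, list[float]], k: int) -> list[tuple[float, str]]:
    """Score every stored chunk against the query and return the k best."""
    scored = [(cosine_similarity(query_vec, vec), chunk) for chunk, vec in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up embeddings; a real system would get these from an embedding model.
store = {
    "Cats sleep most of the day.": [0.9, 0.1, 0.0],
    "Dogs enjoy long walks.": [0.1, 0.9, 0.0],
    "GPUs speed up matrix math.": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.2, 0.0]  # pretend embedding of a question about cats

for score, chunk in top_k(query_vec, store, k=2):
    print(f"{score:.3f}  {chunk}")
```

This version scans every stored vector, which is fine for a few thousand chunks; vector databases add approximate nearest-neighbor indexes so the search stays fast at larger scale.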

After retrieving the most relevant chunks using semantic search, the generation phase uses those chunks as context to answer the user’s question.
```
  User Question + Retrieved Chunks → Prompt → GPT → Final Answer
```
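
The flow above can be sketched as a small prompt-building function. The prompt wording, the `build_prompt` and `answer` names, and the `fake_llm` stand-in are all assumptions for illustration; in practice `llm` would be a call to whichever model API you use.

```python
from typing import Callable


def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine the retrieved chunks and the user question into one prompt."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is not enough, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question: str, chunks: list[str], llm: Callable[[str], str]) -> str:
    """Run the generation step: prompt in, model text out."""
    return llm(build_prompt(question, chunks))


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, so the sketch runs offline."""
    return f"(model reply to a {len(prompt)}-character prompt)"


chunks = ["Cats sleep most of the day.", "Dogs enjoy long walks."]
print(build_prompt("How much do cats sleep?", chunks))
print(answer("How much do cats sleep?", chunks, fake_llm))
```

Numbering the chunks in the prompt is a design choice: it lets the model, or you, point back to which retrieved passage supported the answer.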
