This article demonstrates three advanced query transformation techniques (Multi-Query, RAG Fusion, and HyDE) that improve document retrieval in a RAG pipeline built with LangChain, Groq chat models, Google Generative AI embeddings, and a Qdrant vector store.

If you are new to query rewriting, read our earlier article on the topic first. It explains what query rewriting is and which methods it covers.

Setup and Initialization

Set up the .env file

GROQ_API_KEY = [Your Groq API key, used for the chat models]
GOOGLE_API_KEY = [Your Google API key, used for the embedding model]
QDRANT_URL = http://localhost:6333

Libraries required

pnpm install @langchain/community @langchain/core @langchain/google-genai @langchain/groq @langchain/qdrant langchain

Set up the app.js file

First, we set up the environment, load models, and configure the connection to the vector store (Qdrant) and the embedding model.

import { ChatGroq } from "@langchain/groq";
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";
import { QdrantVectorStore } from "@langchain/qdrant";
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';


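// Note: this file uses top-level await, so run it as an ES module, and make
// sure the .env values are loaded first, for example with
// `node --env-file=.env app.js` (Node.js 20.6 or later).

// --- LLM and Embedding setup ---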

// LLM for final answer synthesis
const model = new ChatGroq({
  model: "openai/gpt-oss-20b",
});

// LLM for better hypothetical document quality
const large_parameter_model = new ChatGroq({
  model: "openai/gpt-oss-120b",
});

// Embedding model for converting text to vector space
const embeddings = new GoogleGenerativeAIEmbeddings({
  model: "gemini-embedding-001",
});

// --- Vector store and retriever setup ---

// Connect to the existing Qdrant collection
const vectorStore = await QdrantVectorStore.fromExistingCollection(embeddings, {
  url: process.env.QDRANT_URL,
  collectionName: "genai-toolkit",
});

// Create a basic retriever instance
const retriever = vectorStore.asRetriever()
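
The setup above depends on the three variables from the .env file. A small guard can fail fast with a readable message before any client is constructed. This is a minimal sketch: the helper names `findMissingEnv` and `assertEnv` are hypothetical and not part of LangChain.

```javascript
// Hypothetical helper: returns the names of required variables that are
// missing or blank in the given environment object.
function findMissingEnv(env, required) {
  return required.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

// The three variables defined in the .env file of this article.
const REQUIRED_ENV = ["GROQ_API_KEY", "GOOGLE_API_KEY", "QDRANT_URL"];

// Throws if any required variable is absent; intended to run before the
// models and the vector store are created.
function assertEnv(env = process.env) {
  const missing = findMissingEnv(env, REQUIRED_ENV);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertEnv()` at the top of app.js would surface a missing key immediately instead of as an authentication error on the first model call.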

1. Multi-Query Retrieval (Parallel Retrieval)

The complete code for Multi-Query Retrieval is available on GitHub.

1.1. Query Generation and De-Duplication Logic

// --- Multi-query generation prompt ---
const multiQueryTemplate = `
You are a helpful assistant that generates multiple search queries based on a single input query.
Generate multiple search queries (3 queries) related to: {question}
Output (only generate queries):
query1
query2
query3
`;
const multiQueryPrompt = ChatPromptTemplate.fromTemplate(multiQueryTemplate);

// --- JSON schema for structured output ---
const jsonSchema = {
  title: "queries",
  type: "object",
  properties: {
    Query1: { type: "string", description: "Query1 of user input" },
    Query2: { type: "string", description: "Query2 of user input" },
    Query3: { type: "string", description: "Query3 of user input" },
  },
  required: ["Query1", "Query2", "Query3"],
};

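// Bind the schema so the model returns an object with Query1..Query3
const modelWithStructure = model.withStructuredOutput(jsonSchema, {
  method: "jsonSchema",
});

// --- Query Generation Chain (LLM-1) ---
// The last step turns { Query1, Query2, Query3 } into an array of strings
const generateQueriesChain = multiQueryPrompt
  .pipe(modelWithStructure)
  .pipe((output) => Object.values(output));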

// --- Utility: Unique union of retrieved documents (Context Merging) ---

function getUniqueUnion(documents) {
  const flattenedDocs = documents.flat();
  
  // Use a Map to track unique content. 
  // If a duplicate pageContent appears, the Map just overwrites the entry,
  // effectively keeping only one instance.
  const uniqueMap = new Map();
  
  flattenedDocs.forEach(doc => {
    // You can use doc.pageContent or a specific metadata field here
    uniqueMap.set(doc.pageContent, doc);
  });

  return Array.from(uniqueMap.values());
}

// --- Retrieval Chain ---
const retrievalChain = generateQueriesChain
  .pipe(async (queries) => {
    const results = await Promise.all(
      queries.map((query) => retriever.invoke(query))
    );
    return results;
  })
  .pipe((documents) => getUniqueUnion(documents));
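
To see the merge step in isolation, the sketch below replaces the LLM and Qdrant with stand-ins: a fixed set of queries and a hypothetical `fakeRetriever` backed by an in-memory object. Only the parallel fan-out and the de-duplication mirror the chain above; the data and helper names are made up for illustration.

```javascript
// Hypothetical in-memory stand-in for the Qdrant retriever.
const fakeIndex = {
  "what is an AI generalist": [{ pageContent: "A" }, { pageContent: "B" }],
  "AI generalist skills": [{ pageContent: "B" }, { pageContent: "C" }],
  "AI generalist vs specialist": [{ pageContent: "C" }, { pageContent: "A" }],
};
const fakeRetriever = {
  invoke: async (query) => fakeIndex[query] ?? [],
};

// Same idea as getUniqueUnion: flatten, then keep one document per pageContent.
// This variant keeps the first occurrence instead of overwriting it.
function uniqueByContent(lists) {
  const seen = new Map();
  for (const doc of lists.flat()) {
    if (!seen.has(doc.pageContent)) seen.set(doc.pageContent, doc);
  }
  return [...seen.values()];
}

// Run every query concurrently, then merge the result lists.
async function retrieveUnique(queries, retriever) {
  const lists = await Promise.all(queries.map((q) => retriever.invoke(q)));
  return uniqueByContent(lists);
}

retrieveUnique(Object.keys(fakeIndex), fakeRetriever).then((docs) => {
  console.log(docs.map((d) => d.pageContent)); // [ 'A', 'B', 'C' ]
});
```

Six retrieved documents collapse to three because each chunk was returned by two different queries, which is the overlap Multi-Query is expected to produce.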

1.2. Final Execution

// --- Final Multi-Query RAG chain: retrieve context, format prompt, get answer from LLM ---
const answerTemplate = `Answer the following question based on this context:
{context}
Question: {question}
`;
const answerPrompt = ChatPromptTemplate.fromTemplate(answerTemplate);

// final RAG chain
const finalRagChain = async (input) => {
  const context = await retrievalChain.invoke(input);
  const promptValue = await answerPrompt.invoke({
    context: JSON.stringify(context),
    question: input.question,
  });
  const answer = await model.invoke(promptValue);
  const output = await new StringOutputParser().invoke(answer);
  return output;
};

// --- Run the chain ---
(async () => {
  const userQuestion = 'what is AI generalist';
  const answer = await finalRagChain({ question: userQuestion });
  console.log('Final answer:', answer);
})();
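
The chain above passes `JSON.stringify(context)` to the prompt, so metadata and JSON punctuation are sent to the model together with the text. One possible alternative, sketched here with a hypothetical `formatDocs` helper, is to join only the `pageContent` of each document. It assumes every retrieved document has a string `pageContent` field, as LangChain documents do.

```javascript
// Hypothetical helper: turn retrieved documents into a plain-text context
// block, numbering each chunk so the model can tell them apart.
function formatDocs(docs) {
  return docs
    .map((doc, i) => `[${i + 1}] ${doc.pageContent.trim()}`)
    .join("\n\n");
}

// Made-up sample documents for illustration.
const sampleDocs = [
  { pageContent: " First chunk. ", metadata: { source: "a.md" } },
  { pageContent: "Second chunk.", metadata: { source: "b.md" } },
];

console.log(formatDocs(sampleDocs));
// [1] First chunk.
//
// [2] Second chunk.
```

In `finalRagChain`, this would mean writing `context: formatDocs(context)` in place of `context: JSON.stringify(context)`. Whether the shorter prompt improves answers depends on how much the metadata matters for your data.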

2. RAG Fusion (Multi-Query + RRF)

The complete code for RAG Fusion is available on GitHub.

2.1. RRF Logic

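We define a Reciprocal Rank Fusion (RRF) function, which replaces the simple de-duplication step. It calculates a fused score for every document across all retrieved lists: each list in which a document appears adds 1 / (rank + k) to that document's score, so documents that rank highly in several lists rise to the top.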

// --- Reciprocal Rank Fusion Function (Context Merging with Re-ranking) ---

function reciprocalRankFusion(results, k = 60) {
  const fusedScores = {};

  results.forEach((docs) => {
    docs.forEach((doc, rank) => {
      // Serialize the whole document (content and metadata) so it can be used
      // as an object key; JSON.stringify plays the role of `dumps` here.
      // NOTE: If your documents have different IDs but the same content,
      // you might want to stringify ONLY the pageContent.
      const docStr = JSON.stringify(doc);

      if (!(docStr in fusedScores)) {
        fusedScores[docStr] = 0;
      }

      // RRF formula: each list contributes 1 / (rank + k) to the document's score.
      // rank is 0-indexed here, so the top result in a list contributes 1 / k.
      fusedScores[docStr] += 1 / (rank + k);
    });
  });

  // Convert the object entries to an array, sort by score descending, and parse back
  const rerankedResults = Object.entries(fusedScores)
    .sort((a, b) => b[1] - a[1]) // Sort by the score (index 1)
    .map(([docStr, score]) => {
      return {
        document: JSON.parse(docStr),
        score: score,
      };
    });

  return rerankedResults;
}

// --- Retrieval Chain (Parallel Retrieval + RRF) ---
// Note: The generateQueriesChain from the Multi-Query section is reused here.

const retrievalChain = generateQueriesChain
  .pipe(async (queries) => {
    const results = await Promise.all(
      queries.map((query) => retriever.invoke(query)),
    );
    return results;
  })
  .pipe((documents) => reciprocalRankFusion(documents));
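
A small worked example helps to check the scoring by hand. The sketch below keys documents by `pageContent` only (the option mentioned in the code comment above) and uses made-up document names; the scoring follows the same 1 / (rank + k) rule with 0-indexed ranks and k = 60.

```javascript
// Minimal RRF over lists of { pageContent } objects, keyed by pageContent.
function rrfByContent(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((doc, rank) => {
      const prev = scores.get(doc.pageContent) ?? 0;
      scores.set(doc.pageContent, prev + 1 / (rank + k));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([pageContent, score]) => ({ pageContent, score }))
    .sort((a, b) => b.score - a.score);
}

// Three hypothetical result lists, one per generated query.
const lists = [
  [{ pageContent: "doc-A" }, { pageContent: "doc-B" }],
  [{ pageContent: "doc-B" }, { pageContent: "doc-C" }],
  [{ pageContent: "doc-B" }, { pageContent: "doc-A" }],
];

const fused = rrfByContent(lists);
console.log(fused.map((r) => r.pageContent)); // [ 'doc-B', 'doc-A', 'doc-C' ]
// doc-B: 1/61 + 1/60 + 1/60
// doc-A: 1/60 + 1/61
// doc-C: 1/61
```

doc-B wins because it appears in all three lists, even though it is only second in the first one. With plain de-duplication the three documents would have been returned without any such ordering.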

2.2. Final Execution

// --- Prompt for final answer ---
const answerTemplate = `Answer the following question based on this context:
{context}
Question: {question}`;

const answerPrompt = ChatPromptTemplate.fromTemplate(answerTemplate);

// --- Final RAG Fusion RAG chain: retrieve context, format prompt, get answer from LLM ---
const finalRagChain = async (input) => {
  const context = await retrievalChain.invoke(input);
  const promptValue = await answerPrompt.invoke({
    context: JSON.stringify(context),
    question: input.question,
  });
  const answer = await model.invoke(promptValue);
  const output = await new StringOutputParser().invoke(answer);
  return output;
};

// --- Run the chain ---
(async () => {
  const userQuestion = "what is AI generalist";
  const answer = await finalRagChain({ question: userQuestion });
  console.log("Final answer:", answer);
})();
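
Because `reciprocalRankFusion` returns `{ document, score }` pairs sorted by score, the final chain can keep only the best few results before building the prompt. The following is a minimal sketch; `topFusedContext` and the default cutoff of 3 are illustrative assumptions, not part of the original code.

```javascript
// Hypothetical helper: keep the n highest-scoring fused results and return
// their text joined into a single context string.
function topFusedContext(fusedResults, n = 3) {
  return fusedResults
    .slice(0, n) // input is assumed to be sorted by score, descending
    .map((r) => r.document.pageContent)
    .join("\n\n");
}

// Made-up fused results in the shape returned by reciprocalRankFusion.
const fusedSample = [
  { document: { pageContent: "best" }, score: 0.05 },
  { document: { pageContent: "good" }, score: 0.03 },
  { document: { pageContent: "ok" }, score: 0.02 },
  { document: { pageContent: "weak" }, score: 0.01 },
];

// Prints "best", a blank line, then "good".
console.log(topFusedContext(fusedSample, 2));
```

Trimming the list this way uses the ranking that RRF computed, whereas stringifying every result sends low-scoring chunks to the model as well.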

3. Hypothetical Document Embeddings (HyDE)

The complete code for Hypothetical Document Embeddings (HyDE) is available on GitHub.

3.1. HyDE Generation and Retrieval Logic

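We define a dedicated prompt that asks the LLM to write a hypothetical document answering the question, and we use the larger model for this generation step to get a more detailed passage. The hypothetical document, not the original question, is then passed to the retriever.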

// --- HyDE generation prompt (LLM-1) ---
const hydeTemplate = `Please write a scientific paper passage to answer the question
Question: {question}
Passage:`;

const prompt_hyde = ChatPromptTemplate.fromTemplate(hydeTemplate);

// Reuse large_parameter_model from the setup section. Declaring it again with
// const in the same file would throw a SyntaxError.

// --- Hypothetical Document Generation Chain ---
const generate_docs_for_retrieval = prompt_hyde
  .pipe(large_parameter_model)
  .pipe(new StringOutputParser()); // extract the generated passage as a string

// --- Retrieval Chain (HyDE document -> Embedding -> Retrieval) ---

const retrieval_chain_hyde = generate_docs_for_retrieval.pipe(
  async (hyde_doc) => await retriever.invoke(hyde_doc),
);
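
The effect of HyDE can be illustrated without any API calls. The sketch below uses a toy bag-of-words "embedding" and a hard-coded hypothetical passage in place of the LLM and the Google embedding model, so every name and string in it is a stand-in. It shows only the core idea: the vector used for the search is built from an answer-like passage rather than from the short question.

```javascript
// Toy embedding: word -> count. A stand-in for a real embedding model.
function embed(text) {
  const vec = new Map();
  for (const word of text.toLowerCase().match(/[a-z]+/g) ?? []) {
    vec.set(word, (vec.get(word) ?? 0) + 1);
  }
  return vec;
}

// Cosine similarity between two sparse word-count vectors.
function cosine(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((s, x) => s + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Return the corpus entry most similar to the given text.
function nearest(text, corpus) {
  const q = embed(text);
  return corpus.reduce((best, doc) =>
    cosine(q, embed(doc)) > cosine(q, embed(best)) ? doc : best
  );
}

const corpus = [
  "Generalists combine broad knowledge of models, data, and tooling.",
  "What is the weather today?",
];

const question = "What is an AI generalist?";
// Hard-coded stand-in for the passage an LLM would generate.
const hypotheticalDoc =
  "An AI generalist has broad knowledge of models, data, and tooling.";

console.log(nearest(question, corpus)); // What is the weather today?
console.log(nearest(hypotheticalDoc, corpus)); // Generalists combine broad knowledge of models, data, and tooling.
```

In this toy setup the question shares only "what" and "is" with the corpus, so it matches the irrelevant sentence, while the hypothetical passage shares the vocabulary of the relevant one. Real embedding models capture meaning far better than word counts, so the gap is usually smaller in practice.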

3.2. Final Execution

// --- Prompt for final answer ---
const answerTemplate = `Answer the following question based on this context:
{context}
Question: {question}
`;
const answerPrompt = ChatPromptTemplate.fromTemplate(answerTemplate);

// --- Final HyDE RAG chain: retrieve context, format prompt, get answer from LLM ---
const finalRagChain = async (input) => {
  const context = await retrieval_chain_hyde.invoke(input);
  const promptValue = await answerPrompt.invoke({
    context: JSON.stringify(context),
    question: input.question,
  });
  const answer = await model.invoke(promptValue);
  const output = await new StringOutputParser().invoke(answer);
  return output;
};

// --- Run the chain ---
(async () => {
  const userQuestion = "what is AI generalist";
  const answer = await finalRagChain({ question: userQuestion });
  console.log("Final answer:", answer);
})();
