Query Translation (Query Re-writing) [Code Explanation]

Full-Stack AI Engineer

This article demonstrates three advanced query transformation techniques (Multi-Query, RAG Fusion, and HyDE) that improve document retrieval in a RAG pipeline built with LangChain.js, Groq chat models, Google Generative AI embeddings, and a Qdrant vector store.

If you are new to query rewriting, see our earlier articles, which explain what query rewriting is and which methods it covers.

Setup and Initialization

  1. Set up the .env file

    GROQ_API_KEY=[your Groq API key, used for the chat models]
    GOOGLE_API_KEY=[your Google API key, used for the embedding model]
    QDRANT_URL=http://localhost:6333
    
  2. Install the required libraries

    pnpm install @langchain/community @langchain/core @langchain/google-genai @langchain/groq @langchain/qdrant langchain
    
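  3. Set up the app.js file

First, we set up the environment, load the models, and configure the connection to the vector store (Qdrant) and the embedding model. The code uses import statements and top-level await, so it must run as an ES module. It also never loads the .env file itself, so the variables must be supplied at launch, for example with node --env-file=.env app.js on Node 20.6 or later.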

import { ChatGroq } from "@langchain/groq";
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";
import { QdrantVectorStore } from "@langchain/qdrant";
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';


// --- LLM and Embedding setup ---

// LLM for query generation and final answer synthesis
const model = new ChatGroq({
  model: "openai/gpt-oss-20b",
});

// Larger LLM, used in the HyDE section to write a more detailed hypothetical document
const large_parameter_model = new ChatGroq({
  model: "openai/gpt-oss-120b",
});

// Embedding model for converting text to vector space
const embeddings = new GoogleGenerativeAIEmbeddings({
  model: "gemini-embedding-001",
});

// --- Vector store and retriever setup ---

// Connect to the existing Qdrant collection
const vectorStore = await QdrantVectorStore.fromExistingCollection(embeddings, {
  url: process.env.QDRANT_URL,
  collectionName: "genai-toolkit",
});

// Create a basic retriever with the default settings
const retriever = vectorStore.asRetriever();
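The setup above depends on the three variables from the .env file, and a missing one tends to surface only later as an authentication or connection error. A minimal sketch of an early check follows. `findMissingEnv` and `REQUIRED_ENV` are hypothetical names, not part of LangChain, and the key value in the example is a placeholder.

```javascript
// Hypothetical helper: returns the names of required variables that are unset or blank.
function findMissingEnv(requiredNames, env = process.env) {
  return requiredNames.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Variable names taken from the .env file in step 1.
const REQUIRED_ENV = ["GROQ_API_KEY", "GOOGLE_API_KEY", "QDRANT_URL"];

// Example with an explicit object instead of process.env, so the result is predictable.
const missing = findMissingEnv(REQUIRED_ENV, {
  GROQ_API_KEY: "placeholder-key",
  QDRANT_URL: "http://localhost:6333",
});

console.log(missing); // [ 'GOOGLE_API_KEY' ]
```

In app.js, calling `findMissingEnv(REQUIRED_ENV)` before the models are constructed, and throwing if the result is not empty, would make a configuration mistake visible right away.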

1. Multi-Query Retrieval (Parallel Retrieval)

The complete code for Multi-Query Retrieval is available on GitHub.

1.1. Query Generation and De-Duplication Logic

// --- Multi-query generation prompt ---
const multiQueryTemplate = `
You are a helpful assistant that generates multiple search queries based on a single input query.
Generate multiple search queries (3 queries) related to: {question}
Output (only generate queries):
query1
query2
query3
`;
const multiQueryPrompt = ChatPromptTemplate.fromTemplate(multiQueryTemplate);

// --- JSON schema for structured output ---
const jsonSchema = {
  title: "queries",
  type: "object",
  properties: {
    Query1: { type: "string", description: "Query1 of user input" },
    Query2: { type: "string", description: "Query2 of user input" },
    Query3: { type: "string", description: "Query3 of user input" },
  },
  required: ["Query1", "Query2", "Query3"],
};

const modelWithStructure = model.withStructuredOutput(jsonSchema, {
  method: "jsonSchema",
});

// --- Query Generation Chain (LLM-1) ---
// Object.values turns { Query1, Query2, Query3 } into an array of three query strings.
const generateQueriesChain = multiQueryPrompt
  .pipe(modelWithStructure)
  .pipe((output) => Object.values(output));

// --- Utility: Unique union of retrieved documents (Context Merging) ---

function getUniqueUnion(documents) {
  const flattenedDocs = documents.flat();
  
  // Use a Map to track unique content. 
  // If a duplicate pageContent appears, the Map just overwrites the entry,
  // effectively keeping only one instance.
  const uniqueMap = new Map();
  
  flattenedDocs.forEach(doc => {
    // You can use doc.pageContent or a specific metadata field here
    uniqueMap.set(doc.pageContent, doc);
  });

  return Array.from(uniqueMap.values());
}

// --- Retrieval Chain ---
const retrievalChain = generateQueriesChain
  .pipe(async (queries) => {
    const results = await Promise.all(
      queries.map((query) => retriever.invoke(query))
    );
    return results;
  })
  .pipe((documents) => getUniqueUnion(documents));
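The comment inside `getUniqueUnion` mentions that a metadata field can serve as the de-duplication key instead of `pageContent`. A minimal sketch of that variation follows. `uniqueBy` and the `metadata.chunkId` field are hypothetical names chosen for illustration, and the documents are plain objects shaped like LangChain documents so the snippet runs without a vector store. Unlike the Map-overwrite approach above, this version keeps the first occurrence of each key.

```javascript
// Hypothetical variant of getUniqueUnion: the caller decides what makes two documents "the same".
function uniqueBy(documentLists, keyOf) {
  const seen = new Map();
  for (const doc of documentLists.flat()) {
    const key = keyOf(doc);
    // Keep only the first document seen for each key.
    if (!seen.has(key)) seen.set(key, doc);
  }
  return Array.from(seen.values());
}

// Toy retrieval results for two generated queries (illustrative data only).
const resultsPerQuery = [
  [
    { pageContent: "A", metadata: { chunkId: 1 } },
    { pageContent: "B", metadata: { chunkId: 2 } },
  ],
  [
    { pageContent: "B", metadata: { chunkId: 2 } },
    { pageContent: "C", metadata: { chunkId: 3 } },
  ],
];

const unique = uniqueBy(resultsPerQuery, (doc) => doc.metadata.chunkId);
console.log(unique.map((doc) => doc.pageContent)); // [ 'A', 'B', 'C' ]
```

Passing `(doc) => doc.pageContent` as the key function gives the same set of documents as `getUniqueUnion`.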

1.2. Final Execution

// --- Final Multi-Query RAG chain: retrieve context, format prompt, get answer from LLM ---
const answerTemplate = `Answer the following question based on this context:
{context}
Question: {question}
`;
const answerPrompt = ChatPromptTemplate.fromTemplate(answerTemplate);

// final RAG chain
const finalRagChain = async (input) => {
  const context = await retrievalChain.invoke(input);
  const promptValue = await answerPrompt.invoke({
    context: JSON.stringify(context),
    question: input.question,
  });
  const answer = await model.invoke(promptValue);
  const output = await new StringOutputParser().invoke(answer);
  return output;
};

// --- Run the chain ---
(async () => {
  const userQuestion = 'what is AI generalist';
  const answer = await finalRagChain({ question: userQuestion });
  console.log('Final answer:', answer);
})();
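`JSON.stringify(context)` sends every field of every document to the model, including metadata and JSON punctuation. One alternative, sketched here as an assumption and not as the method used in this article, is to join only the `pageContent` of each document into numbered passages. `formatDocs` is a hypothetical helper, and the sample documents are illustrative.

```javascript
// Hypothetical helper: turns retrieved documents into a compact, numbered context string.
function formatDocs(docs) {
  return docs
    .map((doc, index) => `[${index + 1}] ${doc.pageContent.trim()}`)
    .join("\n\n");
}

// Plain objects shaped like LangChain documents (illustrative data only).
const sampleDocs = [
  { pageContent: " First passage. ", metadata: { source: "a.md" } },
  { pageContent: "Second passage.", metadata: { source: "b.md" } },
];

const contextText = formatDocs(sampleDocs);
console.log(contextText);
```

Inside `finalRagChain`, `context: formatDocs(context)` would take the place of `context: JSON.stringify(context)`.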

2. RAG Fusion (Multi-Query + RRF)

The complete code for RAG Fusion is available on GitHub.

2.1. RRF Logic

We define the RRF function, which replaces the simple de-duplication step. It calculates a fused score for every document across all retrieved lists.

// --- Reciprocal Rank Fusion Function (Context Merging with Re-ranking) ---

function reciprocalRankFusion(results, k = 60) {
  const fusedScores = {};

  results.forEach((docs) => {
    docs.forEach((doc, rank) => {
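      // Serialize the whole document so it can serve as an object key
      // (the role `dumps` plays in Python implementations of RAG Fusion).
      // NOTE: If the same content can appear with different IDs or metadata,
      // stringify only doc.pageContent instead so those copies share one score.
      const docStr = JSON.stringify(doc);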

      if (!(docStr in fusedScores)) {
        fusedScores[docStr] = 0;
      }

      // RRF Formula: 1 / (rank + k)
      // rank is 0-indexed here
      fusedScores[docStr] += 1 / (rank + k);
    });
  });

  // Convert the object entries to an array, sort by score descending, and parse back
  const rerankedResults = Object.entries(fusedScores)
    .sort((a, b) => b[1] - a[1]) // Sort by the score (index 1)
    .map(([docStr, score]) => {
      return {
        document: JSON.parse(docStr),
        score: score,
      };
    });

  return rerankedResults;
}

// --- Retrieval Chain (Parallel Retrieval + RRF) ---
// Note: generateQueriesChain from the Multi-Query section is reused here.

const retrievalChain = generateQueriesChain
  .pipe(async (queries) => {
    const results = await Promise.all(
      queries.map((query) => retriever.invoke(query)),
    );
    return results;
  })
  .pipe((documents) => reciprocalRankFusion(documents));
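To see how the fused scores behave, here is a small self-contained run on toy data. `rrfByContent` is a hypothetical variant of the function above that keys on `pageContent` (the option mentioned in its NOTE comment) and uses the same 0-indexed formula, 1 / (rank + k). The three lists stand in for the results of three generated queries.

```javascript
// Hypothetical variant of reciprocalRankFusion, keyed on pageContent.
function rrfByContent(resultLists, k = 60) {
  const scores = new Map();
  for (const docs of resultLists) {
    docs.forEach((doc, rank) => {
      const previous = scores.get(doc.pageContent) ?? 0;
      // rank is 0-indexed, matching the function above.
      scores.set(doc.pageContent, previous + 1 / (rank + k));
    });
  }
  return Array.from(scores.entries())
    .map(([pageContent, score]) => ({ pageContent, score }))
    .sort((a, b) => b.score - a.score);
}

// Toy ranked lists, one per generated query (illustrative data only).
const lists = [
  [{ pageContent: "A" }, { pageContent: "B" }, { pageContent: "C" }],
  [{ pageContent: "B" }, { pageContent: "A" }, { pageContent: "D" }],
  [{ pageContent: "B" }, { pageContent: "C" }],
];

const fused = rrfByContent(lists);
console.log(fused.map((entry) => entry.pageContent)); // [ 'B', 'A', 'C', 'D' ]
```

"B" ranks first because it appears in all three lists (1/60 + 1/60 + 1/61), while "D" appears once at the bottom of a single list (1/62). A document that several queries agree on outranks one that a single query placed first.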

2.2. Final Execution

// --- Prompt for final answer ---
const answerTemplate = `Answer the following question based on this context:
{context}
Question: {question}`;

const answerPrompt = ChatPromptTemplate.fromTemplate(answerTemplate);

// --- Final RAG Fusion RAG chain: retrieve context, format prompt, get answer from LLM ---
const finalRagChain = async (input) => {
  const context = await retrievalChain.invoke(input);
  const promptValue = await answerPrompt.invoke({
    context: JSON.stringify(context),
    question: input.question,
  });
  const answer = await model.invoke(promptValue);
  const output = await new StringOutputParser().invoke(answer);
  return output;
};

// --- Run the chain ---
(async () => {
  const userQuestion = "what is AI generalist";
  const answer = await finalRagChain({ question: userQuestion });
  console.log("Final answer:", answer);
})();
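The fused list contains every document that any query returned, each wrapped as `{ document, score }`. Passing all of them to the model can inflate the prompt, so one option is to keep only the highest-scoring entries and unwrap them before building the context. This is a sketch under that assumption. `takeTopDocuments` and the default limit of 4 are hypothetical choices, and the scores in the example are made up.

```javascript
// Hypothetical helper: keeps the best-scoring fused entries and returns the bare documents.
function takeTopDocuments(fusedResults, limit = 4) {
  return fusedResults
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.document);
}

// Entries shaped like the output of reciprocalRankFusion (illustrative scores only).
const fusedSample = [
  { document: { pageContent: "low" }, score: 0.016 },
  { document: { pageContent: "high" }, score: 0.049 },
  { document: { pageContent: "mid" }, score: 0.033 },
];

const topTwo = takeTopDocuments(fusedSample, 2);
console.log(topTwo.map((doc) => doc.pageContent)); // [ 'high', 'mid' ]
```

The right limit depends on chunk size and the model's context window, so treat the default as a starting point to tune.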

3. Hypothetical Document Embeddings (HyDE)

The complete code for Hypothetical Document Embeddings (HyDE) is available on GitHub.

3.1. HyDE Generation and Retrieval Logic

We define a prompt that asks the model to write a hypothetical passage answering the question, and we generate it with the larger model so the passage is more detailed. The passage, not the original question, is then embedded and used for retrieval.

// --- HyDE generation prompt (LLM-1) ---
const hydeTemplate = `Please write a scientific paper passage to answer the question
Question: {question}
Passage:`;

const prompt_hyde = ChatPromptTemplate.fromTemplate(hydeTemplate);

// Reuse large_parameter_model from the setup section; declaring it again
// with const in the same file would throw a SyntaxError.

// --- Hypothetical Document Generation Chain ---
const generate_docs_for_retrieval = prompt_hyde
  .pipe(large_parameter_model)
  .pipe(new StringOutputParser());

// --- Retrieval Chain (HyDE document -> Embedding -> Retrieval) ---
// The retriever embeds the hypothetical passage and returns the nearest stored chunks.
const retrieval_chain_hyde = generate_docs_for_retrieval.pipe(
  (hyde_doc) => retriever.invoke(hyde_doc),
);
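The hypothetical passage fully replaces the user's question as the retrieval text, so an empty generation leaves nothing to search with. One possible safeguard, which the code above does not include, is to fall back to the original question when the passage is blank and to cap the passage length before it is embedded. `buildHydeQuery` and `maxChars` are hypothetical names, and the passage text in the example is a placeholder.

```javascript
// Hypothetical helper: chooses the text that will be embedded for retrieval.
function buildHydeQuery(question, hypotheticalDoc, maxChars = 2000) {
  const passage = (hypotheticalDoc ?? "").trim();
  // Nothing usable was generated: search with the original question instead.
  if (passage === "") return question;
  // Cap the length so a very long passage does not reach the embedding model.
  return passage.slice(0, maxChars);
}

console.log(buildHydeQuery("what is AI generalist", "   ")); // what is AI generalist
console.log(buildHydeQuery("what is AI generalist", "An AI generalist is ...", 5)); // An AI
```

Wiring this into the chain would require carrying the original question alongside the generated passage, which the current pipe does not do.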

3.2. Final Execution

// --- Prompt for final answer ---
const answerTemplate = `Answer the following question based on this context:
{context}
Question: {question}
`;
const answerPrompt = ChatPromptTemplate.fromTemplate(answerTemplate);

// --- Final HyDE RAG chain: retrieve context, format prompt, get answer from LLM ---
const finalRagChain = async (input) => {
  const context = await retrieval_chain_hyde.invoke(input);
  const promptValue = await answerPrompt.invoke({
    context: JSON.stringify(context),
    question: input.question,
  });
  const answer = await model.invoke(promptValue);
  const output = await new StringOutputParser().invoke(answer);
  return output;
};

// --- Run the chain ---
(async () => {
  const userQuestion = "what is AI generalist";
  const answer = await finalRagChain({ question: userQuestion });
  console.log("Final answer:", answer);
})();

Resources

1. GitHub code

2. Articles

Onkar K | Full-Stack AI Engineering