This article demonstrates two advanced Query Decomposition techniques, Parallel and Iterative decomposition, used to improve document retrieval in a RAG pipeline built with LangChain, Groq chat models, Google Generative AI embeddings, and Qdrant.

If you are not familiar with Query Decomposition, read our earlier article first. It explains what Query Decomposition is and which methods it uses.

Setup and Initialization

Setup .env file

GROQ_API_KEY = [Your Groq API key for chat models]
GOOGLE_API_KEY = [Your Google API key for embedding models]
QDRANT_URL = http://localhost:6333
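
All three values are read from the environment at startup, so a small guard can fail fast when one of them is missing. The sketch below is an assumption and not part of the original code: `findMissingEnv` is a hypothetical helper, it only checks that a value is present (not that a key is valid), and it takes the environment as a parameter because the original does not show how `.env` is loaded.

```javascript
// Hypothetical helper (not in the original code): list the required
// environment variables that are unset or blank.
const REQUIRED_ENV = ["GROQ_API_KEY", "GOOGLE_API_KEY", "QDRANT_URL"];

function findMissingEnv(env = process.env, required = REQUIRED_ENV) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

// Example with an explicit object instead of process.env:
const missing = findMissingEnv({
  GROQ_API_KEY: "gsk-example",
  QDRANT_URL: "http://localhost:6333",
});
console.log(missing); // [ 'GOOGLE_API_KEY' ]
```

In app.js, a call such as `findMissingEnv()` before the models are created could throw with the list of missing names.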

Libraries required

pnpm install @langchain/community @langchain/core @langchain/google-genai @langchain/groq @langchain/qdrant langchain

Setup app.js file
First, we set up the environment, load models, and configure the connection to the vector store (Qdrant) and the embedding model.

import { ChatGroq } from "@langchain/groq";
import { GoogleGenerativeAIEmbeddings } from "@langchain/google-genai";
import { QdrantVectorStore } from "@langchain/qdrant";
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';


// --- LLM and Embedding setup ---

// LLM for final answer synthesis
const model = new ChatGroq({
  model: "openai/gpt-oss-20b",
});

// Embedding model for converting text to vector space
const embeddings = new GoogleGenerativeAIEmbeddings({
  model: "gemini-embedding-001",
});

// --- Vector store and retriever setup ---

// Connect to the existing Qdrant collection
const vectorStore = await QdrantVectorStore.fromExistingCollection(embeddings, {
  url: process.env.QDRANT_URL,
  collectionName: "genai-toolkit",
});

// Create a basic retriever instance
const retriever = vectorStore.asRetriever();

1. Parallel Query Decomposition

The complete source code is available on GitHub.

1. Prompt

templateDecomposition - the prompt instruction that tells the LLM to break a complex question down into 3 sub-questions.
promptDecomposition - wraps templateDecomposition in a ChatPromptTemplate so it can be used in a chain.

// --- Query-decomposition generation prompt ---
const templateDecomposition = `You are a helpful assistant that generates multiple sub-questions related to an input question.
The goal is to break down the input into a set of sub-problems / sub-questions that can be answered in isolation.
Generate multiple search queries related to: {question}
note: generate only queries
Output (only 3 queries):`;

const promptDecomposition = ChatPromptTemplate.fromTemplate(
  templateDecomposition,
);

2. Structured Output

Here we use querySchema to describe the output we want from the model.

We pass querySchema to the model through the withStructuredOutput method.

const querySchema = {
  title: "queries",
  type: "object",
  properties: {
    Query1: { type: "string", description: "Query1 of user input" },
    Query2: { type: "string", description: "Query2 of user input" },
    Query3: { type: "string", description: "Query3 of user input" },
  },
  required: ["Query1", "Query2", "Query3"],
};

const modelWithStruture = model.withStructuredOutput(querySchema, {
  method: "jsonSchema",
});

3. Query Generation Chain

The chain does the following:

promptDecomposition - takes the input question and fills the prompt.
modelWithStruture - runs the prompt through the model to get structured output.

(output) => Object.values(output) - converts the structured output into an array of values.

// --- Query Generation Chain ---
const generateQueriesChain = promptDecomposition
  .pipe(modelWithStruture)
  .pipe((output) => Object.values(output));
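
To make the last step concrete, here is a small sketch of what happens to a structured result. The sample object is made up for illustration and is not real model output, and `toQueryList` is a hypothetical helper that additionally checks the required keys from `querySchema` before returning the values.

```javascript
// Illustrative only: sampleOutput is made-up data shaped like querySchema.
const requiredKeys = ["Query1", "Query2", "Query3"];

// Hypothetical helper: verify the required keys, then return the values in key order.
function toQueryList(output, required = requiredKeys) {
  const missing = required.filter((key) => typeof output[key] !== "string");
  if (missing.length > 0) {
    throw new Error(`Structured output is missing: ${missing.join(", ")}`);
  }
  return required.map((key) => output[key].trim());
}

const sampleOutput = {
  Query1: "What is an AI generalist?",
  Query2: "Which skills does an AI generalist need?",
  Query3: "How does an AI generalist differ from a specialist?",
};

console.log(Object.values(sampleOutput).length); // 3
console.log(toQueryList(sampleOutput)[0]); // What is an AI generalist?
```

Plain `Object.values` is enough when the model always honors the schema. The extra check is only a defensive option.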

4. Prompt for Sub-question Answering

retrievedTemplate - the prompt instruction for answering a sub-question using the retrieved context.
retrievedPrompt - the ChatPromptTemplate built from retrievedTemplate, used in the chain.

// --- Retrieved sub-question answer generation prompt ---
const retrievedTemplate = `
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences.
Question: {question}
Context: {context}
Answer:
`;

const retrievedPrompt = ChatPromptTemplate.fromTemplate(retrievedTemplate);

5. Sub-question Retrieval and Answer Generation

1. Q&A Pair Formatting Utility. `formatQAPairs`

Formats each sub-question and its answer into a readable string, ready to be used as context for the final answer.

2. Sub-question Retrieval and Answer Generation. `retrieveAndRag`

retrieveAndRag is the main function. It does the following:

Decompose the main question into subQuestions using queryGeneration.
For each subQuestion:
1. Retrieve relevant context from the vector store.
2. Run a chain that feeds the sub-question and its context to the LLM to get an answer.
3. Collect the answer.
Format all Q&A pairs into a single context string using the formatQAPairs utility.

// --- Utility: context merging sub-question with answers ---
function formatQAPairs(questions, answers) {
  /**
   * Format Q and A pairs
   */
  let formattedString = "";
  for (let i = 0; i < questions.length; i++) {
    formattedString += `Question ${i + 1}: ${questions[i]}\nAnswer ${i + 1}: ${answers[i]}\n\n`;
  }
  return formattedString.trim();
}

// --- Utility: Sub-questions and their answers ---
async function retrieveAndRag(question, queryGeneration, retrievedPrompt) {
  /**
   * RAG on each sub-question
   */
  const subQuestions = await queryGeneration.invoke({ question: question });

  const ragResults = [];

  for (const subQuestion of subQuestions) {
    // Create a chain for each sub-question
    const chain = retrievedPrompt.pipe(model).pipe(new StringOutputParser());

    // Retrieve context for the sub-question
    const context = await retriever.invoke(subQuestion);

    // Format context
    const contextString = context.map((doc) => doc.pageContent).join("\n");

    const ans = await chain.invoke({
      question: subQuestion,
      context: contextString,
    });

    ragResults.push(ans);
  }

  const context = formatQAPairs(subQuestions, ragResults);
  return context;
}
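
The loop in retrieveAndRag awaits each sub-question before starting the next one, so the sub-questions are answered independently but one at a time. Because no answer depends on another, the calls can also be started together with `Promise.all`. The following is a minimal sketch under stated assumptions, not the article's implementation: `answerInParallel` is a hypothetical function, and `retrieveFn` and `answerFn` are stand-ins for `retriever.invoke` and `chain.invoke`, replaced here by stubs so the snippet runs without a model or a vector store.

```javascript
// Hypothetical sketch: answer independent sub-questions concurrently.
// retrieveFn(subQuestion) -> Promise<Array<{ pageContent: string }>>
// answerFn({ question, context }) -> Promise<string>
async function answerInParallel(subQuestions, retrieveFn, answerFn) {
  const answers = await Promise.all(
    subQuestions.map(async (subQuestion) => {
      const docs = await retrieveFn(subQuestion);
      const context = docs.map((doc) => doc.pageContent).join("\n");
      return answerFn({ question: subQuestion, context });
    }),
  );

  // Promise.all preserves input order, so answers[i] belongs to subQuestions[i].
  return subQuestions
    .map((q, i) => `Question ${i + 1}: ${q}\nAnswer ${i + 1}: ${answers[i]}`)
    .join("\n\n");
}

// Stubs standing in for the retriever and the LLM chain (assumptions).
const fakeRetrieve = async (q) => [{ pageContent: `notes about "${q}"` }];
const fakeAnswer = async ({ question, context }) =>
  `answered "${question}" using ${context.length} characters of context`;

answerInParallel(["What is RAG?", "What is Qdrant?"], fakeRetrieve, fakeAnswer)
  .then((formatted) => console.log(formatted))
  .catch(console.error);
```

Starting all sub-questions at once is usually faster, but it sends several requests to the model provider at the same time, so the sequential loop may be the safer choice under strict rate limits.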

6. Final Synthesis Prompt and Chain

prompt template - the prompt instruction that tells the LLM to answer the question using all the Q&A pairs as context.
finalRagChain - runs the prompt through the LLM and parses the output.
Execution - runs the final synthesis chain to get a concise, synthesized answer.

// --- Main execution ---
async function main() {
  const question = "What is an AI generalist?";
  const context = await retrieveAndRag(
    question,
    generateQueriesChain,
    retrievedPrompt,
  );

  // --- Prompt for final answer ---
  const template = `Here is a set of Q+A pairs:
    {context}
    Use these to synthesize an answer to the question: {question}
    `;

  const prompt = ChatPromptTemplate.fromTemplate(template);

  // --- Final RAG chain ---
  const finalRagChain = prompt.pipe(model).pipe(new StringOutputParser());

  // --- Run the chain ---
  const finalAns = await finalRagChain.invoke({
    context: context,
    question: question,
  });

  console.log("Final answer:", finalAns);
}

// Execute main function
main().catch(console.error);

2. Iterative Query Decomposition

The complete source code is available on GitHub.

1. Prompt

💡 This step is the same as in Parallel Query Decomposition.

templateDecomposition - the prompt instruction that tells the LLM to break a complex question down into 3 sub-questions.
promptDecomposition - wraps templateDecomposition in a ChatPromptTemplate so it can be used in a chain.

// --- Query-decomposition generation prompt ---
const templateDecomposition = `You are a helpful assistant that generates multiple sub-questions related to an input question.
The goal is to break down the input into a set of sub-problems / sub-questions that can be answered in isolation.
Generate multiple search queries related to: {question}
note: generate only queries
Output (only 3 queries):`;

const promptDecomposition = ChatPromptTemplate.fromTemplate(
  templateDecomposition,
);

2. Structured Output

Here we use querySchema to describe the output we want from the model.

We pass querySchema to the model through the withStructuredOutput method.

const querySchema = {
  title: "queries",
  type: "object",
  properties: {
    Query1: { type: "string", description: "Query1 of user input" },
    Query2: { type: "string", description: "Query2 of user input" },
    Query3: { type: "string", description: "Query3 of user input" },
  },
  required: ["Query1", "Query2", "Query3"],
};

const modelWithStruture = model.withStructuredOutput(querySchema, {
  method: "jsonSchema",
});

3. Query Generation Chain

💡 This step is also the same as in Parallel Query Decomposition.

The chain does the following:

promptDecomposition - takes the input question and fills the prompt.
modelWithStruture - runs the prompt through the model to get structured output.
(output) => Object.values(output) - converts the structured output into an array of values.

// --- Query Generation Chain ---
const queryGeneration = promptDecomposition
  .pipe(modelWithStruture)
  .pipe((output) => Object.values(output));

4. Prompt for Sub-question Answering

A prompt template that asks the LLM to answer a sub-question using:

The sub-question itself,
Any background Q&A pairs,
Additional retrieved context.

// --- Retrieved sub-question answer generation prompt ---
const retrievedTemplate = `Here is the question you need to answer:

--- 
{question}
--- 

Here is any available background question + answer pairs:

--- 
{q_a_pairs}
--- 

Here is additional context relevant to the question:

--- 
{context}
--- 

Use the above context and any background question + answer pairs to answer the question:
{question}`;

const retrievedPrompt = ChatPromptTemplate.fromTemplate(retrievedTemplate);

5. Sub-question Retrieval and Answer Generation

1. Q&A Pair Formatting Utility. `formatQAPair`

Formats a single sub-question and its answer into a readable string.

2. Sub-question Retrieval and Answer Generation. `retrieveAndRag`

retrieveAndRag is the main function. It does the following:

Decompose the main question into subQuestions using queryGeneration.
For each subQuestion:
1. Retrieve relevant context from the vector store.
2. Run a chain that feeds the sub-question, any background Q&A pairs, and the retrieved context to the LLM to get an answer.
3. Format the Q&A pair and accumulate it.
Return all Q&A pairs as a single formatted string.

// --- Utility: format Q and A pair ---
function formatQAPair(question, answer) {
  /**Format Q and A pair */
  return `Question: ${question}\nAnswer: ${answer}`;
}

// --- Sub-questions are processed one after another: retrieve documents -> generate response ---
async function retrieveAndRag(question, queryGeneration, retrievedPrompt) {
  /**RAG on each sub-question */
  const subQuestions = await queryGeneration.invoke({ question: question });

  let qAPairs = "";

  for (const subQuestion of subQuestions) {
    const chain = retrievedPrompt.pipe(model).pipe(new StringOutputParser());

    const context = await retriever.invoke(subQuestion);
    const contextString = context.map((doc) => doc.pageContent).join("\n\n");

    const ans = await chain.invoke({
      question: subQuestion,
      // Pass the Q&A pairs accumulated so far as background for this sub-question
      q_a_pairs: qAPairs,
      context: contextString,
    });

    const qAPair = formatQAPair(subQuestion, ans);
    qAPairs = qAPairs + "\n-----\n" + qAPair;
  }

  return qAPairs;
}
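
What makes the decomposition iterative is that each answer is appended to a running set of Q&A pairs, and that set is handed to the next sub-question as background. Below is a minimal sketch of the accumulation pattern only. It assumes a hypothetical `answerFn` stub in place of retrieval and the LLM chain, so it runs without any external service, and `answerIteratively` is an illustrative name rather than a function from the article.

```javascript
// Hypothetical sketch of the accumulation pattern only.
// answerFn({ question, qaPairs }) -> Promise<string> stands in for retrieval + LLM.
async function answerIteratively(subQuestions, answerFn) {
  let qaPairs = "";
  for (const question of subQuestions) {
    // The pairs collected so far are visible to the current sub-question.
    const answer = await answerFn({ question, qaPairs });
    qaPairs += `\n-----\nQuestion: ${question}\nAnswer: ${answer}`;
  }
  return qaPairs;
}

// Stub that reports how many earlier answers it was given (assumption for the demo).
const countingAnswer = async ({ qaPairs }) => {
  const seen = (qaPairs.match(/Answer:/g) || []).length;
  return `saw ${seen} earlier answer(s)`;
};

answerIteratively(["Q1", "Q2", "Q3"], countingAnswer)
  .then((pairs) => console.log(pairs))
  .catch(console.error);
```

Unlike the parallel variant, these calls cannot be started together, because each sub-question waits for the answers that came before it.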

6. Final Synthesis Prompt and Chain

prompt template - the prompt instruction that tells the LLM to answer the question using all the Q&A pairs as context.
finalChain - runs the prompt through the LLM and parses the output.
Execution - runs the final synthesis chain to get a concise, synthesized answer.

// --- Main execution ---
async function main() {
  const question = "What is an AI generalist?";

  // Retrieve and generate context
  const context = await retrieveAndRag(
    question,
    queryGeneration,
    retrievedPrompt,
  );

  // --- Prompt for final answer ---
  const template = `Here is a set of Q+A pairs:
{context}
Use these to synthesize an answer to the question: {question}`;

  const prompt = ChatPromptTemplate.fromTemplate(template);

  // --- Final RAG chain ---
  const finalChain = prompt.pipe(model).pipe(new StringOutputParser());

  // --- Run the chain ---
  const finalAns = await finalChain.invoke({
    context: context,
    question: question,
  });

  console.log("Final answer:", finalAns);
}

// Run the main function
main().catch(console.error);

Resources

1. GitHub code

2. Articles

What is Query Decomposition

Learned something? Hit the ❤️ to say “thanks!” and help others discover this article.

Check out my blog for more GenAI content.
