Simple MapReduceDocumentsChain with token_max & collapse_documents_chain

Learn how to use a custom collapse chain to save tokens.

Whether you’re summarising customer reviews, extracting insights from lengthy reports, or analysing research papers, handling document processing at scale requires thoughtful approaches to manage context limitations. This is where LangChain’s MapReduceDocumentsChain shines as a powerful solution.

Full code snippet at the end 🚀

Understanding the MapReduce Pattern in LangChain

If you’ve worked with large language models (LLMs), you’re likely familiar with their context window limitations. These models can only process a certain number of tokens at once, creating challenges when dealing with extensive document collections. The MapReduceDocumentsChain in LangChain elegantly addresses this constraint by implementing the classic MapReduce pattern from distributed computing.

But how exactly does this work? Let’s break it down:

  1. Map Phase: Each document is processed individually by an LLM, creating intermediate outputs
  2. Reduce Phase: These intermediate outputs are combined (potentially in multiple stages) to produce a final result

This approach allows us to process documents that collectively would exceed the context window of our LLM, making it an essential tool in any AI developer’s toolkit.

For a comprehensive guide to cost and token limits, see Cost of Top 10 LLMs – Compare & Find the Best Budget Option.

A Real-World Example: Analysing Customer Reviews

To demonstrate the power of LangChain’s MapReduceDocumentsChain, I’ll walk through a practical example analysing smartphone customer reviews using Google’s Gemini model. This example showcases not only basic MapReduce functionality but also the critical token_max parameter and collapse chain mechanisms that prevent context window errors.

Let’s look at the implementation step by step:

(Diagram: MapReduceDocumentsChain)

Setting Up the Environment

First, we need to import the necessary libraries and set up our LLM:

You can get your Google Gemini API key for free from https://aistudio.google.com/apikey
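Here is a minimal setup sketch. The model name (gemini-1.5-flash), the environment-variable key handling, and temperature=0 are assumptions; use whichever Gemini model and key management you prefer:

```python
import os

# Legacy document-combining chains shipped with the langchain package
from langchain.chains import (
    LLMChain,
    MapReduceDocumentsChain,
    ReduceDocumentsChain,
    StuffDocumentsChain,
)
from langchain_core.documents import Document
from langchain_core.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI

# Assumption: the key is supplied via the GOOGLE_API_KEY environment variable
os.environ["GOOGLE_API_KEY"] = "your-api-key-here"

# Assumption: any Gemini chat model works; gemini-1.5-flash keeps costs low
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)
```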

Next, we prepare our sample customer reviews:
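The reviews below are hypothetical sample data; any list of strings wrapped in Document objects will do, since the map phase treats each Document as one unit of work:

```python
# Hypothetical smartphone reviews; in practice these would come from your data source
reviews = [
    "The battery life is fantastic and the camera takes stunning photos, "
    "but the phone heats up during gaming.",
    "Great display and smooth performance. However, the speakers are weak "
    "and the phone feels slippery without a case.",
    "Charging is super fast and the design is premium, although the device "
    "is a bit heavy and the price is on the higher side.",
    "The camera struggles in low light and the UI has occasional lags, "
    "but overall it's good value for money.",
]

# Wrap each review in a Document so the chain can map over them individually
docs = [Document(page_content=review) for review in reviews]
```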

Configuring the Map Chain

The Map phase processes each document individually. For our review analysis, we’ll extract positive and negative aspects from each review. Here, each review is treated as a separate document:
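A sketch of the map chain, reusing the llm and imports from the setup above; the prompt wording and the review variable name are illustrative choices:

```python
# Map prompt: extract positives and negatives from a single review
map_prompt = PromptTemplate.from_template(
    "You are analysing a smartphone customer review.\n"
    "List the positive and negative aspects mentioned in this review:\n\n"
    "{review}"
)

# One LLM call per Document during the map phase
map_chain = LLMChain(llm=llm, prompt=map_prompt)
```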

Setting Up the Reduce and Collapse Chains

Here’s where things get interesting! The Reduce chain combines the mapped outputs, while the Collapse chain serves as a fallback mechanism when token limits are exceeded:
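One way to wire this up, building on the llm defined earlier. The prompt texts, the doc_summaries variable name, and token_max=200 are illustrative rather than prescriptive:

```python
# Combine (reduce) prompt: merge all per-review findings into one summary
reduce_prompt = PromptTemplate.from_template(
    "Below are positive and negative points extracted from individual "
    "customer reviews:\n\n{doc_summaries}\n\n"
    "Combine them into a single summary of overall customer sentiment."
)
combine_documents_chain = StuffDocumentsChain(
    llm_chain=LLMChain(llm=llm, prompt=reduce_prompt),
    document_variable_name="doc_summaries",
)

# Collapse prompt: aggressively shorten intermediate results when a group
# of mapped summaries would not fit within token_max
collapse_prompt = PromptTemplate.from_template(
    "Condense the following review summaries into a much shorter summary, "
    "keeping only the most important positive and negative points:\n\n"
    "{doc_summaries}"
)
collapse_documents_chain = StuffDocumentsChain(
    llm_chain=LLMChain(llm=llm, prompt=collapse_prompt),
    document_variable_name="doc_summaries",
)

# token_max=200 is deliberately small so the collapse behaviour is easy to observe
reduce_documents_chain = ReduceDocumentsChain(
    combine_documents_chain=combine_documents_chain,
    collapse_documents_chain=collapse_documents_chain,
    token_max=200,
)
```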

The Power of token_max in Managing Context Windows

Now, let’s focus on one of the most critical parameters in the ReduceDocumentsChain: token_max. This parameter is fundamental to preventing context window errors when working with large document collections.

The token_max parameter specifies the maximum number of tokens to include in each group before the reduce operation. In our example, we’ve set it to 200 tokens, which means:

  1. Process each customer review individually using the map chain.
  2. Group the mapped results according to the token_max parameter. If all the individual review summaries fit into a single group (their combined token count is below token_max), the combine chain is called for the final result; note that the collapse chain is never triggered in this case.
  3. If grouping all the mapped summaries together would exceed token_max, the results are split into smaller groups, each of which respects token_max.
  4. Each group is then summarised using the collapse chain. Once done, the flow repeats from Step #2.
  5. Eventually, the collapsed summaries fit into a single group whose token count is below token_max, and the combine chain is called for the final result. The LangChain documentation calls this “recursive collapsing”: collapsing until everything fits within the token_max window.

This approach is crucial when dealing with large document collections or when generating verbose intermediate results. Without token_max, you might encounter:

  • Context window errors: When intermediate results collectively exceed the LLM’s token limit
  • Processing failures: When the system attempts to process too much text at once
  • Performance degradation: Due to inefficient use of the LLM’s context window

Setting an appropriate token_max value requires balancing processing efficiency (fewer reduce operations) against context window constraints. For most applications, starting with a value of 25-50% of your LLM’s context window is recommended, then adjusting based on the verbosity of your intermediate results.
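If you’re unsure how verbose your intermediate results are, a quick check like the sketch below (reusing map_chain and reviews from earlier) can help you pick a sensible value; get_num_tokens returns the model’s token estimate for a string:

```python
# Estimate the token footprint of one mapped summary to inform token_max
sample_summary = map_chain.invoke({"review": reviews[0]})["text"]
print("Tokens in one mapped summary:", llm.get_num_tokens(sample_summary))
```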

The Collapse Chain: Your Safety Net Against Context Overflows

While token_max helps manage the initial grouping of documents, what happens when even the reduced results collectively exceed your model’s context window? This is where the collapse chain comes into play.

The collapse chain is essentially a fallback mechanism that further condenses information when necessary. In our example, we’ve configured it to create more concise (shorter, fewer-token) summaries while preserving the essential information, as the collapse prompt shown earlier illustrates.

Benefits of implementing a collapse chain include:

  1. Graceful degradation: Instead of failing when document collections are too large, the system continues processing by condensing information
  2. Adaptability to varying document sizes: Your pipeline can handle both small and large document sets without modification
  3. Preservation of critical information: With a well-designed collapse prompt, you ensure that key insights are retained even with extreme compression

Think of the collapse chain as your insurance policy against context window errors. Even if your initial estimates for token_max prove insufficient, the collapse chain ensures your processing pipeline remains robust.

Putting It All Together: The Complete MapReduceDocumentsChain

Now that we understand the individual components, let’s see how they come together in the final MapReduceDocumentsChain:
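A sketch of the final assembly, reusing map_chain and reduce_documents_chain from the previous snippets; document_variable_name must match the variable used in the map prompt:

```python
map_reduce_chain = MapReduceDocumentsChain(
    llm_chain=map_chain,                            # map phase
    reduce_documents_chain=reduce_documents_chain,  # reduce (+ collapse) phase
    document_variable_name="review",                # variable in the map prompt
)

# "input_documents" is the chain's expected input key; the summary comes back
# under "output_text"
result = map_reduce_chain.invoke({"input_documents": docs})
print(result["output_text"])
```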

When executed, this chain will:

  1. Process each customer review individually using the map chain
  2. Group the mapped results according to the token_max parameter
  3. Reduce each group using the reduce chain
  4. If necessary, use the collapse chain to further condense results (when grouping all the mapped summaries would exceed token_max)
  5. Produce a final summary of customer sentiment

Is collapse_documents_chain optional?

Yes. If you don’t specify it in the ReduceDocumentsChain, the collapse step is handled by your combine_documents_chain itself, as noted in LangChain’s documentation.
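A minimal sketch of that case, with the parameter simply omitted so the combine chain does double duty:

```python
# No collapse_documents_chain: the combine chain is reused for collapsing
reduce_documents_chain = ReduceDocumentsChain(
    combine_documents_chain=combine_documents_chain,
    token_max=200,
)
```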

Best Practices for Working with MapReduceDocumentsChain

After implementing numerous document processing pipelines with LangChain’s MapReduceDocumentsChain, I’ve developed several best practices that might help you:

1. Carefully Tune Your token_max Parameter

The token_max parameter is crucial for preventing context window errors. Consider these guidelines:

  • Start with a conservative value (25-50% of your LLM’s context window)
  • Account for the verbosity of your intermediate results
  • Test with representative document samples to find the optimal value
  • Monitor token usage during processing to identify potential bottlenecks

Remember that setting token_max too low leads to excessive reduce operations, while setting it too high risks context window errors.

2. Design Effective Map and Reduce Prompts

Your prompts significantly impact the quality and efficiency of your MapReduce process:

  • Map prompts should extract only the information needed for your final output
  • Reduce prompts should effectively consolidate information without excessive verbosity
  • Collapse prompts should preserve essential information while aggressively reducing token count

Consider the information flow across your entire pipeline when designing these prompts.

3. Implement Robust Error Handling

Despite our best efforts with token_max and collapse chains, errors can still occur:

  • Wrap your MapReduce execution in appropriate try-except blocks (see the sketch after this list)
  • Implement fallback mechanisms for particularly challenging document sets
  • Consider document preprocessing (e.g., chunking very large documents) before entering the MapReduce pipeline
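A minimal sketch of such a guard, assuming the map_reduce_chain and docs from earlier; the fallback in the except block is only a placeholder for your own recovery logic:

```python
try:
    result = map_reduce_chain.invoke({"input_documents": docs})
    print(result["output_text"])
except Exception as exc:
    # Placeholder fallback: log the failure, then retry with fewer or
    # smaller (re-chunked) documents
    print(f"MapReduce run failed: {exc}")
```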

4. Monitor Performance and Adjust as Needed

Document processing at scale requires ongoing monitoring:

  • Track token usage across different stages of your pipeline
  • Identify processing bottlenecks and adjust parameters accordingly
  • Consider parallel processing for the map phase when working with very large document collections

Here is the full code snippet; enjoy!
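Below is a consolidated sketch of everything covered above. The model name, review texts, prompt wording, and token_max value are illustrative assumptions; adapt them to your own data and model:

```python
import os

from langchain.chains import (
    LLMChain,
    MapReduceDocumentsChain,
    ReduceDocumentsChain,
    StuffDocumentsChain,
)
from langchain_core.documents import Document
from langchain_core.prompts import PromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI

# --- LLM setup (get a free key at https://aistudio.google.com/apikey) ---
os.environ["GOOGLE_API_KEY"] = "your-api-key-here"
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)

# --- Sample customer reviews, one Document per review ---
reviews = [
    "The battery life is fantastic and the camera takes stunning photos, "
    "but the phone heats up during gaming.",
    "Great display and smooth performance. However, the speakers are weak "
    "and the phone feels slippery without a case.",
    "Charging is super fast and the design is premium, although the device "
    "is a bit heavy and the price is on the higher side.",
    "The camera struggles in low light and the UI has occasional lags, "
    "but overall it's good value for money.",
]
docs = [Document(page_content=review) for review in reviews]

# --- Map chain: analyse each review individually ---
map_prompt = PromptTemplate.from_template(
    "List the positive and negative aspects mentioned in this smartphone "
    "review:\n\n{review}"
)
map_chain = LLMChain(llm=llm, prompt=map_prompt)

# --- Combine (reduce) chain: merge the mapped summaries ---
reduce_prompt = PromptTemplate.from_template(
    "Below are positive and negative points extracted from individual "
    "customer reviews:\n\n{doc_summaries}\n\n"
    "Combine them into a single summary of overall customer sentiment."
)
combine_documents_chain = StuffDocumentsChain(
    llm_chain=LLMChain(llm=llm, prompt=reduce_prompt),
    document_variable_name="doc_summaries",
)

# --- Collapse chain: shrink groups that exceed token_max ---
collapse_prompt = PromptTemplate.from_template(
    "Condense the following review summaries into a much shorter summary, "
    "keeping only the most important positive and negative points:\n\n"
    "{doc_summaries}"
)
collapse_documents_chain = StuffDocumentsChain(
    llm_chain=LLMChain(llm=llm, prompt=collapse_prompt),
    document_variable_name="doc_summaries",
)

# --- Reduce chain with token_max and the collapse fallback ---
reduce_documents_chain = ReduceDocumentsChain(
    combine_documents_chain=combine_documents_chain,
    collapse_documents_chain=collapse_documents_chain,
    token_max=200,
)

# --- The complete MapReduceDocumentsChain ---
map_reduce_chain = MapReduceDocumentsChain(
    llm_chain=map_chain,
    reduce_documents_chain=reduce_documents_chain,
    document_variable_name="review",
)

result = map_reduce_chain.invoke({"input_documents": docs})
print(result["output_text"])
```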
