Recursive Language Models
Recursive language models (RLMs) have been all the rage lately.
I love this paper. It’s such a clever idea: turn the input prompt into a Python variable that the model can slice, so it can recursively call itself on just the relevant chunks of a long input. Why do we need this?
LLMs have a fixed context window, which limits how much text they can process directly at once, and very long sequences degrade performance (“context rot”). The current state of the art, as seen in Claude Code and Codex, is to “compact” the context once it approaches the model’s context window: the compact operation essentially summarizes everything up to that point. That compression inevitably loses information, which can be downright frustrating, especially if you have subagents working from a detailed plan you just spent 30 minutes polishing and perfecting.
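The RLM approach keeps the full context intact instead of summarizing it away. Here is a minimal toy sketch of the recursion shape, assuming a hypothetical `llm()` single-call API (the real paper and skill are far more sophisticated, and let the model itself decide how to slice):

```python
# Toy sketch of the RLM idea: the long context lives in a Python
# variable; when it exceeds the model's window, split it and recurse,
# then recurse once more over the partial answers.
# `llm()` is a hypothetical stand-in for a real model call.

def llm(prompt: str) -> str:
    # Fake "model": answer by returning the first line of the
    # context that mentions the query keyword.
    query, _, context = prompt.partition("\n")
    hits = [line for line in context.splitlines() if query in line]
    return hits[0] if hits else ""

def rlm(query: str, context: str, window: int = 1_000) -> str:
    # Base case: the context fits in the window -> one direct call.
    if len(context) <= window:
        return llm(query + "\n" + context)
    # Recursive case: split on a line boundary, recurse on each half,
    # then recurse once more over the combined partial answers.
    lines = context.splitlines()
    mid = len(lines) // 2
    partials = [rlm(query, "\n".join(lines[:mid]), window),
                rlm(query, "\n".join(lines[mid:]), window)]
    return rlm(query, "\n".join(p for p in partials if p), window)
```

The fixed halving here is just to show the recursion; in an actual RLM the model writes the slicing code and the sub-queries itself, so no single call ever sees more than a window’s worth of text.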
brainqub3/claude_code_RLM exposes RLM to Claude Code users via the /rlm Model Context Protocol tool. Once launched, you will be prompted for the long-context document you want to apply RLM to and for your query. I typically use the form:
context=/path/to/my/gigantic/file
query=Find the needle in the haystack!
What does this capability unlock? It’s a lifesaver for implementing complex features whose plans are too large to fit in context.
I recently built a photography portfolio website, https://developedforgood.com, with the backend in Rust and the frontend in React. I’m a relative novice at Rust, and this was a complex task: it required integrating S3, Stripe, pledge.to, React, and TailwindCSS Plus to make the app responsive across mobile, desktop, and tablet browsers. The project touched UI/UX design, state management, security considerations, and code optimization to get the site running on my tiny $4.50/month fly.io instance.
I went back and forth with Claude for 45 minutes producing a comprehensive plan that ended up being 98 KB on disk 😭 - way too big for Opus 4.5’s context window. Enter claude_code_RLM!
I pointed the tool at the 98 KB plan and told it to spin up a claude-flow swarm to burn down the massive todo list in the plan using test-driven development. Two hours and a few manual interventions (to get the CI/CD pipeline passing) later, voilà: one portfolio website to display my photos (feel free to buy one 😉). Not mentioned is the extra three weeks it took to go from 90% → 99% - pretty typical in software development, especially full-stack.
The /rlm tool is an invaluable addition to the toolkit when your design plans are just too large for the default context window (most things worth building are). Combined with an agentic swarm and the Happy app, I can scheme up elaborate designs and have agents knock them out without being tethered to my laptop.
The limit is now what you are capable of imagining.
$200/month for Claude Max is such a steal once you realize this.
Edit 2026-02-21:
I can’t confirm it, but I am fairly certain that Opus 4.6 integrates RLM directly into the model. The new million-token context window and the claim that “Opus 4.6 performs markedly better than its predecessors: on the 8-needle 1M variant of MRCR v2—a needle-in-a-haystack benchmark that tests a model’s ability to retrieve information ‘hidden’ in vast amounts of text—Opus 4.6 scores 76%, whereas Sonnet 4.5 scores just 18.5%” are a dead giveaway that RLM is in use. The skill is still a useful reference for users of Cline.