caveman-plus¶
Token-saving output mode for Claude Code and 30+ other AI coding agents: it makes the agent answer in compressed "caveman" prose, cutting roughly 75% of output tokens while keeping full technical accuracy.
Install¶
Install as a Claude Code plugin:
claude plugin marketplace add 88plug/caveman-plus
claude plugin install caveman-plus@caveman-plus
Or install for every supported agent with one script:
# macOS / Linux / WSL / Git Bash
curl -fsSL https://raw.githubusercontent.com/88plug/caveman-plus/main/install.sh | bash
# Windows (PowerShell 5.1+)
irm https://raw.githubusercontent.com/88plug/caveman-plus/main/install.ps1 | iex
The script needs Node 18 or newer, takes about 30 seconds, skips agents you do not have, and is safe to re-run. Full matrix and per-agent flags are in INSTALL.md.
Quickstart¶
After install, turn it on in any session:
- Type
/caveman(or say "talk like caveman"). - Ask a question. The reply comes back compressed.
- Turn it off with "normal mode" or "stop caveman".
The result is visible in the first reply. A verbose answer like this:
The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle. When you pass an inline object as a prop, React's shallow comparison sees it as a different object every time, which triggers a re-render. I'd recommend using useMemo to memoize the object.
becomes this, with the same fix:
New object ref each render. Inline object prop = new ref = re-render. Wrap in
useMemo.
Same answer, far fewer tokens.
Note
caveman-plus changes output tokens only. Reasoning and thinking tokens are untouched, so the model is not made less capable. The main wins are readability, speed, and lower output cost.
What it does¶
caveman-plus installs a skill that instructs the agent to drop filler, pleasantries, hedging, and articles while keeping every piece of technical substance: code blocks, function names, API names, paths, and error strings stay byte-exact. You get the same correctness in a fraction of the words.
This is the 88plug edition. It ships full-plus as the default intensity, an English-only mode tuned for artifact-first replies (newest ask only, one default path, smallest usable artifact, plain prose for explainers, direct root-cause-plus-fix for bugs).
It is a fork of JuliusBrussee/caveman. Credit for the original caveman goes to Julius Brussee.
Features¶
- Compresses every reply until you turn it off; the chosen intensity persists for the session.
- Seven intensity levels, switchable with one command.
- Slash commands for commits, PR review, session stats, and memory-file compression.
- Statusline badge in Claude Code showing lifetime tokens saved.
- Caveman subagents that keep your main context window lasting longer.
- MCP middleware that compresses tool descriptions from any MCP server.
- Works across Claude Code, Codex, Gemini, Cursor, Windsurf, Cline, Copilot, and 30+ more agents.
Intensity levels¶
Switch with /caveman <level>. Levels stick until the session ends.
| Level | What it does |
|---|---|
lite |
Drop filler and hedging; keep articles and full sentences. Professional but tight. |
full |
Drop articles, allow fragments, use short synonyms. Classic caveman. |
full-plus |
English-only full tuned for artifact-first replies. 88plug edition default. |
ultra |
Abbreviate prose words, strip conjunctions, use arrows for causality, one word where one word works. Code and error strings never abbreviated. |
wenyan-lite |
Semi-classical Chinese register; keep grammar, drop filler. |
wenyan-full |
Maximum classical Chinese terseness (文言文), 80-90% character reduction. |
wenyan-ultra |
Extreme classical-Chinese abbreviation. Maximum compression. |
Auto-activation is built in for Claude Code, Codex, and Gemini. Cursor, Windsurf, Cline, and Copilot get always-on rule files via --with-init. Other agents activate per session with /caveman.
Commands and skills¶
| Command / skill | What it does |
|---|---|
/caveman [level] |
Compress every reply at the chosen intensity level. |
/caveman-commit |
Conventional Commit messages, 50-char subject, why over what. |
/caveman-review |
One-line PR comments, for example L42: bug: user null. Add guard. |
/caveman-stats |
Session token usage, lifetime savings, and USD. --share prints a one-liner. |
/caveman-compress <file> |
Rewrite a memory file (e.g. CLAUDE.md) into caveman-speak; saves input tokens every session. Code, URLs, and paths preserved byte-exact. |
caveman-shrink |
MCP middleware that wraps any MCP server and compresses tool descriptions. See npm. |
cavecrew-* |
Caveman subagents (investigator, builder, reviewer) that use fewer tokens than the defaults. |
Statusline badge
In Claude Code, the statusline shows a badge like `[CAVEMAN] 12.4k` for lifetime tokens saved. It updates on each `/caveman-stats` run. Set `CAVEMAN_STATUSLINE_SAVINGS=0` to silence it.How it works¶
- Install drops a skill file into each agent.
- The skill tells the agent to drop filler, keep substance, and use fragments.
- In Claude Code, a SessionStart hook writes a small flag file so the agent talks caveman from the first message, with no need to type
/caveman. - The stats command reads the Claude Code session log, counts tokens saved, and writes the number to the statusline.
- The
caveman-compresssub-skill rewrites memory files so each session starts with a smaller context, saving tokens on every session rather than one reply.
Maintainer detail (hook architecture, file ownership, CI sync) lives in CLAUDE.md.
Benchmarks¶
Real token counts from the Claude API, averaging about 65% output reduction across 10 prompts (range 22-87%).
| Task | Saved |
|---|---|
| Explain React re-render bug | 87% |
| Fix auth middleware token expiry | 83% |
| Set up PostgreSQL connection pool | 84% |
| Implement React error boundary | 87% |
| Average (10 prompts) | 65% |
Raw data and reproduction script are in benchmarks/. A three-arm eval harness (baseline / terse / skill) lives in evals/; caveman is compared against an explicit "Answer concisely." instruction, not the verbose default, so the delta is honest.
Tip
caveman-compress cuts about 46% of input tokens from memory files on average, and that saving repeats every session. See receipts in benchmarks/.
Documentation¶
- INSTALL.md — full install matrix, all flags, per-agent detail
- CLAUDE.md — maintainer guide (file ownership, hook architecture, CI)
- Windows install — extra guides, including Windows install
- Issues — bugs, features, questions
Contributing¶
Patches are welcome. See CONTRIBUTING.md for how to send one.
License¶
FSL-1.1-ALv2. Fork of JuliusBrussee/caveman; original by Julius Brussee.