Skip to content

caveman-plus

Token-saving output mode for Claude Code and 30+ other AI coding agents: it makes the agent answer in compressed "caveman" prose, cutting roughly 75% of output tokens while keeping full technical accuracy.

plugin-validate License: FSL-1.1-ALv2 Claude Code plugin

Install

Install as a Claude Code plugin:

claude plugin marketplace add 88plug/caveman-plus
claude plugin install caveman-plus@caveman-plus

Or install for every supported agent with one script:

# macOS / Linux / WSL / Git Bash
curl -fsSL https://raw.githubusercontent.com/88plug/caveman-plus/main/install.sh | bash
# Windows (PowerShell 5.1+)
irm https://raw.githubusercontent.com/88plug/caveman-plus/main/install.ps1 | iex

The script needs Node 18 or newer, takes about 30 seconds, skips agents you do not have, and is safe to re-run. Full matrix and per-agent flags are in INSTALL.md.

Quickstart

After install, turn it on in any session:

  • Type /caveman (or say "talk like caveman").
  • Ask a question. The reply comes back compressed.
  • Turn it off with "normal mode" or "stop caveman".

The result is visible in the first reply. A verbose answer like this:

The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle. When you pass an inline object as a prop, React's shallow comparison sees it as a different object every time, which triggers a re-render. I'd recommend using useMemo to memoize the object.

becomes this, with the same fix:

New object ref each render. Inline object prop = new ref = re-render. Wrap in useMemo.

Same answer, far fewer tokens.

Note

caveman-plus changes output tokens only. Reasoning and thinking tokens are untouched, so the model is not made less capable. The main wins are readability, speed, and lower output cost.

What it does

caveman-plus installs a skill that instructs the agent to drop filler, pleasantries, hedging, and articles while keeping every piece of technical substance: code blocks, function names, API names, paths, and error strings stay byte-exact. You get the same correctness in a fraction of the words.

This is the 88plug edition. It ships full-plus as the default intensity, an English-only mode tuned for artifact-first replies (newest ask only, one default path, smallest usable artifact, plain prose for explainers, direct root-cause-plus-fix for bugs).

It is a fork of JuliusBrussee/caveman. Credit for the original caveman goes to Julius Brussee.

Features

  • Compresses every reply until you turn it off; the chosen intensity persists for the session.
  • Seven intensity levels, switchable with one command.
  • Slash commands for commits, PR review, session stats, and memory-file compression.
  • Statusline badge in Claude Code showing lifetime tokens saved.
  • Caveman subagents that keep your main context window lasting longer.
  • MCP middleware that compresses tool descriptions from any MCP server.
  • Works across Claude Code, Codex, Gemini, Cursor, Windsurf, Cline, Copilot, and 30+ more agents.

Intensity levels

Switch with /caveman <level>. Levels stick until the session ends.

Level What it does
lite Drop filler and hedging; keep articles and full sentences. Professional but tight.
full Drop articles, allow fragments, use short synonyms. Classic caveman.
full-plus English-only full tuned for artifact-first replies. 88plug edition default.
ultra Abbreviate prose words, strip conjunctions, use arrows for causality, one word where one word works. Code and error strings never abbreviated.
wenyan-lite Semi-classical Chinese register; keep grammar, drop filler.
wenyan-full Maximum classical Chinese terseness (文言文), 80-90% character reduction.
wenyan-ultra Extreme classical-Chinese abbreviation. Maximum compression.

Auto-activation is built in for Claude Code, Codex, and Gemini. Cursor, Windsurf, Cline, and Copilot get always-on rule files via --with-init. Other agents activate per session with /caveman.

Commands and skills

Command / skill What it does
/caveman [level] Compress every reply at the chosen intensity level.
/caveman-commit Conventional Commit messages, 50-char subject, why over what.
/caveman-review One-line PR comments, for example L42: bug: user null. Add guard.
/caveman-stats Session token usage, lifetime savings, and USD. --share prints a one-liner.
/caveman-compress <file> Rewrite a memory file (e.g. CLAUDE.md) into caveman-speak; saves input tokens every session. Code, URLs, and paths preserved byte-exact.
caveman-shrink MCP middleware that wraps any MCP server and compresses tool descriptions. See npm.
cavecrew-* Caveman subagents (investigator, builder, reviewer) that use fewer tokens than the defaults.
Statusline badge In Claude Code, the statusline shows a badge like `[CAVEMAN] 12.4k` for lifetime tokens saved. It updates on each `/caveman-stats` run. Set `CAVEMAN_STATUSLINE_SAVINGS=0` to silence it.

How it works

  • Install drops a skill file into each agent.
  • The skill tells the agent to drop filler, keep substance, and use fragments.
  • In Claude Code, a SessionStart hook writes a small flag file so the agent talks caveman from the first message, with no need to type /caveman.
  • The stats command reads the Claude Code session log, counts tokens saved, and writes the number to the statusline.
  • The caveman-compress sub-skill rewrites memory files so each session starts with a smaller context, saving tokens on every session rather than one reply.

Maintainer detail (hook architecture, file ownership, CI sync) lives in CLAUDE.md.

Benchmarks

Real token counts from the Claude API, averaging about 65% output reduction across 10 prompts (range 22-87%).

Task Saved
Explain React re-render bug 87%
Fix auth middleware token expiry 83%
Set up PostgreSQL connection pool 84%
Implement React error boundary 87%
Average (10 prompts) 65%

Raw data and reproduction script are in benchmarks/. A three-arm eval harness (baseline / terse / skill) lives in evals/; caveman is compared against an explicit "Answer concisely." instruction, not the verbose default, so the delta is honest.

Tip

caveman-compress cuts about 46% of input tokens from memory files on average, and that saving repeats every session. See receipts in benchmarks/.

Documentation

  • INSTALL.md — full install matrix, all flags, per-agent detail
  • CLAUDE.md — maintainer guide (file ownership, hook architecture, CI)
  • Windows install — extra guides, including Windows install
  • Issues — bugs, features, questions

Contributing

Patches are welcome. See CONTRIBUTING.md for how to send one.

License

FSL-1.1-ALv2. Fork of JuliusBrussee/caveman; original by Julius Brussee.