003 · 2026 · Author
tanuki-slice
Reviewing a 400-comment MR with an LLM can easily overrun the model's context window. tanuki-slice fetches the MR's discussions via the GitLab API, groups comments by file, and splits them into token-budgeted chunks, so each chunk fits the model and can be processed in isolation.
Runtime
Python · uv
Surface
CLI + library
Source
public
Overview
Fetches merge request discussions from the GitLab API, groups comments and replies by file, and splits them into token-budgeted chunks that preserve both file locality and thread context. The output feeds LLM and agentic workflows without any single chunk overrunning the model's context window. Exposes a CLI for quick inspection and an importable library for pipelines. Works against gitlab.com or self-hosted GitLab.
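As a rough illustration of the fetch-and-group step, here is a minimal sketch using only the standard library. It is not tanuki-slice's actual code: the function names (`discussions_url`, `fetch_discussions`, `group_by_file`) are hypothetical, and the empty-string key for comments that are not attached to a file is an assumption. The endpoint, the `PRIVATE-TOKEN` header, and the `X-Next-Page` pagination header come from GitLab's REST API.

```python
import json
import urllib.parse
import urllib.request
from collections import defaultdict


def discussions_url(base_url: str, project: str, mr_iid: int, page: int = 1) -> str:
    """Build the paginated discussions endpoint for one merge request."""
    # A "group/repo" path must be URL-encoded when used as the project id.
    project_id = urllib.parse.quote(project, safe="")
    query = urllib.parse.urlencode({"per_page": 100, "page": page})
    return (
        f"{base_url.rstrip('/')}/api/v4/projects/{project_id}"
        f"/merge_requests/{mr_iid}/discussions?{query}"
    )


def fetch_discussions(base_url: str, project: str, mr_iid: int, token: str) -> list:
    """Fetch every page of discussions (hypothetical helper, stdlib only)."""
    discussions = []
    page = "1"
    while page:
        request = urllib.request.Request(
            discussions_url(base_url, project, mr_iid, int(page)),
            headers={"PRIVATE-TOKEN": token},
        )
        with urllib.request.urlopen(request, timeout=30) as response:
            discussions.extend(json.load(response))
            # GitLab leaves this header empty on the last page.
            page = response.headers.get("X-Next-Page", "")
    return discussions


def group_by_file(discussions: list) -> dict:
    """Map file path -> discussions; non-file comments use "" (an assumption)."""
    grouped = defaultdict(list)
    for discussion in discussions:
        notes = discussion.get("notes", [])
        # Diff comments carry a position; general comments do not.
        position = (notes[0].get("position") or {}) if notes else {}
        path = position.get("new_path") or position.get("old_path") or ""
        grouped[path].append(discussion)
    return dict(grouped)
```

Because `base_url` is a plain parameter, the same sketch would work against gitlab.com or a self-hosted instance.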
What I worked on
- Stdlib-only HTTP — no requests or httpx dependency.
- Greedy bin-packing keeps same-file threads together; oversized threads get their own chunk instead of being silently dropped.
- Deterministic ordering (file path, then discussion id) for stable diffs between runs.
- Resolved threads filtered by default — opt in with --include-resolved.
- Strictly typed; ships a py.typed marker for downstream consumers.
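The packing, ordering, and resolved-thread rules above could be sketched roughly as follows. This is an illustrative next-fit greedy pass, not the project's implementation: the `Thread` dataclass, `estimate_tokens` (a crude four-characters-per-token guess standing in for a real tokenizer), and `pack` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    path: str
    discussion_id: str
    text: str
    resolved: bool = False


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: roughly four characters per token."""
    return max(1, len(text) // 4)


def pack(threads: list, budget: int, include_resolved: bool = False) -> list:
    """Greedily pack threads into chunks of at most `budget` estimated tokens."""
    # Resolved threads are dropped unless the caller opts in.
    kept = [t for t in threads if include_resolved or not t.resolved]
    # Sorting by (path, discussion id) makes output stable between runs
    # and places same-file threads next to each other.
    kept.sort(key=lambda t: (t.path, t.discussion_id))

    chunks = []
    current = []
    used = 0
    for thread in kept:
        cost = estimate_tokens(thread.text)
        # Close the current chunk when the next thread would not fit.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        # A thread larger than the budget still lands in a chunk of its own
        # rather than being dropped.
        current.append(thread)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Threads are never split in this sketch, so a single oversized thread yields a chunk that exceeds the budget; a caller would need to handle that case explicitly.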