tanuki-slice

A merge request with 400 comments can easily overflow an LLM's context window. tanuki-slice fetches the MR's discussions through the GitLab API, groups comments by file, and splits them into token-budgeted chunks, so each chunk fits the model and can be processed in isolation.

Runtime

Python · uv

Surface

CLI + library

Source

public

Overview

Fetches merge request discussions from the GitLab API, groups comments and replies by file, and splits them into token-budgeted chunks that preserve both file locality and thread context. The output feeds LLM and agentic workflows without any single chunk exceeding the model's token budget. It exposes a CLI for quick inspection and an importable library for pipelines, and works against gitlab.com or self-hosted GitLab.
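
The fetch-and-group step can be sketched roughly as follows. This is an illustration using only the standard library, not tanuki-slice's actual code: the names fetch_discussions, file_of, and group_by_file are hypothetical, and the sketch assumes the GitLab REST discussions endpoint with a personal access token sent in the PRIVATE-TOKEN header.

```python
import json
import urllib.parse
import urllib.request
from collections import defaultdict
from typing import Any

Discussion = dict[str, Any]


def fetch_discussions(base_url: str, project: str, mr_iid: int, token: str) -> list[Discussion]:
    """Fetch every discussion on a merge request, following GitLab's page headers."""
    # Project paths such as "group/repo" must be URL-encoded in the API path.
    project_id = urllib.parse.quote(project, safe="")
    discussions: list[Discussion] = []
    page = "1"
    while page:
        url = (
            f"{base_url.rstrip('/')}/api/v4/projects/{project_id}"
            f"/merge_requests/{mr_iid}/discussions?per_page=100&page={page}"
        )
        request = urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})
        with urllib.request.urlopen(request) as response:
            discussions.extend(json.load(response))
            # GitLab leaves X-Next-Page empty on the last page, which ends the loop.
            page = response.headers.get("X-Next-Page", "")
    return discussions


def file_of(discussion: Discussion) -> str:
    """Return the file a discussion is anchored to, or "" for general comments."""
    for note in discussion.get("notes", []):
        position = note.get("position") or {}
        path = position.get("new_path") or position.get("old_path")
        if path:
            return path
    return ""


def group_by_file(discussions: list[Discussion]) -> dict[str, list[Discussion]]:
    """Bucket discussions by file path, keeping replies inside their thread."""
    groups: dict[str, list[Discussion]] = defaultdict(list)
    for discussion in discussions:
        groups[file_of(discussion)].append(discussion)
    return dict(groups)
```

Grouping whole discussions, rather than individual notes, is what keeps each reply next to the comment it answers. Pointing base_url at a self-hosted instance instead of https://gitlab.com is the only change needed for the self-hosted case in this sketch.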

What I worked on

Stdlib-only HTTP — no requests or httpx dependency.
Greedy bin-packing keeps same-file threads together; oversized threads get their own chunk instead of being silently dropped.
Deterministic ordering (file path, then discussion id) for stable diffs between runs.
Resolved threads filtered by default — opt in with --include-resolved.
Strictly typed; ships a py.typed marker for downstream consumers.
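
The filtering, ordering, and packing points above can be sketched together. This is a minimal illustration under stated assumptions, not the project's implementation: the Thread type, estimate_tokens, and pack are hypothetical names, the four-characters-per-token estimate stands in for a real tokenizer, and the packing shown is a simple single-pass greedy (next-fit) strategy over threads already sorted by file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    """Hypothetical flattened view of one discussion thread."""
    file_path: str
    discussion_id: str
    text: str
    resolved: bool = False


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: roughly four characters per token."""
    return max(1, len(text) // 4)


def pack(threads: list[Thread], budget: int, include_resolved: bool = False) -> list[list[Thread]]:
    """Greedily pack threads into chunks of at most `budget` estimated tokens."""
    # Resolved threads are skipped unless the caller opts in.
    kept = [t for t in threads if include_resolved or not t.resolved]
    # Sorting by (file path, discussion id) makes the output deterministic
    # and places same-file threads next to each other.
    kept.sort(key=lambda t: (t.file_path, t.discussion_id))

    chunks: list[list[Thread]] = []
    current: list[Thread] = []
    used = 0
    for thread in kept:
        cost = estimate_tokens(thread.text)
        # Close the current chunk when the next thread would overflow it.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        # A thread larger than the budget still lands here, alone in its
        # chunk, because the following thread forces another flush.
        current.append(thread)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Threads are never split in this sketch, so a chunk holding an oversized thread can exceed the budget; the caller decides whether to truncate or summarize it. Because the input is sorted first, two runs over the same discussions produce identical chunks.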