Guides
Task-oriented walkthroughs for the things people do with meguri: seeding a frontier, running a crawl loop, serving a durable partition, rebalancing the files, and projecting the cost to fleet scale.
Each guide is built around a job rather than a flag: turning a URL list into a frontier, draining it, keeping it durable across restarts, reshaping the files when a partition runs hot or cold, and working out what the whole thing costs at a hundred billion URLs. They assume you have worked through the quick start.
Seeding a frontier
Turn a Common Crawl URL list into a .meguri checkpoint, and choose the starting priority and crawl delay.
Running a crawl loop
Drain a frontier with meguri run: the staged loop, the logical clock, worker parallelism, and reading the result.
Serving a durable partition
Use meguri serve to open a directory as a log-structured partition that recovers on restart, and set the durability dial and a resident budget.
Rebalancing the files
Route hosts with meguri map, snapshot with pack, and consolidate or split partitions with compact: the file side of fleet rebalancing.
Projecting to fleet scale
Measure the real per-partition cost on a corpus slice with meguri bench, and project it to a hundred billion URLs against the named scaling walls.
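As a rough illustration of the projection arithmetic, the sketch below scales a measured slice linearly to a target URL count. Every name and figure in it (`slice_urls`, `slice_bytes`, `urls_per_partition`, and the numbers themselves) is a hypothetical placeholder, not output from meguri bench. Linear scaling is itself an assumption, and checking it against the scaling walls is left to the guide.

```python
import math


def project(slice_urls, slice_bytes, target_urls, urls_per_partition):
    """Linearly extrapolate a measured corpus slice to a target URL count.

    All inputs are hypothetical; this is not how meguri bench reports cost.
    """
    bytes_per_url = slice_bytes / slice_urls
    # Round up: a partially filled partition still counts as a partition.
    partitions = math.ceil(target_urls / urls_per_partition)
    total_bytes = bytes_per_url * target_urls
    return partitions, total_bytes


# Hypothetical measurement: a 10-million-URL slice occupying 400 MB,
# projected to a hundred billion URLs at 50 million URLs per partition.
partitions, total_bytes = project(
    slice_urls=10_000_000,
    slice_bytes=400_000_000,
    target_urls=100_000_000_000,
    urls_per_partition=50_000_000,
)
print(f"{partitions} partitions, {total_bytes / 1e12:.1f} TB")
```

A straight-line estimate like this is only a starting point: costs that grow faster than linearly with partition count would not show up in it.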