We recently launched Upstash Box — a cloud computer for your agents with durable storage, serverless scaling, and usage-based pricing. Since day one of building the backend, the team has been pretty fired up about it.
Having Box changed how fast I can prototype. I can go from an idea to a working multi-agent flow much faster now. I could build these flows before too, but honestly I would have postponed half of them.
I don't think agents are magic, but they are very useful on tired days. Some days are long, full of incidents and context switching, and you still need to ship.
And guess what? Bugs are watching us from the edge cases.
I care a lot about not shipping avoidable bugs. The problem is that after a long day, I miss things I normally would catch.
So I wrote my own PR review bot and called it Nitpick.
Why I built Nitpick
Agents were already part of my workflow for reviews, even for self-review. The first step was always the same: throw the change at multiple agents, compare outputs, validate findings, then manually triage everything.
This already helped a lot:
- Fewer wrong findings
- Better coverage of "what can go wrong?"
- A second (third, fourth) brain when mine is tired
But the process itself was the problem. I was doing the same thing every time and it was boring me to death.
So I automated it. Mostly for myself.
Let me introduce you to Nitpick.
What Nitpick does
Nitpick runs a full PR review arena from your terminal.
You give it a GitHub PR, and it spins up multiple reviewer roles in parallel, runs scanners, verifies findings, lets you triage them, and gives you a final verdict with a merge recommendation.
It comes with:
- 5 AI reviewer roles: security, performance, architecture, testing, dx
- 3 automated scanners: secrets, linter, dependencies
- PR summary and walkthrough before findings
- A separate verifier agent to confirm/adjust/reject findings
- Triage flow to accept or dismiss each finding
- Markdown report with blockers, risk score, suggested commits
- Optional GitHub PR review comments (--post-review)
So instead of review chaos, you get a repeatable flow.
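To make the verifier step a bit more concrete, here is a minimal Python sketch of how confirm/adjust/reject decisions could be applied to raw findings. It is not Nitpick's actual code: the Finding fields, the Decision enum, the severity scale, and the apply_verifier helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Decision(Enum):
    """Hypothetical verifier outcomes, mirroring confirm/adjust/reject."""
    CONFIRM = "confirm"
    ADJUST = "adjust"
    REJECT = "reject"


@dataclass(frozen=True)
class Finding:
    role: str      # reviewer or scanner that reported it, e.g. "security"
    title: str
    severity: str  # assumed scale: "low", "medium", "high"


def apply_verifier(findings, decisions):
    """Apply the verifier's decisions to raw findings.

    `decisions` maps a finding title to (Decision, new_severity_or_None).
    Findings without a decision are treated as confirmed and kept unchanged.
    """
    verified = []
    for finding in findings:
        decision, new_severity = decisions.get(
            finding.title, (Decision.CONFIRM, None)
        )
        if decision is Decision.REJECT:
            continue  # rejected findings never reach triage
        if decision is Decision.ADJUST and new_severity is not None:
            finding = replace(finding, severity=new_severity)
        verified.append(finding)
    return verified
```

Keeping the verifier as a separate pass over plain data means the reviewers can stay noisy, and only findings that survive this step need a human decision during triage.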
The flow I wanted (and now have)
- Pick repository and PR interactively (or pass the PR URL directly)
- Choose reviewer roles
- Run all reviewers and scanners in parallel
- Read AI summary of what changed and where risk is concentrated
- Triage findings one by one
- Generate final verdict: merge / merge with caution / block
- Optionally post review comments back to GitHub
That is basically it. Fast and repeatable.
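The last two steps of the flow, triage and verdict, can be sketched in a few lines of Python. The rule below (any accepted high-severity finding blocks, any accepted medium-severity finding means caution) is an assumed policy for illustration, not necessarily the one Nitpick uses, and TriagedFinding and final_verdict are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriagedFinding:
    title: str
    severity: str   # assumed scale: "low", "medium", "high"
    accepted: bool  # True if accepted during triage, False if dismissed


def final_verdict(findings):
    """Map triaged findings to one of: merge / merge with caution / block."""
    # Dismissed findings do not influence the verdict.
    accepted = [f for f in findings if f.accepted]
    if any(f.severity == "high" for f in accepted):
        return "block"
    if any(f.severity == "medium" for f in accepted):
        return "merge with caution"
    return "merge"
```

Under this assumed policy, dismissing a high-severity finding during triage is what turns a "block" into something mergeable, which keeps the final call in human hands.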
How Nitpick uses Upstash Box
The hard part of running five reviewers at once is not the AI calls. It is giving each one its own isolated environment where it can clone the repo, read files, run linters, and do its thing without stepping on the others. That is what Box handles.
Each reviewer role gets its own Box. The security reviewer is digging through auth flows in one container while the performance reviewer is profiling hot paths in another. They do not share state, they do not block each other, and when they are done, their findings get collected and passed to the verifier.
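The shape of that fan-out can be sketched in Python with asyncio. This is only an illustration of the pattern, not Nitpick's code and not the Upstash Box API: a temporary directory stands in for each role's Box, the review itself is a placeholder, and run_reviewer and review_all are hypothetical names.

```python
import asyncio
import tempfile
from pathlib import Path

ROLES = ["security", "performance", "architecture", "testing", "dx"]


async def run_reviewer(role: str, diff: str) -> dict:
    """Review `diff` inside a private workspace and return the findings.

    The temporary directory is a local stand-in for a per-role Box; the
    placeholder finding marks where a real agent call would go.
    """
    with tempfile.TemporaryDirectory(prefix=f"nitpick-{role}-") as workdir:
        # Each role writes only inside its own workspace, so nothing is shared.
        (Path(workdir) / "change.diff").write_text(diff)
        await asyncio.sleep(0)  # yield control, as a real network call would
        return {
            "role": role,
            "workdir": workdir,
            "findings": [f"{role}: placeholder finding"],
        }


async def review_all(diff: str, roles=ROLES) -> list:
    # gather() runs the reviewers concurrently and returns results in role order.
    return await asyncio.gather(*(run_reviewer(r, diff) for r in roles))
```

Calling asyncio.run(review_all(diff)) returns one result per role, ready to be flattened and handed to the verifier. The workspace is removed as soon as its reviewer finishes, loosely mirroring containers that go away once the review is done.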
The nice part is I did not have to think about any of the infra. Box is serverless — containers spin up when Nitpick needs them and go away when the review is done. I do not manage instances, I do not pay for idle time.
Nitpick is stateless between runs for now. Each review starts fresh, produces a markdown report, and optionally posts comments back to GitHub. But Box supports durable storage, so persisting review history across runs is something I want to explore next.
Why this matters to me
Nitpick is not about replacing engineering judgment. It is about backing me up when energy is low and I am rushing.
After a long day, your brain misses edges. Nitpick keeps looking anyway.
And the best part: it fits exactly how I already work. I was already inviting multiple agents into the process. I just stopped doing it manually and turned it into a tool.
Now the checks are done before I even think about running them.
I also added tiny graphics to make the process more fun than watching lines of text flow in your terminal. They do not add extra functionality, but they add a bit of joy.
Closing
If you also feel bugs are hiding in edge cases, Nitpick might be useful for you.
I built it because I wanted better review quality without adding more mental overhead. If it helps other teams ship safer and faster, even better.
My laziness still exists. Now it just has better tooling around it.
