Entity-level code review for Git. Graph-based risk scoring identifies which functions need careful review. No LLM, no API key. 83.5% recall on the Greptile benchmark, beating every LLM tool. A single commit analyzed in 5–67 ms.
$ inspect diff HEAD~1

inspect   12 entities changed
          1 critical, 4 high, 3 medium, 4 low

groups    3 logical groups:
          [0] src/merge/  (5 entities)
          [1] src/driver/ (4 entities)
          [2] validate    (3 entities)

entities (by risk):

~ CRITICAL  function merge_entities (src/merge/core.rs)
            classification: functional  score: 0.82  blast: 171  deps: 3/12  public API
            >>> 12 dependents may be affected

- HIGH      function old_validate (src/validate.rs)
            classification: functional  score: 0.65  blast: 8  deps: 0/3  public API

+ MEDIUM    function parse_config (src/config.rs)
            classification: functional  score: 0.32  blast: 0  deps: 2/0

~ LOW       function format_output (src/display.rs)
            classification: text  score: 0.02  blast: 0  deps: 0/0
            cosmetic only (no structural change)
git diff says 12 files changed. But which changes actually matter? A renamed variable, a reformatted function, and a deleted public API method all look the same in a line-level diff.
This gets worse with AI-generated code. DORA 2025 found that AI adoption led to +154% PR size, +91% review time, and +9% more bugs shipped. Reviewers are drowning in noise. inspect works at the entity level: functions, structs, traits, classes. It uses the dependency graph to identify which changes have real impact.
Four phases. No LLM, no network calls, all local.
tree-sitter extracts entities from all tracked source files; call/reference analysis then builds a full-repo dependency graph.
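The graph can be pictured as adjacency maps in both directions: which entities each entity calls, and which entities call it. A minimal sketch in Rust (the types and names here are hypothetical illustrations, not inspect's internals):

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical sketch: a dependency graph as adjacency maps.
// Edges point from a caller to the entity it references.
#[derive(Default)]
struct DepGraph {
    deps: HashMap<String, HashSet<String>>,       // entity -> callees
    dependents: HashMap<String, HashSet<String>>, // entity -> callers
}

impl DepGraph {
    fn add_edge(&mut self, from: &str, to: &str) {
        self.deps.entry(from.into()).or_default().insert(to.into());
        self.dependents.entry(to.into()).or_default().insert(from.into());
    }

    // Direct dependencies of an entity (the "deps" column in the demo).
    fn direct_deps(&self, entity: &str) -> usize {
        self.deps.get(entity).map_or(0, |s| s.len())
    }

    // Blast radius: all transitive dependents of a changed entity.
    fn blast_radius(&self, entity: &str) -> usize {
        let mut seen = HashSet::new();
        let mut stack = vec![entity.to_string()];
        while let Some(e) = stack.pop() {
            if let Some(callers) = self.dependents.get(&e) {
                for c in callers {
                    if seen.insert(c.clone()) {
                        stack.push(c.clone());
                    }
                }
            }
        }
        seen.len()
    }
}

fn main() {
    let mut g = DepGraph::default();
    g.add_edge("handler", "merge_entities");
    g.add_edge("cli_main", "handler");
    g.add_edge("merge_entities", "hash_entity");
    // cli_main and handler both transitively depend on merge_entities.
    assert_eq!(g.blast_radius("merge_entities"), 2);
    assert_eq!(g.direct_deps("merge_entities"), 1);
    println!("blast radius: {}", g.blast_radius("merge_entities"));
}
```

Keeping a reverse (dependents) map alongside the forward map makes blast-radius queries a simple graph walk instead of a full scan.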
Compare before/after. Classify each change as text (comments), syntax (signatures), functional (logic), or a combination.
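One way to picture the classification step: normalize the entity at different levels and see which normalized forms changed. A deliberately crude string-based sketch (inspect's real comparison is AST-based via tree-sitter; the helper names here are invented):

```rust
// Crude sketch of change classification by comparing normalized forms.
// A real implementation compares syntax trees, not strings.
fn strip_comments(src: &str) -> String {
    src.lines()
        .map(|l| l.split("//").next().unwrap_or("").trim_end())
        .collect::<Vec<_>>()
        .join("\n")
}

fn classify(before: &str, after: &str, sig_before: &str, sig_after: &str) -> &'static str {
    let body_changed = strip_comments(before) != strip_comments(after);
    let sig_changed = sig_before != sig_after;
    match (body_changed, sig_changed) {
        (false, false) => "text",       // only comments/whitespace differ
        (false, true) => "syntax",      // signature changed, logic intact
        (true, false) => "functional",  // logic changed
        (true, true) => "functional+syntax",
    }
}

fn main() {
    // Editing only a comment is a "text" change.
    assert_eq!(
        classify("x + 1 // old", "x + 1 // new", "fn f(x: i32)", "fn f(x: i32)"),
        "text"
    );
    // Editing the logic is a "functional" change.
    assert_eq!(
        classify("x + 1", "x + 2", "fn f(x: i32)", "fn f(x: i32)"),
        "functional"
    );
    println!("ok");
}
```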
Graph-centric scoring. Dependents and blast radius are the primary signals. Public API, classification, and change type set the baseline.
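To make the "baseline plus graph signals" idea concrete, here is a hypothetical scoring function. The weights and formula are invented for illustration only; they are not inspect's actual model:

```rust
// Illustrative risk score: a baseline set by the change kind and API
// visibility, scaled up by graph impact. All weights are made up.
struct Change {
    classification_weight: f64, // e.g. text=0.05, syntax=0.3, functional=0.6
    public_api: bool,
    direct_dependents: usize,
    blast_radius: usize,
}

fn risk_score(c: &Change) -> f64 {
    let baseline = c.classification_weight + if c.public_api { 0.1 } else { 0.0 };
    // Graph signals dominate: a saturating boost from dependents + blast radius.
    let impact = (c.direct_dependents + c.blast_radius) as f64;
    let graph = 1.0 - 1.0 / (1.0 + 0.1 * impact);
    (baseline + (1.0 - baseline) * graph).min(1.0)
}

fn main() {
    let cosmetic = Change {
        classification_weight: 0.05, public_api: false,
        direct_dependents: 0, blast_radius: 0,
    };
    let core = Change {
        classification_weight: 0.6, public_api: true,
        direct_dependents: 12, blast_radius: 171,
    };
    // A widely-depended-on functional change outranks a cosmetic one.
    assert!(risk_score(&core) > risk_score(&cosmetic));
    println!("cosmetic: {:.2}, core: {:.2}", risk_score(&cosmetic), risk_score(&core));
}
```

The saturating form keeps scores in [0, 1] while letting graph impact dominate, matching the demo where a zero-dependent text change scores near zero and a heavily-depended-on functional change scores near one.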
Union-Find on dependency edges between changed entities. Separates independent logical changes within tangled commits.
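The grouping step is classic disjoint-set union: changed entities connected by a dependency edge collapse into one group, and whatever components remain are the independent logical changes. A minimal sketch (the example entities are hypothetical):

```rust
// Union-Find over changed entities; entities linked by a dependency
// edge end up in the same logical group.
struct UnionFind { parent: Vec<usize> }

impl UnionFind {
    fn new(n: usize) -> Self { UnionFind { parent: (0..n).collect() } }

    fn find(&mut self, x: usize) -> usize {
        if self.parent[x] != x {
            let root = self.find(self.parent[x]);
            self.parent[x] = root; // path compression
        }
        self.parent[x]
    }

    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra != rb { self.parent[ra] = rb; }
    }
}

fn main() {
    // 5 changed entities; dependency edges exist only within {0,1,2} and {3,4}.
    let mut uf = UnionFind::new(5);
    for &(a, b) in &[(0, 1), (1, 2), (3, 4)] {
        uf.union(a, b);
    }
    let roots: std::collections::HashSet<_> = (0..5).map(|i| uf.find(i)).collect();
    assert_eq!(roots.len(), 2); // two independent logical groups
    println!("groups: {}", roots.len());
}
```

Because only edges between *changed* entities are unioned, a tangled commit splits cleanly: entities with no dependency path between them land in separate groups, like the three groups in the demo output above.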
Evaluated on AACR-Bench: 158 PRs, 50 repos, 10 languages, 1,169 ground-truth issues from human reviewers.
83.5% High/Critical recall on the Greptile benchmark (50 PRs, 5 repos, 97 golden comments), beating every LLM-based tool at zero cost. 100% recall at the Medium threshold. Full benchmark results →
Three tools, same foundation: sem-core's entity extraction and structural hashing.