The dependency-graph approach makes sense - and it's actually why local CLI tools like Cursor, Copilot, Aider, etc. struggle with impact analysis. They're context-window-constrained by design. There's no persistent graph tracking what depends on what across repos, config files, call paths, and so on. "Just put the whole codebase in context" doesn't really work here. You need something indexed before the LLM even gets involved - infra that already knows the dependency relationships. Local tools are great for "write this function." But "what breaks if I change this migration?" is a totally different beast. That's not a generation problem - it's a graph query that needs server-side indexing to answer properly.
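To make the "graph query, not generation" point concrete, here's a minimal sketch (all node names and edges are hypothetical; a real indexer would extract them ahead of time from ASTs, configs, and migrations) of answering "what breaks if I change this migration?" with a plain reverse-edge traversal, no LLM in the loop:

```python
# Minimal sketch of impact analysis as a reverse dependency traversal.
# All node names are made up; the edge list stands in for a pre-built index.
from collections import defaultdict, deque

# (dependent, dependency) pairs an indexer would have extracted up front
edges = [
    ("orders_service", "users_migration_0042"),
    ("billing_job", "orders_service"),
    ("config/prod.yaml", "billing_job"),
]

# invert to "dependency -> dependents" so we can walk the blast radius
dependents = defaultdict(set)
for dep, target in edges:
    dependents[target].add(dep)

def impacted_by(changed: str) -> set[str]:
    """BFS over reverse edges: everything that transitively depends on `changed`."""
    seen, queue = set(), deque([changed])
    while queue:
        node = queue.popleft()
        for d in dependents[node]:
            if d not in seen:
                seen.add(d)
                queue.append(d)
    return seen

print(impacted_by("users_migration_0042"))
# {'orders_service', 'billing_job', 'config/prod.yaml'}
```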
This matches my experience: dialing up top-k and chunk sizes just made RAG noisier, not smarter. Code review requires a higher level of accuracy than top-n retrieval can support.
For anyone doing serious code reviews, this lines up with how TuringMind AI has evolved: RAG is nice for “what looks like this code?”, but impact-aware review is almost never a pure similarity problem. What actually matters in review are questions like “what configs/migrations/env vars does this touch?”, “what else depends on this function?”, and “are we creating a new unwanted coupling?”, which are fundamentally dependency-graph questions. That’s why TuringMind leans on a multi-language AST + dependency graph for review and then uses embeddings only where they shine (semantic search, exploration, explanations), rather than forcing a vector database to answer impact questions it’s structurally bad at.
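Not how TuringMind is actually implemented (I haven't seen its code or API), but a rough sketch of the split being described, with every name invented for illustration: structural "what depends on this?" questions hit a pre-built graph index, and only exploratory "what looks like this?" questions touch the embedding index.

```python
# Hypothetical sketch only: none of these names reflect TuringMind's real API.
# Impact questions go to a dependency-graph lookup built during indexing;
# exploratory questions go to an embedding index ranked by cosine similarity.
from dataclasses import dataclass

@dataclass
class ReviewIndex:
    reverse_deps: dict[str, set[str]]    # symbol -> direct dependents (from AST/dep-graph indexing)
    embeddings: dict[str, list[float]]   # chunk id -> vector (from an embedding model)

    def impacted_by(self, symbol: str) -> set[str]:
        """Impact question: answered by a graph lookup; similarity never enters into it."""
        return self.reverse_deps.get(symbol, set())

    def similar_chunks(self, query_vec: list[float], top_k: int = 5) -> list[str]:
        """Exploration question: rank chunks by cosine similarity to the query vector."""
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            na = sum(x * x for x in a) ** 0.5
            nb = sum(y * y for y in b) ** 0.5
            return dot / (na * nb) if na and nb else 0.0
        ranked = sorted(self.embeddings, key=lambda cid: cosine(query_vec, self.embeddings[cid]), reverse=True)
        return ranked[:top_k]
```

The point is just that the two retrieval paths never mix: the impact query never goes anywhere near the vector index.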