But the "Usage Guidelines" part of the "License" section at the end of the README says: "License required for: Commercial embedding in products you sell or offering Mantic as a hosted service."
This is not completely true, since it seems that the software is licensed under AGPLv3, which of course allow the use of the software for any purpose, even commercial.
Are you sure it's correct? It says it's AGPL but the explanation given sounds like what you want is actually the LGPL. AGPL is about what happens if you expose a program as a SaaS and is generally banned from any company due to the "viral" nature i.e. a service that used Mantic would need to be fully open sourced even if the code was never distributed.
LGPL is for libraries: you can use an LGPLd program in proprietary software, but you have to make the source of the LGPLd program with modifications available if you distribute it. It doesn't infect the rest of the program, and it doesn't have any clauses that trigger for SaaS scenarios.
Your current explanation doesn't jive with my understanding the AGPL. For example, you cannot realistically sell a service that incorporates an AGPLd component because it'd require you to open source the entire service.
Thanks for the correction, you’re right that I mixed up LGPL and AGPL there. I haven’t updated the license yet but I plan to adjust it so it better matches the usage model and doesn’t create the “everything must be open source” issue you mentioned. Really appreciate you pointing it out. Thanks Mike!
Interesting idea but its very strong path-dependence makes me wary on its general use and reliability. E.g. on project's own codebase querying "extract exports" will've expected to get `src/dependency-graph.ts`, which has an `extractExports` function, 1st rather 7th. (Though in out of ~30 total files, that means gives expected result in top 25%.) Trying to search anything on chromium repo (just "git clone https://chromium.googlesource.com/chromium", no deps/submodules; only ~44k paths in `git ls-files`) returns "Error: Scanner failed to produce scored files. This is a bug."
Hello! Cool tool, I'm going to give it a try on my personal assistant. The vector DB prices look a bit cynical to me, even incredible. Do you think you could break down how you arrived at the cost estimation both for competing vector DBs and Mantic? For example, I use Weaviate at the moment and I don't come close to this cost even at a years perspective with a generous amount of usage from multiple users (~60)
You're absolutely right, it wasn't implemented, just documented. Thanks for catching that!
I just shipped v1.0.13 (literally 5 minutes ago) that implements all three environment variables:
• MANTIC_IGNORE_PATTERNS - Custom glob patterns to exclude files
• MANTIC_MAX_FILES - Limit number of files returned
• MANTIC_TIMEOUT - Search timeout in milliseconds
Also fixed the regex bug that was breaking glob pattern matching.
Appreciate you pointing it out, having users actually test the features is way more valuable than my own QA.
No embeddings at all, neither stored nor on-the-fly.
Instead of converting queries to vectors, Mantic.sh uses structural inference, it ranks files based on path components, folder depth, naming patterns, and git metadata.
So "stripe webhook" matches /services/stripe/webhook.handler.ts highly because the path literally contains both terms in logical positions, not because their embeddings are close in vector space.
"Cognitive" just means it mirrors how developers already think about code organization, it encodes intent into paths, so searching paths directly often beats semantic similarity.
What kind of agent workflow is it called when you post a hastily vibecoded Show HN, feed glaringly obvious user feedback reports one by one into the LLM, wait for the LLM to course-correct (YOU'RE ABSOLUTELY RIGHT!), then push a coincidentally timed "bugfix" while informing the user that their feedback was addressed by said bugfix?
I think there’s a misunderstanding of mantic.sh architecture...
The “weights” as you described in Mantic.sh are not neural network weights. They’re deterministic ranking heuristics, similar to what IDE file pickers use. For example, extension weights, path depth penalties, and filename matches. You can see them directly in brain-scorer.ts (EXTENSION_WEIGHTS).
Correct! The key insight isn't the algorithm itself—it's that structural metadata is enough. Traditional tools assume you need semantic understanding (embeddings), but we found that path structure + filename + recency gets you 90% of the way there in <200ms.
The 'custom ranking' is inspired by how expert developers navigate codebases: they don't read every file, they infer from structure. /services/stripe/webhook.handler.ts is obviously the webhook handler—no need to embed it.
The innovation is removing unnecessary work (content reading, chunking, embedding) and proving that simple heuristics are faster and more deterministic.
Actually, I haven't used JetBrains, didn't know they did something similar until now!
This came from a different angle I read about how the human brain operates on ~20 watts yet processes information incredibly efficiently. That got me thinking about how developers naturally encode semantics into folder structures without realizing it.
The "cognitive" framing is because we're already doing the work of organizing code meaningfully Mantic.sh just searches that existing structure instead of recreating it as embeddings. Turns out path-based search is just efficient pattern matching, which explains why it's so fast.
Interesting to hear JetBrains converged on a similar approach from the IDE side though!
Thanks for trying it! This sounds like a bug. Mantic.sh supports all languages (Python, Rust, Go, Java, etc.) it's language-agnostic since it ranks files by path/filename, not content.
A few debugging questions:
- What query did you run? (e.g., mantic "auth logic")
- What's your project structure? (Is it a monorepo, or does it have a non-standard layout?)
- Can you share the output of mantic "your query" --json?
If it's only returning
package.json, it likely means:
- The query is too generic (e.g., mantic "project"), OR
- The file scanner isn't finding your source files (possible .gitignore issue)
Tip: Try running git ls-files | wc -l in your project, if that returns 0 or a very small number, Mantic won't have files to search.
Happy to debug further if you can share more details!
Fair point the README focuses more on benchmarks than implementation here's the short version:
1. Use `git ls-files` instead of walking the filesystem (huge speed win - dropped Chromium 59GB scan from 6.6s to 0.46s)
2. Parse each file path into components (folders, filename, extension)
3. Score each file based on how query terms match path components, weighted by position and depth
4. Return top N matches sorted by score
The core insight: /services/stripe/webhook.handler.ts already encodes the semantic relationship between "stripe" and "webhook" through its structure. No need to read file contents or generate embeddings.
I should add an architecture doc to the repo, thanks for the nudge.
Interesting, and I haven't tried it yet.
But the "Usage Guidelines" part of the "License" section at the end of the README says: "License required for: Commercial embedding in products you sell or offering Mantic as a hosted service."
This is not completely true, since it seems that the software is licensed under AGPLv3, which of course allows the use of the software for any purpose, even commercial.
It's also small enough that a simple "Claude, migrate to Rust" would work to successfully launder all the code.
Thank you! I shipped a new version with the correct license.
Are you sure it's correct? It says it's AGPL, but the explanation given sounds like what you want is actually the LGPL. AGPL is about what happens if you expose a program as a SaaS, and it is generally banned at companies due to its "viral" nature, i.e. a service that used Mantic would need to be fully open-sourced even if the code was never distributed.
LGPL is for libraries: you can use an LGPL'd program in proprietary software, but you have to make the source of the LGPL'd program, with your modifications, available if you distribute it. It doesn't infect the rest of the program, and it doesn't have any clauses that trigger for SaaS scenarios.
Your current explanation doesn't jibe with my understanding of the AGPL. For example, you cannot realistically sell a service that incorporates an AGPL'd component, because it'd require you to open source the entire service.
Thanks for the correction, you’re right that I mixed up LGPL and AGPL there. I haven’t updated the license yet but I plan to adjust it so it better matches the usage model and doesn’t create the “everything must be open source” issue you mentioned. Really appreciate you pointing it out. Thanks Mike!
You're welcome!
Interesting idea, but its very strong path-dependence makes me wary about its general usefulness and reliability. E.g. on the project's own codebase, querying "extract exports" I would have expected `src/dependency-graph.ts`, which has an `extractExports` function, to come 1st rather than 7th. (Though out of ~30 total files, that still puts the expected result in the top 25%.) Trying to search anything in the Chromium repo (just "git clone https://chromium.googlesource.com/chromium", no deps/submodules; only ~44k paths in `git ls-files`) returns "Error: Scanner failed to produce scored files. This is a bug."
Thanks for the detailed bug report; both issues are fixed in v1.0.15!
Chromium timeout: increased the default to 30s (configurable via MANTIC_TIMEOUT). It now completes in ~23s on 481k files.
Ranking ("extract exports"): added two-pass scoring, an ultra-fast path/filename pass first, then lightweight regex extraction of function/class names from the top 50 files. Exact matches get a +200 boost, partial matches +100. Result: dependency-graph.ts (with extractExports) now ranks #1.
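Roughly, that second pass can be sketched like this; the +200/+100 boosts and the 50-file cap come from the description above, while the regex, file reading, and query normalization are assumptions for illustration, not the shipped code:

```ts
// Hedged sketch of the second pass: re-read only the top pass-1 files and
// boost the ones whose function/class names match the query.
import { readFileSync } from "node:fs";

const FN_PATTERN = /\b(?:function|class|def|fn|func)\s+([A-Za-z_][A-Za-z0-9_]*)/g;

function secondPassBoost(topPaths: string[], query: string): Map<string, number> {
  const needle = query.replace(/\s+/g, "").toLowerCase(); // "extract exports" -> "extractexports"
  const boosts = new Map<string, number>();
  for (const path of topPaths.slice(0, 50)) {             // only the top 50 pass-1 results
    let text: string;
    try { text = readFileSync(path, "utf8"); } catch { continue; }
    for (const [, name] of text.matchAll(FN_PATTERN)) {
      const candidate = name.toLowerCase();
      if (candidate === needle) {
        boosts.set(path, 200);                                   // exact match
      } else if (candidate.includes(needle) || needle.includes(candidate)) {
        boosts.set(path, Math.max(boosts.get(path) ?? 0, 100));  // partial match
      }
    }
  }
  return boosts;
}
```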
Extras from your feedback:
- Added Python (def), Rust (fn), and Go (func) patterns, plus better camel/snake/keyword handling
- New env var: MANTIC_FUNCTION_SCAN_LIMIT
- Performance: +100-200ms overhead. Tested on Chromium (23s) and Cal.com (220ms).
Huge thanks—feedback like yours is gold!
I also got Chromium cloned, let me check and I'll get back to you.
Hello! Cool tool, I'm going to give it a try on my personal assistant. The vector DB prices look a bit cynical to me, even incredible. Do you think you could break down how you arrived at the cost estimation, both for competing vector DBs and for Mantic? For example, I use Weaviate at the moment and I don't come close to this cost even over a year, with a generous amount of usage from multiple users (~60).
Thanks for the kind words and for giving Mantic.sh a spin, excited to hear how it works for your personal assistant!
The cost estimates were rough illustrations for high-usage cloud setups (100 devs × 100 searches/day = ~3.65M queries/year):
- Vector embeddings: ~$0.003/query (OpenAI embeddings + a managed DB like Pinecone) → ~$10,950/yr
- Sourcegraph: older Enterprise rate (~$91/user/mo) → ~$109k/yr
- Mantic: $0 (local, no APIs/DBs)
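Spelled out, the vector-embeddings line is just multiplication over those assumptions, with the blended per-query rate doing all the work:

```ts
// Back-of-the-envelope check for the figure above. The ~$0.003/query blended
// rate (embedding API + managed vector DB) is the assumption, not a quote.
const devs = 100;
const searchesPerDevPerDay = 100;
const queriesPerYear = devs * searchesPerDevPerDay * 365; // 3,650,000
const costPerQuery = 0.003;                               // USD, assumed
console.log(queriesPerYear * costPerQuery);               // ≈ 10950 -> the ~$10,950/yr above
```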
You're spot on: these are high-end estimates, and Weaviate (esp. self-hosted/compressed) can be way cheaper for moderate use like your ~60 users.
I leaned toward worst-case managed pricing to highlight the "no ongoing cost" upside.
Let me know how the trial feels!
Isn't this how code was searched for decades before LLMs came out? Just clarifying...
MANTIC_IGNORE_PATTERNS seems not to be implemented, or am I missing something?
You're absolutely right, it wasn't implemented, just documented. Thanks for catching that!
I just shipped v1.0.13 (literally 5 minutes ago) that implements all three environment variables:
• MANTIC_IGNORE_PATTERNS - Custom glob patterns to exclude files
• MANTIC_MAX_FILES - Limit the number of files returned
• MANTIC_TIMEOUT - Search timeout in milliseconds
Also fixed the regex bug that was breaking glob pattern matching.
Appreciate you pointing it out; having users actually test the features is way more valuable than my own QA.
What is "cognitive code search" "without embeddings"? Do you mean you accept natural language queries and create embeddings on-the-fly?
edit: That's structural or syntax-aware search.
No embeddings at all, neither stored nor on-the-fly.
Instead of converting queries to vectors, Mantic.sh uses structural inference: it ranks files based on path components, folder depth, naming patterns, and git metadata.
So "stripe webhook" matches /services/stripe/webhook.handler.ts highly because the path literally contains both terms in logical positions, not because their embeddings are close in vector space.
"Cognitive" just means it mirrors how developers already think about code organization, it encodes intent into paths, so searching paths directly often beats semantic similarity.
I'm guessing this is the ranker: https://github.com/marcoaapfortes/Mantic.sh/blob/main/src/br...
You should benchmark this against other rankers.
Working on it right now
This is just typical trash you get out of an LLM. I flagged this AI slop.
Do you want to objectively state your criticisms instead of this handwaving dismissal?
What kind of agent workflow is it called when you post a hastily vibecoded Show HN, feed glaringly obvious user feedback reports one by one into the LLM, wait for the LLM to course-correct (YOU'RE ABSOLUTELY RIGHT!), then push a coincidentally timed "bugfix" while informing the user that their feedback was addressed by said bugfix?
Where do you think all those "weights" come from? They are all hallucinated. The rest of the code is too.
I think there’s a misunderstanding of the Mantic.sh architecture...
The “weights” you’re describing in Mantic.sh are not neural-network weights. They’re deterministic ranking heuristics, similar to what IDE file pickers use: extension weights, path depth penalties, and filename matches. You can see them directly in brain-scorer.ts (EXTENSION_WEIGHTS):
```ts
const EXTENSION_WEIGHTS: Record<string, number> = {
  '.ts': 20, '.tsx': 20, '.js': 15, '.jsx': 15,
  '.rs': 20, '.go': 20, '.py': 15, '.prisma': 15,
  '.graphql': 10, '.css': 5, '.json': 5, '.md': 2
};
```
There’s no LLM involved in the actual search or scoring, it’s a static heuristic engine, not a learned model.
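For concreteness, here is a hedged sketch of how signals like these can combine into a deterministic score; only the extension table mirrors the snippet above, while the depth penalty and filename bonus are invented for illustration:

```ts
// Illustrative deterministic scorer: extension weight, minus a depth penalty,
// plus a filename-match bonus. Same inputs always give the same output; no
// model or LLM involved. Numbers other than EXTENSION_WEIGHTS are made up.
const EXTENSION_WEIGHTS: Record<string, number> = {
  '.ts': 20, '.tsx': 20, '.js': 15, '.jsx': 15, '.rs': 20, '.go': 20,
  '.py': 15, '.prisma': 15, '.graphql': 10, '.css': 5, '.json': 5, '.md': 2,
};

function heuristicScore(path: string, queryTerms: string[]): number {
  const parts = path.toLowerCase().split('/').filter(Boolean);
  const filename = parts[parts.length - 1] ?? '';
  const ext = filename.slice(filename.lastIndexOf('.'));
  let score = EXTENSION_WEIGHTS[ext] ?? 0;    // prefer source files over config/docs
  score -= (parts.length - 1) * 2;            // path depth penalty (illustrative)
  for (const term of queryTerms) {
    if (filename.includes(term)) score += 30; // filename match bonus (illustrative)
  }
  return score;
}
```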
I'd love to have the implementation critiqued on its merits!
> So "stripe webhook" matches /services/stripe/webhook.handler.ts highly because the path literally contains both terms in logical positions
This sounds a lot like document search on top of your specific attributes, and you have a custom ranking algorithm.
Correct! The key insight isn't the algorithm itself—it's that structural metadata is enough. Traditional tools assume you need semantic understanding (embeddings), but we found that path structure + filename + recency gets you 90% of the way there in <200ms.
The 'custom ranking' is inspired by how expert developers navigate codebases: they don't read every file, they infer from structure. /services/stripe/webhook.handler.ts is obviously the webhook handler—no need to embed it.
The innovation is removing unnecessary work (content reading, chunking, embedding) and proving that simple heuristics are faster and more deterministic.
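On the recency part specifically, here is a hedged sketch of how a recency signal can be read from git metadata; it is not the Mantic.sh implementation, and the boost values are made up:

```ts
// Illustrative recency boost from git metadata: files committed more recently
// get a small bump. Uses `git log -1 --format=%ct` (last commit unix time).
import { execFileSync } from "node:child_process";

function recencyBoost(path: string): number {
  let out = "";
  try {
    out = execFileSync("git", ["log", "-1", "--format=%ct", "--", path], {
      encoding: "utf8",
    }).trim();
  } catch {
    return 0;                 // not a git repo, or git unavailable
  }
  if (!out) return 0;         // file has no commit history
  const ageDays = (Date.now() / 1000 - Number(out)) / 86_400;
  if (ageDays < 7) return 15; // touched this week (illustrative boost)
  if (ageDays < 30) return 8; // touched this month
  return 0;
}
```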
So this basically competes with fzf?
https://junegunn.github.io/fzf/
So pretty close to how JetBrains ranks files in their search?
Actually, I haven't used JetBrains, didn't know they did something similar until now!
This came from a different angle: I read about how the human brain operates on ~20 watts yet processes information incredibly efficiently. That got me thinking about how developers naturally encode semantics into folder structures without realizing it.
The "cognitive" framing is because we're already doing the work of organizing code meaningfully; Mantic.sh just searches that existing structure instead of recreating it as embeddings. Turns out path-based search is just efficient pattern matching, which explains why it's so fast.
Interesting to hear JetBrains converged on a similar approach from the IDE side though!
I've tried it on my project and it always finds one file: package.json. What languages does it support? Only JavaScript?
Thanks for trying it! This sounds like a bug. Mantic.sh supports all languages (Python, Rust, Go, Java, etc.); it's language-agnostic since it ranks files by path/filename, not content.
A few debugging questions:
- What query did you run? (e.g., mantic "auth logic")
- What's your project structure? (Is it a monorepo, or does it have a non-standard layout?)
- Can you share the output of mantic "your query" --json?
If it's only returning package.json, it likely means:
- The query is too generic (e.g., mantic "project"), OR
- The file scanner isn't finding your source files (possible .gitignore issue)
Tip: Try running `git ls-files | wc -l` in your project; if that returns 0 or a very small number, Mantic won't have files to search.
Happy to debug further if you can share more details!
Can you explain how you achieve this in more detail? I did not see any detailed explanation in the README in the repo.
Fair point, the README focuses more on benchmarks than implementation. Here's the short version:
1. Use `git ls-files` instead of walking the filesystem (huge speed win: dropped the Chromium 59GB scan from 6.6s to 0.46s)
2. Parse each file path into components (folders, filename, extension)
3. Score each file based on how query terms match path components, weighted by position and depth
4. Return the top N matches sorted by score
The core insight: /services/stripe/webhook.handler.ts already encodes the semantic relationship between "stripe" and "webhook" through its structure. No need to read file contents or generate embeddings.
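Putting those four steps together, a rough end-to-end sketch might look like this; the inline scoring is a deliberately trivial stand-in, not the real brain-scorer.ts weighting:

```ts
// Sketch of the pipeline: git ls-files -> parse path components -> score
// query-term matches -> return top N. Scoring here is a trivial placeholder.
import { execFileSync } from "node:child_process";

function search(query: string, topN = 10): { path: string; score: number }[] {
  // 1. Enumerate tracked files via git instead of walking the filesystem.
  const files = execFileSync("git", ["ls-files"], { encoding: "utf8" })
    .split("\n")
    .filter(Boolean);
  const terms = query.toLowerCase().split(/\s+/);
  return (
    files
      // 2 + 3. Parse each path into components and score query-term matches,
      // weighting filename hits above folder hits.
      .map((path) => {
        const parts = path.toLowerCase().split("/");
        const filename = parts[parts.length - 1];
        let score = 0;
        for (const t of terms) {
          if (filename.includes(t)) score += 2;
          else if (parts.some((p) => p.includes(t))) score += 1;
        }
        return { path, score };
      })
      .filter((r) => r.score > 0)
      // 4. Return the top N matches sorted by score.
      .sort((a, b) => b.score - a.score)
      .slice(0, topN)
  );
}
```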
I should add an architecture doc to the repo, thanks for the nudge.
Is it possible that you are advertising mantic as an MCP tool without it actually being one? Or at least please document how to use it as such.
Mantic is absolutely an MCP server! The installation is documented right at the top of the README with one-click install buttons:
For Cursor:
- Click the "Install in Cursor" badge at the top of the README, or
- Use this deep link: https://cursor.com/en-US/install-mcp?name=mantic&config=eyJ0...
For VS Code:
- Click the "Install in VS Code" badge, or
- Use this deep link: https://vscode.dev/redirect/mcp/install?name=mantic&config=%...
Manual Installation: Add this to your MCP settings (e.g., ~/Library/Application Support/Claude/claude_desktop_config.json):
json { "mcpServers": { "mantic": { "type": "stdio", "command": "npx", "args": ["-y", "mantic.sh@latest", "server"] } } }
Once installed, Claude Desktop (or any MCP client) can call the search_codebase tool to find relevant files before making code changes.
The MCP server implementation is in src/mcp-server.ts if you want to see the code.
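For anyone curious what a minimal stdio MCP server exposing a search_codebase tool can look like, here is a hedged sketch against the @modelcontextprotocol/sdk TypeScript package; the tool name matches the comment above, but the input schema and the rankFiles stub are assumptions, not the contents of src/mcp-server.ts:

```ts
// Minimal stdio MCP server sketch exposing a "search_codebase" tool.
// rankFiles is a stub standing in for the actual path-ranking logic.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

function rankFiles(query: string): { path: string; score: number }[] {
  return []; // placeholder for the structural scorer
}

const server = new McpServer({ name: "mantic-sketch", version: "0.0.0" });

server.tool(
  "search_codebase",
  { query: z.string().describe("What you're looking for, e.g. 'stripe webhook'") },
  async ({ query }) => ({
    content: [{ type: "text" as const, text: JSON.stringify(rankFiles(query), null, 2) }],
  })
);

await server.connect(new StdioServerTransport());
```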