Hey HN,

We've been building Computer Agents for the past year, and today we're sharing it publicly.
The problem we kept hitting: AI can reason impressively, but getting it to do anything meaningful requires duct-taping together a dozen tools. You prompt, copy the output, paste it somewhere, run it manually, fix the errors, repeat. The bottleneck isn't intelligence anymore: it's execution.
What we built: An operating system where AI agents can actually operate. They get isolated environments with a real filesystem, terminal, browser, and the ability to execute code. You give them a task, they figure out how to do it, and they do it.
How it works:
- Each agent runs in an isolated container with its own workspace
- Agents can read/write files, run shell commands, browse the web, and call APIs
- You can run tasks from our web app, API, or messaging apps (Telegram, Discord); a sketch of the API flow follows this list
- Results come back with the actual artifacts: files created, code written, data extracted
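To make the API path concrete, here's a minimal sketch of submitting a task and reading back the artifacts. The endpoint path, request fields, and response shape are illustrative placeholders, not our documented API; the docs link at the bottom has the real interface.

```typescript
// Hypothetical sketch only: endpoint, fields, and response shape are placeholders.
type TaskResult = {
  status: "completed" | "failed";
  artifacts: { path: string; url: string }[]; // files the agent produced
  log: string;                                // step-by-step record of what it did
};

async function runTask(prompt: string, apiKey: string): Promise<TaskResult> {
  const res = await fetch("https://computer-agents.com/api/v1/tasks", { // hypothetical endpoint
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ prompt }), // e.g. "Process this CSV and generate a report"
  });
  if (!res.ok) throw new Error(`task submission failed: ${res.status}`);
  return res.json() as Promise<TaskResult>;
}
```

The same task could just as easily arrive over Telegram or Discord; the API is simply the programmatic path.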
Example tasks our users run:
- "Fix the TypeScript errors in this repo" → agent clones, fixes, tests, commits
- "Research competitors and create a summary doc" → agent browses, extracts, writes
- "Process this CSV and generate a report" → agent analyzes, visualizes, exports
- "Review this PR for security issues" → agent reads diff, analyzes, comments
Tech stack (for the curious):
- Agents run on GCE with Firecracker-style isolation
- Workspaces sync to GCS for persistence
- We use Claude and GPT-4 for the reasoning layer
- MCP (Model Context Protocol) for tool integrations; the overall loop is sketched after this list
- Next.js frontend, Firebase auth, Firestore for state
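For a sense of how these pieces fit together, here's a minimal, generic sketch of the reason-act loop. It is not our exact implementation: `callModel` stands in for a Claude/GPT-4 call, and the stub tools would really execute inside the agent's isolated container.

```typescript
// Generic sketch of the agent loop, not our exact implementation.
type ToolCall = { tool: string; args: Record<string, string> };
type ModelStep =
  | { kind: "tool"; call: ToolCall }
  | { kind: "final"; answer: string };

// Stub tools; the real ones run inside the isolated workspace (filesystem, shell, browser).
const tools: Record<string, (args: Record<string, string>) => Promise<string>> = {
  shell: async ({ cmd }) => `(stub) ran: ${cmd}`,
  read_file: async ({ path }) => `(stub) contents of ${path}`,
};

async function runAgent(
  task: string,
  callModel: (history: string[]) => Promise<ModelStep>, // stands in for Claude / GPT-4
): Promise<string> {
  const history: string[] = [task];
  for (let step = 0; step < 50; step++) {                 // hard cap on agent steps
    const next = await callModel(history);
    if (next.kind === "final") return next.answer;        // model decided it's done
    const run = tools[next.call.tool];
    if (!run) throw new Error(`unknown tool: ${next.call.tool}`);
    const observation = await run(next.call.args);
    history.push(JSON.stringify(next.call), observation); // feed the result back to the model
  }
  throw new Error("step limit exceeded");
}
```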
What surprised us:
The Telegram integration became unexpectedly popular. People run agents from their phone while commuting. "Fix the prod bug" from a train is apparently a thing now.
What we're still figuring out:
- Long-running tasks (hours) and how to handle interruption gracefully
- Cost predictability: agents can go deep, which burns tokens (a rough budget sketch follows this list)
- The right abstraction for multi-agent workflows
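To make the cost concern concrete, here's a rough sketch of a per-task token budget guard. The class, numbers, and behavior are hypothetical, not a shipped feature.

```typescript
// Hypothetical sketch, not a shipped feature: cap tokens per task and
// pause the agent once the budget is spent.
class TokenBudget {
  private used = 0;
  constructor(private readonly limit: number) {}

  record(promptTokens: number, completionTokens: number): void {
    this.used += promptTokens + completionTokens;
  }

  exhausted(): boolean {
    return this.used >= this.limit;
  }
}

// Checked before every model step; when it trips, the agent could checkpoint
// its workspace and ask the user before continuing.
const budget = new TokenBudget(200_000); // e.g. roughly 200k tokens per task
```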
Pricing: Free tier with limited compute, then usage-based. We're not trying to get rich on margins — the goal is making this accessible.
We'd love feedback from HN, especially:
- What would you actually use this for?
- What's missing that would make you switch from your current workflow?
- Anyone else building in this space we should talk to?
Site: https://computer-agents.com
Docs: https://computer-agents.com/documentation
Happy to answer questions about the architecture, our agent design, or anything else.