1 comments

  • cyrill_makarov 3 hours ago

    Finnish Train Introduced A Bug In My App

    And I was only able to solve it because of a real engineering degree

    Ariana's architecture is not the most trivial one. Our backend service spawns VPSs on Hetzner Cloud (what we call Agent Machines) and establishes a temporary SSH connection with each machine to handle the necessary installation and configuration. After that, all communication with that machine happens through HTTP: health checks, CRUD operations, everything.

    I was on a train going from Tampere to Rovaniemi (Arctic Circle) in Finland. Instead of making a wishlist for Santa, I was working on Ariana on the train, and therefore I was running our backend *locally*. I needed my backend to:

    1. Spawn a machine (was working fine)
    2. SSH into the machine, install stuff, start a server, self-healthcheck (was also working fine)
    3. Poll the machine's status over HTTP (suddenly started to fail a few minutes into the journey)
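
    Step 3 can be sketched roughly like this. It is not our actual backend code: the function names, the timeout and the retry counts are placeholders made up for illustration. The only point is that the poller records *how* each request failed (timeout, refused, HTTP error) instead of a bare "failed", and that it skips any proxy settings so it tests the direct path.

```python
import socket
import time
import urllib.error
import urllib.request

# Hypothetical helper: an opener that ignores proxy environment variables,
# so the check exercises the direct route from this machine to the server.
_DIRECT = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def check_health(url: str, timeout: float = 3.0) -> str:
    """Return a short label describing how one health-check request ended."""
    try:
        with _DIRECT.open(url, timeout=timeout) as resp:
            return f"http {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success status.
        return f"http {exc.code}"
    except urllib.error.URLError as exc:
        # Connection-level failures are wrapped; the cause is in exc.reason.
        reason = exc.reason
        if isinstance(reason, (socket.timeout, TimeoutError)):
            return "timeout"
        if isinstance(reason, ConnectionRefusedError):
            return "refused"
        return f"unreachable: {reason}"
    except (socket.timeout, TimeoutError):
        # Timeouts while reading the response can be raised unwrapped.
        return "timeout"
    except OSError as exc:
        return f"unreachable: {exc}"


def poll_until_healthy(url: str, attempts: int = 5, interval: float = 2.0) -> list[str]:
    """Poll until the first 'http 200' and return every outcome seen."""
    outcomes = []
    for _ in range(attempts):
        outcome = check_health(url)
        outcomes.append(outcome)
        if outcome == "http 200":
            break
        time.sleep(interval)
    return outcomes
```

    A call such as `poll_until_healthy("http://203.0.113.10:8080/health")` (address, port and path are invented here) returns the list of outcomes. A run of nothing but `timeout` while the machine reports itself healthy points at the path between client and server rather than at the server.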

    I SSH'd manually into the servers. The services were running. Health checks passing. Logs showing everything was fine on their end. The machines were alive and well, happily waiting for requests that, from their perspective, never arrived.

    I armed myself with Claude Code, and for 4 hours we tried to fix it. We redesigned the server's firewall configuration altogether, changed the open ports, added 15 new env variables. Changed a lot of code of course, because every small detail counts. We tried everything. At some point, I got a message I had never seen from Claude:

    "This is strange" he says...

    Everything seemed to be properly configured.

    I gave up on using Claude, took control of the code myself, and started brainstorming. After some time, I realized that I was using the train's free Wi-Fi network. And you know what? This network just blocked all outbound HTTP to my machines at Hetzner. Why? Why was SSH still working fine? No idea. HTTP and HTTPS connections to other services were working fine, so this was something specific. Exactly the same configuration on Scaleway worked fine.

    I switched to my phone's mobile hotspot, and the bug just disappeared.
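
    In hindsight, a two-minute check from the laptop would have shown the asymmetry. Below is a minimal sketch, not something we ran at the time: `probe_tcp` and `compare_ports` are hypothetical helpers, and the port numbers are placeholders rather than our real configuration. It only tries a plain TCP connect per port, so "open" means the handshake succeeded, not that HTTP itself gets through.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try a plain TCP connect and report 'open', 'refused', 'timeout' or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Something answered with a reset: the host is reachable, the port is closed.
        return "refused"
    except (socket.timeout, TimeoutError):
        # Nothing came back at all: packets are likely dropped somewhere on the path.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


def compare_ports(host: str, ports=(22, 80, 443), timeout: float = 3.0) -> dict[int, str]:
    """Probe several ports on one host so the results can be read side by side."""
    return {port: probe_tcp(host, port, timeout) for port in ports}
```

    Running `compare_ports("203.0.113.10", ports=(22, 8080))` (made-up address and ports) once on the suspect network and once on a different one gives two small dictionaries to compare. If port 22 is `open` on both but the HTTP port is only `open` on the second network, the server is not the problem.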

    Claude never once suggested "try a different network connection." And I don't say this to criticize the tool. I use it constantly, and it's genuinely useful. But this moment crystallized something important for me about the current state of AI in software engineering. It never steps outside the frame.

    The Lessons:

    1. Next time I'm on a Finnish train, I'm bringing a book. Maybe something recommended by one of our investors.
    2. CS degree + critical thinking > AI agent. Moments like this remind me of the value of humans in software engineering.

    Software engineering isn't just about solving problems. It's about _framing_ problems. Knowing which questions to ask, knowing when to zoom out, knowing when to try the stupidly simple thing: that's still a deeply human skill.