2 comments

  • dee66 4 hours ago

    I put together a CLI that estimates infrastructure cost deltas from Terraform plans.

    It works by consuming `terraform show -json` output and evaluating cost changes locally using embedded pricing data. There are no cloud API calls, no IAM roles, and no network access at runtime, which keeps the output deterministic and cheap enough to run in CI.
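    To give a sense of the mechanics, here's a heavily simplified sketch of the core loop. The pricing table below is a stand-in for the embedded data, and the real tool handles far more resource types:

    ```python
    import json
    import sys

    # stand-in for the embedded pricing data shipped with the CLI
    # (illustrative numbers, keyed by resource type + instance type)
    PRICES = {("aws_instance", "t3.micro"): 0.0104}  # USD per hour
    HOURS_PER_MONTH = 730

    def monthly_cost(attrs):
        """Price one resource configuration, or 0 if absent/unknown."""
        if attrs is None:  # "before" is null for creates, "after" for deletes
            return 0.0
        rate = PRICES.get(("aws_instance", attrs.get("instance_type")), 0.0)
        return rate * HOURS_PER_MONTH

    # consumes: terraform show -json tfplan | python estimate.py
    plan = json.load(sys.stdin)
    delta = 0.0
    for rc in plan.get("resource_changes", []):
        if rc["type"] != "aws_instance":
            continue  # sketch covers a single resource type
        change = rc["change"]
        delta += monthly_cost(change.get("after")) - monthly_cost(change.get("before"))

    print(f"estimated monthly delta: ${delta:+.2f}")
    ```

    Everything is stdlib and the pricing lookup is local, which is what makes the output reproducible in CI.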

    The motivation came from a specific frustration: cost feedback usually arrives after deploy, when the change is already merged and accepted as the new baseline. Terraform plans already describe irreversible infrastructure decisions, so estimating cost there makes it part of code review instead of a billing post-mortem.

    Design constraints I cared about:

    - deterministic execution (same input → same output)
    - fully offline (no IAM, no APIs, no telemetry)
    - intended to run before merge, not after deploy
    - directional accuracy rather than billing-exact precision

    Trade-offs (real ones):

    - it can’t model runtime behavior like autoscaling or traffic
    - it’s weak at usage-driven costs (especially egress)
    - it won’t match provider bills to the cent

    What this is explicitly not trying to solve:

    - unexpected bills from existing infrastructure
    - org-wide FinOps dashboards
    - tagging, chargeback, or attribution workflows

    This started as a narrow experiment to answer one question: For Terraform-managed infrastructure, is cost estimation more useful before merge than after deploy?

    One open design question I’m still unsure about: Right now it stays strictly with `terraform show -json` to preserve determinism. Is the UX win of parsing HCL directly worth the loss in reliability, or is JSON-only the right trade-off?

      dsteenman 3 hours ago

      been working in this space too and your design constraints are solid. the JSON-only approach is the right call: parsing HCL directly introduces so many edge cases (modules, dynamic blocks, provider aliasing) that you trade away the determinism you're optimizing for. terraform's JSON output is already the normalized representation of what will actually run.

      one thing i ended up doing differently: i built this as a github app that pulls live pricing from the AWS Pricing API instead of embedding it. that means the estimates stay current without cutting a release, but you lose the offline guarantee.
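      for a concrete sense of that trade-off, the live lookup is roughly this shape via boto3's `pricing` client (simplified sketch, not the exact production code):

      ```python
      import json
      import boto3

      # the Pricing API is only served out of a few regions; us-east-1 works
      pricing = boto3.client("pricing", region_name="us-east-1")

      def on_demand_hourly(instance_type: str, location: str) -> float:
          """Fetch the current on-demand Linux hourly rate for an EC2 type."""
          resp = pricing.get_products(
              ServiceCode="AmazonEC2",
              Filters=[
                  {"Type": "TERM_MATCH", "Field": "instanceType", "Value": instance_type},
                  {"Type": "TERM_MATCH", "Field": "location", "Value": location},
                  {"Type": "TERM_MATCH", "Field": "operatingSystem", "Value": "Linux"},
                  {"Type": "TERM_MATCH", "Field": "tenancy", "Value": "Shared"},
                  {"Type": "TERM_MATCH", "Field": "preInstalledSw", "Value": "NA"},
                  {"Type": "TERM_MATCH", "Field": "capacitystatus", "Value": "Used"},
              ],
              MaxResults=1,
          )
          # PriceList entries come back as JSON strings; unwrap the single
          # on-demand term to get the USD rate
          product = json.loads(resp["PriceList"][0])
          term = next(iter(product["terms"]["OnDemand"].values()))
          dimension = next(iter(term["priceDimensions"].values()))
          return float(dimension["pricePerUnit"]["USD"])

      print(on_demand_hourly("t3.micro", "US East (N. Virginia)"))
      ```

      needs credentials and a network hop, which is exactly the offline guarantee you give up.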

      i actually started with AWS CDK support, since there wasn't really any cost estimation tooling there and i develop infra mainly with cdk. terraform support came later, once the architecture was in place, and it turned out to be relatively straightforward once the pricing lookup layer was abstracted.
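      the abstraction is nothing fancy: one interface both pricing backends implement, so the CDK and terraform frontends don't care where rates come from (illustrative, names made up):

      ```python
      from typing import Protocol

      class PricingSource(Protocol):
          """anything that can answer: what does this SKU cost per hour?"""
          def hourly_rate(self, service: str, sku: str, region: str) -> float: ...

      class EmbeddedPricing:
          """offline table shipped with the tool: deterministic, can go stale."""
          def __init__(self, table: dict[tuple[str, str, str], float]):
              self.table = table

          def hourly_rate(self, service: str, sku: str, region: str) -> float:
              return self.table.get((service, sku, region), 0.0)

      class LivePricing:
          """live Pricing API lookups: always current, needs network + creds."""
          def hourly_rate(self, service: str, sku: str, region: str) -> float:
              raise NotImplementedError  # e.g. get_products, as sketched above

      def estimate_monthly(resources: list[tuple[str, str, str]],
                           prices: PricingSource) -> float:
          # each frontend (terraform plan JSON, CDK synth output) normalizes
          # resources into (service, sku, region) tuples before this point
          return sum(prices.hourly_rate(*r) * 730 for r in resources)
      ```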

      i also came to the same conclusion on usage-driven costs btw: right now i only do baseline infra cost, not traffic or autoscaling. one idea i'm mulling over: let users drop a config file with approximate usage patterns and use that for estimates (rough sketch at the bottom of this comment). not perfect, but better than nothing for things like lambda invocations or data transfer. wrote up some of my learnings and the trade-offs here:

      https://cloudburn.io/blog/cloudburn-lessons-learned
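
      to make the usage-config idea concrete, i'm imagining something like this (all names and numbers are made up; real unit prices would come from the pricing layer):

      ```python
      # what a user-supplied usage file might declare, e.g. a usage.yml:
      #   lambda_invocations_per_month: 5_000_000
      #   avg_lambda_duration_ms: 120
      #   egress_gb_per_month: 200
      USAGE = {
          "lambda_invocations_per_month": 5_000_000,
          "avg_lambda_duration_ms": 120,
          "egress_gb_per_month": 200,
      }

      # illustrative unit prices, not authoritative
      PER_MILLION_INVOCATIONS = 0.20  # USD
      PER_GB_SECOND = 0.0000166667    # USD
      PER_EGRESS_GB = 0.09            # USD

      def usage_cost(usage: dict, memory_gb: float = 0.128) -> float:
          """rough monthly cost of the usage-driven pieces, on top of baseline."""
          invocations = usage["lambda_invocations_per_month"]
          gb_seconds = invocations * (usage["avg_lambda_duration_ms"] / 1000) * memory_gb
          return (
              invocations / 1_000_000 * PER_MILLION_INVOCATIONS
              + gb_seconds * PER_GB_SECOND
              + usage["egress_gb_per_month"] * PER_EGRESS_GB
          )

      print(f"~${usage_cost(USAGE):.2f}/month of usage-driven cost")
      ```

      the obvious failure mode is users guessing their usage wrong, but a labeled guess in the diff still beats silence.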