Cloud Migration · 3 weeks

Critical service rewritten in Go

Problem

The service in question handled real-time eligibility checks — a synchronous call that sat in the critical path of every user-facing transaction. It was colocated with the main Node.js application on a containerised platform, which meant it shared the same autoscaling group, the same cold-start characteristics, and the same billing profile as the rest of the stack. The problem was that this service had fundamentally different traffic patterns: it ran infrequently during off-hours but needed to burst to several thousand requests per minute during peak windows without any warm-up period.

Because the service lived inside the monolithic container fleet, the team had over-provisioned the entire application tier to accommodate its latency requirements. P99 response times were sitting at 850ms — acceptable for some parts of the product, but far too high for eligibility checks, where a slow response directly blocked a user action. Scaling the fleet to bring that number down meant paying for capacity that the rest of the application simply did not need. The monthly compute bill had crept to $320 just for this one service, with no clear ceiling as traffic continued to grow.

What we did

We extracted the eligibility logic into a standalone Go service and deployed it as a 2nd-generation Cloud Function. The choice of Go was deliberate: it compiles to a small static binary, which keeps cold starts well under 100ms; its concurrency model handles bursty invocations without the overhead of a thread-per-request model; and the standard library covers everything this service needed: HTTP handling, JSON serialisation, and lightweight connection pooling to the upstream data source. There was no framework to configure and no runtime to manage.

Cloud Functions 2nd gen gave us per-invocation billing, a concurrency setting of up to 1,000 simultaneous requests per instance, and direct integration with Cloud Build for our CI pipeline. We wrote the build and deploy steps as a single cloudbuild.yaml — on merge to main, Cloud Build compiles the binary, runs the test suite, and promotes the new revision with zero-downtime traffic splitting. The entire deployment pipeline, including integration tests against a staging instance, completes in under four minutes. We kept the interface contract identical to the original so the calling application required no changes beyond swapping the endpoint URL.
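A pipeline like the one described could be sketched as a cloudbuild.yaml along these lines. The function name, region, and runtime version here are assumptions, not the client's actual configuration; only the `--gen2`, `--runtime`, and `--trigger-http` flags are standard `gcloud functions deploy` options.

```yaml
steps:
  # Run the test suite before anything is deployed
  - name: golang:1.22
    entrypoint: go
    args: ["test", "./..."]

  # Deploy the new revision as a 2nd-gen function
  # (function name and region below are illustrative)
  - name: gcr.io/google.com/cloudsdktool/cloud-sdk
    entrypoint: gcloud
    args:
      - functions
      - deploy
      - eligibility-check
      - --gen2
      - --runtime=go122
      - --trigger-http
      - --region=us-central1
```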

Result

The rewrite brought P99 latency from 850ms down to 45ms, measured end-to-end from the caller’s perspective including network round-trip. Cold starts — the scenario we were most cautious about given the bursty traffic pattern — came in consistently under 80ms in production, which is below the threshold that users perceive as a delay. At peak load we observed sustained throughput above 4,000 invocations per minute across automatically scaled instances, with no manual intervention and no pre-warming required.

The cost reduction was the most striking result to present to the client. By moving from a provisioned container fleet to pay-per-invocation, the monthly compute cost for this service dropped from $320 to $12 — a 96% reduction. Idle periods cost effectively nothing. The client now has a service that is independently deployable, independently scalable, and isolated from any incident or configuration change that affects the main application tier. The three-week timeline included the rewrite, full test coverage, the CI pipeline, and a staged rollout with a two-day observation window before full traffic cutover.

Key highlights

  • P99 latency dropped from 850ms to 45ms
  • Monthly compute cost reduced from $320 to $12
  • Cold start under 80ms — no user-visible delay
  • Handles 4,000+ invocations/minute at peak

Tech stack

Go · Cloud Functions (2nd gen) · Cloud Build
