Level Up Coding


How I Scaled a Go Backend to Handle 1 Million Requests per Second

From 100 Requests to 1 Million: My Journey in Scaling a Go Backend

Renaldi Purwanto
Published in Level Up Coding
4 min read · Mar 19, 2025

Photo by Roman Synkevych on Unsplash

It started as a simple API — just a lightweight Go service handling user authentication and processing payments. Initially, it worked flawlessly with a few thousand requests per day. But as traffic grew, my once snappy backend became sluggish. Latency skyrocketed, database queries became bottlenecks, and the server struggled under high load.

One day, we got featured on a popular news site. Within minutes, traffic surged by 10x. My Go backend was barely keeping up, and the infrastructure team was screaming: “We need more servers!”

That’s when I stepped up and optimized our system to handle 1 million requests per second (RPS). Here’s how I did it.

1. Optimizing Goroutines for True Concurrency

At first, I thought Go’s goroutines were magic. “They’re lightweight! They scale infinitely!” — or so I believed. However, I quickly realized that spawning too many goroutines led to excessive CPU context switching and memory exhaustion.
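To make the overload concrete, here is a minimal sketch of the naive pattern described above: one goroutine per job. The job count and the trivial "work" are illustrative, but the shape is the point — every job allocates a fresh goroutine stack and competes for the scheduler, which is what breaks down at scale.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	const numJobs = 100000 // illustrative job count

	var processed atomic.Int64
	var wg sync.WaitGroup

	for j := 1; j <= numJobs; j++ {
		wg.Add(1)
		// A brand-new goroutine for every single job. Each one costs a
		// stack (a few KB minimum) plus scheduler bookkeeping, so at
		// millions of jobs this balloons memory and context switching.
		go func(id int) {
			defer wg.Done()
			processed.Add(1) // stand-in for real work
		}(j)
	}
	wg.Wait()
	fmt.Printf("processed %d jobs with %d goroutines\n", processed.Load(), numJobs)
}
```

At this demo's scale it completes fine; the failure mode only appears once the goroutine count dwarfs the number of CPU cores by orders of magnitude, which is what the worker pool below prevents.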

Fixing Goroutine Overload with a Worker Pool

Instead of launching unlimited goroutines, I built a worker pool to process requests efficiently.

package main

import (
	"fmt"
	"sync"
	"time"
)

const maxWorkers = 100 // Controls concurrency level
const numJobs = 1000000

type Job struct {
	ID int
}

func worker(id int, jobs <-chan Job, wg *sync.WaitGroup) {
	defer wg.Done()
	for job := range jobs {
		// Simulating CPU-intensive processing
		time.Sleep(1 * time.Millisecond)
		fmt.Printf("Worker %d processed job %d\n", id, job.ID)
	}
}

func main() {
	jobs := make(chan Job, numJobs)
	var wg sync.WaitGroup

	// Start a fixed pool of workers.
	for w := 1; w <= maxWorkers; w++ {
		wg.Add(1)
		go worker(w, jobs, &wg)
	}

	// Enqueue all jobs, then close the channel so workers drain it and exit.
	for j := 1; j <= numJobs; j++ {
		jobs <- Job{ID: j}
	}
	close(jobs)
	wg.Wait()
}

Why it works: Instead of spawning millions of goroutines, I limited concurrency to maxWorkers, reducing context switching and improving CPU efficiency.
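A related way to bound concurrency, which may be worth knowing alongside the worker pool, is a buffered channel used as a counting semaphore: each task acquires a slot before starting, so at most a fixed number run at once. This is a hedged sketch, not the author's production code; the names (maxConcurrent, numJobs) are illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	const maxConcurrent = 100 // cap on in-flight goroutines
	const numJobs = 1000      // illustrative job count

	sem := make(chan struct{}, maxConcurrent) // counting semaphore
	var wg sync.WaitGroup

	for j := 1; j <= numJobs; j++ {
		wg.Add(1)
		sem <- struct{}{} // acquire: blocks once maxConcurrent jobs are in flight
		go func(id int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			_ = id * id              // stand-in for real work
		}(j)
	}
	wg.Wait()
	fmt.Println("all jobs done")
}
```

The trade-off: a worker pool reuses a fixed set of goroutines and suits long-running services, while the semaphore pattern spawns a goroutine per task but still caps how many run concurrently.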


