How I Scaled a Go Backend to Handle 1 Million Requests per Second
From 100 Requests to 1 Million: My Journey in Scaling a Go Backend
It started as a simple API — just a lightweight Go service handling user authentication and processing payments. Initially, it worked flawlessly with a few thousand requests per day. But as traffic grew, my once snappy backend became sluggish. Latency skyrocketed, database queries became bottlenecks, and the server struggled under high load.
One day, we got featured on a popular news site. Within minutes, traffic surged by 10x. My Go backend was barely keeping up, and the infrastructure team was screaming: “We need more servers!”
That’s when I stepped up and optimized our system to handle 1 million requests per second (RPS). Here’s how I did it.
1. Optimizing Goroutines for True Concurrency
At first, I thought Go’s goroutines were magic. “They’re lightweight! They scale infinitely!” — or so I believed. However, I quickly realized that spawning too many goroutines led to excessive CPU context switching and memory exhaustion.
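To make the failure mode concrete, here is a minimal sketch of the naive pattern (illustrative only, not my production handler): one goroutine per job, with nothing bounding how many are alive at once.

package main

import (
	"sync"
	"time"
)

const numJobs = 1000000

func main() {
	var wg sync.WaitGroup
	// One goroutine per job: a million stacks allocated up front,
	// with the scheduler juggling all of them at once.
	for j := 1; j <= numJobs; j++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			time.Sleep(1 * time.Millisecond) // stand-in for real work
		}(j)
	}
	wg.Wait()
}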
Fixing Goroutine Overload with a Worker Pool
Instead of launching unlimited goroutines, I built a worker pool to process requests efficiently.
package main

import (
	"fmt"
	"sync"
	"time"
)

const maxWorkers = 100 // Caps how many jobs are processed concurrently
const numJobs = 1000000

type Job struct {
	ID int
}

func worker(id int, jobs <-chan Job, wg *sync.WaitGroup) {
	defer wg.Done()
	for job := range jobs {
		// Stand-in for real per-job work (time.Sleep blocks the
		// goroutine without burning CPU)
		time.Sleep(1 * time.Millisecond)
		fmt.Printf("Worker %d processed job %d\n", id, job.ID)
	}
}

func main() {
	// Buffered so the producer can enqueue every job without blocking
	jobs := make(chan Job, numJobs)
	var wg sync.WaitGroup

	// Start a fixed pool of workers instead of one goroutine per job
	for w := 1; w <= maxWorkers; w++ {
		wg.Add(1)
		go worker(w, jobs, &wg)
	}

	// Enqueue the work, then close the channel so workers drain it and exit
	for j := 1; j <= numJobs; j++ {
		jobs <- Job{ID: j}
	}
	close(jobs)
	wg.Wait()
}
Why it works: Instead of spawning millions of goroutines, I capped concurrency at maxWorkers, reducing context switching and keeping memory usage bounded.
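A worker pool is not the only way to enforce that cap. A buffered channel used as a counting semaphore gives the same bound with less plumbing; this is an alternative sketch I'm including for comparison, not what I actually shipped:

package main

import (
	"sync"
	"time"
)

const maxConcurrent = 100 // Same cap as maxWorkers above

func main() {
	sem := make(chan struct{}, maxConcurrent) // counting semaphore
	var wg sync.WaitGroup
	for j := 1; j <= 1000000; j++ {
		sem <- struct{}{} // blocks while maxConcurrent jobs are in flight
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			defer func() { <-sem }()         // release the slot
			time.Sleep(1 * time.Millisecond) // stand-in for real work
		}(j)
	}
	wg.Wait()
}

The trade-off: the semaphore version still creates one short-lived goroutine per job, while the worker pool reuses a fixed set of goroutines, which matters when you're pushing millions of jobs through.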