How I Scaled a Go Backend to Handle 1 Million Requests per Second
From 100 Requests to 1 Million: My Journey in Scaling a Go Backend
It started as a simple API — just a lightweight Go service handling user authentication and processing payments. Initially, it worked flawlessly with a few thousand requests per day. But as traffic grew, my once snappy backend became sluggish. Latency skyrocketed, database queries became bottlenecks, and the server struggled under high load.
One day, we got featured on a popular news site. Within minutes, traffic surged by 10x. My Go backend was barely keeping up, and the infrastructure team was screaming: “We need more servers!”
That’s when I stepped up and optimized our system to handle 1 million requests per second (RPS). Here’s how I did it.
1. Optimizing Goroutines for True Concurrency
At first, I thought Go’s goroutines were magic. “They’re lightweight! They scale infinitely!” — or so I believed. However, I quickly realized that spawning too many goroutines led to excessive CPU context switching and memory exhaustion.
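To make the failure mode concrete, here is a minimal sketch of the naive pattern (illustrative only, not my production handler): one goroutine per job, with nothing bounding how many are alive at once.

package main

import (
	"sync"
	"time"
)

const numJobs = 1000000

func main() {
	var wg sync.WaitGroup
	// One goroutine per job: a million stacks allocated up front,
	// with the scheduler juggling all of them at once.
	for j := 1; j <= numJobs; j++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			time.Sleep(1 * time.Millisecond) // stand-in for real work
		}(j)
	}
	wg.Wait()
}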
Fixing Goroutine Overload with a Worker Pool
Instead of launching unlimited goroutines, I built a worker pool to process requests efficiently.
package main

import (
	"fmt"
	"sync"
	"time"
)

const maxWorkers = 100 // Caps how many jobs are processed concurrently
const numJobs = 1000000

type Job struct {
	ID int
}

func worker(id int, jobs <-chan Job, wg *sync.WaitGroup) {
	defer wg.Done()
	for job := range jobs {
		// Stand-in for real per-job work (time.Sleep blocks the
		// goroutine without burning CPU)
		time.Sleep(1 * time.Millisecond)
		fmt.Printf("Worker %d processed job %d\n", id, job.ID)
	}
}

func main() {
	// Buffered so the producer can enqueue every job without blocking
	jobs := make(chan Job, numJobs)
	var wg sync.WaitGroup

	// Start a fixed pool of workers instead of one goroutine per job
	for w := 1; w <= maxWorkers; w++ {
		wg.Add(1)
		go worker(w, jobs, &wg)
	}

	// Enqueue the work, then close the channel so workers drain it and exit
	for j := 1; j <= numJobs; j++ {
		jobs <- Job{ID: j}
	}
	close(jobs)
	wg.Wait()
}
Why it works: Instead of spawning millions of goroutines, I capped concurrency at maxWorkers, reducing context switching and keeping memory usage bounded.
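A worker pool is not the only way to enforce that cap. A buffered channel used as a counting semaphore gives the same bound with less plumbing; this is an alternative sketch I'm including for comparison, not what I actually shipped:

package main

import (
	"sync"
	"time"
)

const maxConcurrent = 100 // Same cap as maxWorkers above

func main() {
	sem := make(chan struct{}, maxConcurrent) // counting semaphore
	var wg sync.WaitGroup
	for j := 1; j <= 1000000; j++ {
		sem <- struct{}{} // blocks while maxConcurrent jobs are in flight
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			defer func() { <-sem }()         // release the slot
			time.Sleep(1 * time.Millisecond) // stand-in for real work
		}(j)
	}
	wg.Wait()
}

The trade-off: the semaphore version still creates one short-lived goroutine per job, while the worker pool reuses a fixed set of goroutines, which matters when you're pushing millions of jobs through.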