Scaling techniques in the stack for servers with high connection rates


One Line Summary

This talk will describe some techniques for scaling front-end servers. In particular, we will describe the use of SO_REUSEPORT in scaling servers for high connection rates with a single listening port.


The networking stack does not scale well under a high connection rate to a single TCP port, or a high packet rate to an unconnected UDP port. For instance, a web server listening on port 80 would use a single listening socket, which becomes a bottleneck when several threads attempt to accept connections in parallel. The single socket for a UDP port (for instance, a DNS server on port 53) creates a similar bottleneck. The sources of these bottlenecks are contention on the socket lock and cacheline bouncing of the socket structure.

To address this problem, we have implemented SO_REUSEPORT. This socket option allows multiple listener sockets to be bound to the same TCP port, or multiple unconnected sockets to be bound to the same UDP port. With SO_REUSEPORT, each listener thread in a server can have its own socket, which it can use without contention. The kernel demultiplexes incoming connection requests or packets under fine-grained locking, and they can be distributed evenly amongst the listener threads for good load distribution.
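
As a rough illustration of the per-thread socket pattern described above, the following is a minimal sketch and not the server code discussed in this talk. It assumes a Linux kernel with SO_REUSEPORT support (3.9 or later) and Python's standard socket module; the helper names make_reuseport_socket and make_socket_group are hypothetical and exist only for this example.

```python
import socket


def make_reuseport_socket(host, port, sock_type=socket.SOCK_STREAM, backlog=128):
    """Create one socket bound to (host, port) with SO_REUSEPORT set.

    Hypothetical helper for illustration. The option must be set before
    bind() on every socket that is to share the port.
    """
    sock = socket.socket(socket.AF_INET, sock_type)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    if sock_type == socket.SOCK_STREAM:
        # TCP: each socket is an independent listener with its own accept queue.
        sock.listen(backlog)
    return sock


def make_socket_group(host, count, sock_type=socket.SOCK_STREAM):
    """Bind `count` sockets to a single port, one per intended worker thread.

    Port 0 lets the kernel choose a free port for the first socket; the
    remaining sockets then bind to that same port.
    """
    first = make_reuseport_socket(host, 0, sock_type)
    port = first.getsockname()[1]
    rest = [make_reuseport_socket(host, port, sock_type) for _ in range(count - 1)]
    return port, [first] + rest
```

In a server, each worker thread would take one socket from the group and call accept() (TCP) or recvfrom() (UDP) on it alone, so that no two threads contend for the same socket. A real server would bind a fixed port such as 80 or 53 rather than port 0, which is used here only so the sketch can run without privileges.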

We have applied this technique to our web servers and DNS servers and have seen gains in QPS as well as better load balance across threads.


  • Ying Cai



    Host Networking engineer at Google.
