Sockets in the Linux Kernel - Part 2: UDP Socket Lookup on Rx
In this article series I'd like to explore the implementation of sockets in the Linux Kernel and the source code surrounding them. While most of my previous articles focused primarily on OSI Layer 3, this series will attempt a dive into OSI Layer 4. Due to the sheer complexity, it is very easy for readers of the kernel source code to not see the forest for the trees. My intent here is to throw a lifeline, to help you navigate the code and to hold on to what is essential.
Articles of the series
- Sockets in the Linux Kernel - Part 3: UDP Sockets, SO_REUSEPORT and Kernel v6.13
Overview
In this 2nd article of the series, I'd like to dive into the socket lookup for UDP packets, which determines the socket on which an incoming UDP packet will actually be received. While separate receive paths exist for IPv4 and for IPv6 packets, the socket lookup still has a lot of joint components which are used on both receive paths. Further, the overall way the lookup works is kept nearly identical in both cases. I'll walk through the general packet flow, the involved components like the socket implementation and the UDP socket lookup table, and finally the socket lookup itself. While the implementation of the socket lookup for the TCP protocol also has many similarities and joint components compared to UDP, I intentionally only cover UDP here; otherwise this article would simply become too long. I'll try to cover the TCP socket lookup in a separate future article.
Rx Packet Flow
Figure 1 visualizes the receive path of unicast UDP packets in the Linux kernel, encapsulated either in IPv4 or IPv6, from initial packet reception in the NIC driver until being added to the queue of a receiving socket, with special focus on the socket lookup. Please compare this figure to its counterpart in my previous article Routing Decisions in the Linux Kernel - Part 1: Lookup and packet flow, in which I focus on what happens to received network packets on L3.

In this example here the routing decision determines that the received IPv4/IPv6 packet is to be locally received and not forwarded. Based on that decision, an indirect function call to the local receive handler is performed, which calls ip_local_deliver() in case of IPv4 and ip6_input() in case of IPv6. In both cases, the packet now traverses the Netfilter Input hook and is then demultiplexed based on the L4 protocol it carries. I described this in the 1st article of this series Sockets in the Linux Kernel - Part 1: L4 Protocol Demultiplexing on Rx. In this case here it is determined to be a UDP packet and thereby function udp_rcv() is called in case of IPv4 and udp6_rcv() in case of IPv6. Both functions perform some initial checks of the UDP header, checksum, … and then call function __udp4_lib_lookup() (IPv4) or function __udp6_lib_lookup() (IPv6) respectively, which represents the actual socket lookup. Both implementations use a central hash table as back-end, which I'll describe in detail in one of the next sections. If a matching socket is found, the packet is added to the receive queue of that socket. Else, an ICMP/ICMPv6 error reply message is sent back to the sender of the packet.
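To make this more tangible, here is a heavily condensed sketch of the logic inside __udp4_lib_rcv(), which udp_rcv() wraps; this is not the kernel's literal code, and checksum handling, multicast/broadcast and encapsulation paths are omitted:

/* Heavily condensed sketch of the logic inside __udp4_lib_rcv()
 * (net/ipv4/udp.c); checksum handling, multicast/broadcast and
 * encapsulation paths are omitted. */
int __udp4_lib_rcv_sketch(struct sk_buff *skb, struct udp_table *udptable)
{
    struct net *net = dev_net(skb->dev);
    const struct iphdr *iph = ip_hdr(skb);
    struct udphdr *uh = udp_hdr(skb);
    struct sock *sk;

    /* the actual socket lookup, based on the packet's addresses/ports */
    sk = __udp4_lib_lookup(net, iph->saddr, uh->source,
                           iph->daddr, uh->dest,
                           inet_iif(skb), inet_sdif(skb), udptable, skb);
    if (sk)
        return udp_unicast_rcv_skb(sk, skb, uh); /* enqueue to socket */

    /* no matching socket found: tell the sender */
    icmp_send(skb, ICMP_DEST_UNREACH, ICMP_PORT_UNREACH, 0);
    kfree_skb(skb);
    return 0;
}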
Socket structs
Before getting to the actual socket lookup, I'd like to give you a tour of the building blocks which are involved here, first of all the sockets themselves. In the kernel those are represented by a complex object-oriented hierarchy of structs. The implementation represents a deep rabbit hole and I will only go as deep as necessary here.

Figure 2 shows the instances of structs which are allocated and initialized in the kernel when you use the socket(2) syscall to create yourself a UDP socket in the domain of IPv4, then use the bind(2) syscall to bind this socket to a local IP address and port and then use the connect(2) syscall to connect this socket to a peer IP address and port.
You should be familiar with those syscalls, as they represent the basic tools of socket programming.
All involved kernel socket structs got far more member variables than shown in Figure 2. I only show the ones here which I consider relevant for the topic at hand. So, let's walk through this example: The socket(2) syscall, represented by function __sys_socket() in the kernel, allocates an instance of struct socket and an instance of struct udp_sock, which both hold pointers to each other. The latter includes a hierarchy of sub-structs struct inet_sock, struct sock and struct sock_common, as shown in Figure 2. The bind(2) syscall, represented by function inet_bind() in the kernel (in case of IPv4), here binds the socket to 127.0.0.1:8080 and saves this IP address in member variable skc_rcv_saddr and the port 8080 in member variable skc_num. It calculates and saves hashes and adds your socket to two different hash tables (more on that in the section below). This results in the bind port 8080 additionally being saved as hash in member variable skc_u16hashes[0]. That detail is relevant, as that member variable is used in the socket lookup.
The connect(2) syscall, represented by function inet_dgram_connect() in the kernel (in case of UDP), here connects the socket to peer 127.0.0.2:12345 and saves this peer IP address in member skc_daddr and the peer port in skc_dport.
Please be aware that those member variables are frequently accessed under several different alias names instead of their actual name in different parts of the code. Most of the structs in this hierarchy got a bunch of preprocessor defines which serve as alias names or “shortcuts” to access member variables of the structs which are lower in the hierarchy. See e.g. this list of defines which is part of struct sock. In this article I'll always use the actual name and not one of these alias names when referring to a member variable.
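Here is an abbreviated excerpt of those defines as part of struct sock (include/net/sock.h); the selection of lines is mine:

/* Abbreviated excerpt from struct sock (include/net/sock.h): alias
 * names for members of the embedded struct sock_common. */
struct sock {
    struct sock_common __sk_common;
#define sk_num          __sk_common.skc_num
#define sk_dport        __sk_common.skc_dport
#define sk_daddr        __sk_common.skc_daddr
#define sk_rcv_saddr    __sk_common.skc_rcv_saddr
#define sk_family       __sk_common.skc_family
#define sk_v6_daddr     __sk_common.skc_v6_daddr
#define sk_v6_rcv_saddr __sk_common.skc_v6_rcv_saddr
    /* ... many more members ... */
};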

Figure 3 shows the very same example as Figure 2; however, this time for a UDP socket in the domain of IPv6. It shows you the instances of structs which are being allocated and initialized in the kernel when you use the socket(2), bind(2) and connect(2) syscalls in the same way as described above, just this time with IPv6 addresses. Let's walk through this IPv6 variant of the example:
The socket(2) syscall allocates an instance of struct socket and an instance of struct udp6_sock, which both hold pointers to each other. The latter includes nearly the same hierarchy of sub-structs as in the IPv4 UDP example. The struct udp6_sock merely acts as a wrapper around the original hierarchy consisting of struct udp_sock, struct inet_sock, struct sock and struct sock_common and adds an additional struct ipv6_pinfo. The bind(2) syscall, represented by function inet6_bind() in the kernel (in case of IPv6), here binds the socket to [2001:db8::1]:8080 and saves this IP address in member variable skc_v6_rcv_saddr. It saves the port 8080 in member skc_num and in skc_u16hashes[0] and adds the socket to the two already mentioned hash tables.
The connect(2) syscall here connects the socket to peer [2001:db8::2]:12345 and saves this peer IP address in member skc_v6_daddr and the peer port in member skc_dport.
UDP hash tables

The UDP socket lookup table is implemented as a global instance of struct udp_table whose members are allocated and initialized in function udp_table_init(), which is called by function udp_init() during early kernel boot, see Figure 4. It holds two separate hash tables in its members *hash and *hash2. The entries (“buckets”) of those hash tables are represented by instances of struct udp_hslot (that struct has a memory footprint/alignment of 16 bytes). Both hash tables possess the same number of entries, which can vary between min 2^8 and max 2^16 entries and which is determined dynamically during allocation depending on available memory on the system. On one of my systems I observed them each to possess 2^14 = 16384 entries. The allocation size and number of entries can easily be observed, because the kernel outputs a log message during early boot which appears in your dmesg and journald logs:
$ journalctl -b 0 -k -g 'UDP hash table entries'
UDP hash table entries: 16384 (order: 7, 524288 bytes, linear)
# meaning:
#   16384 buckets in *hash,  16 bytes each
#   16384 buckets in *hash2, 16 bytes each
#   16384 * 32 bytes = 524288 bytes
Further, the (read-only) sysctl udp_hash_entries can also show you that same number of entries (despite struct udp_table being used to look up IPv4 as well as IPv6 UDP sockets, this sysctl is located in net.ipv4 and there is no equivalent sysctl in net.ipv6):
$ sudo sysctl net.ipv4.udp_hash_entries
net.ipv4.udp_hash_entries = 16384
As shown in Figure 5 and mentioned in the sections above, the bind(2) syscall adds your socket to both of these hash tables. Thereby these two tables do in fact hold all UDP sockets on the system which are bound to a network address and port. This includes IPv4 and IPv6 sockets, connected and unconnected sockets, and sockets of all network namespaces.

[Figure 5: The bind(2) syscall adds the socket to *hash based on netns + port and to *hash2 based on netns + address + port.]
Each table has a slightly different purpose. The udp_table.hash table is used to look up a socket based on the local port number it binds to. The udp_table.hash2 table is used to look up a socket based on the local address and port number it binds to. It is this second table which is used by the socket lookup for received UDP packets (more on that in the next section). Both tables implement the usual hash table patterns, which means a socket lookup in one of them consists of two steps: First a hash is calculated, in case of udp_table.hash based on the network namespace + a port number and in case of udp_table.hash2 based on the network namespace + an IP address + a port number. The calculated hash is used as array index to determine the correct “bucket” in which the socket you are searching for resides. Each bucket contains a doubly linked list of socket instances. A second step then loops through the socket instances in that bucket and compares socket member variables like the bind address and port to find the correct match. The socket structs contain the usual connectors (*next and *prev pointers) for doubly linked lists, by which means the bind(2) syscall adds them to the doubly linked list of their bucket; see again Figures 2 and 3 where I mention those. The 2nd step of the lookup uses the common container_of() pointer magic to obtain a pointer to the actual socket struct from the connector pointer. In case you are not yet familiar with those hash table patterns, I recommend taking a look at my article Connection tracking (conntrack) - Part 2: Core Implementation, where I explain the hash table of the connection tracking system (in more detail), which essentially implements the same pattern.
Option: Network namespace with individual instance of struct udp_table
As mentioned above, by default only one global instance of struct udp_table exists and is used in all network namespaces. Each network namespace still keeps an individual pointer to that global instance in net->ipv4.udp_table. This pointer is initialized in udp_set_table(), which is called at creation time of each network namespace. The socket lookup actually uses this pointer to access udp_table. However, a feature has been added with kernel 6.2 which optionally enables network namespaces to allocate their own individual instance of struct udp_table. To activate this, you need to set sysctl net.ipv4.udp_child_hash_entries to a value n between 7 and 16 and then create a child network namespace. That child network namespace will then allocate its own instance of udp_table with 2^n entries/buckets; see again function udp_set_table(). You can confirm this by reading sysctl net.ipv4.udp_hash_entries inside the child network namespace.
By default this feature is switched off and the involved sysctls will look like this:
net.ipv4.udp_child_hash_entries = 0
net.ipv4.udp_hash_entries = 16384     # in main network namespace
net.ipv4.udp_hash_entries = -16384    # in other network namespaces
If switched on and n is set to 8, the involved sysctls will look like this:
net.ipv4.udp_child_hash_entries = 8
net.ipv4.udp_hash_entries = 16384     # in main network namespace
net.ipv4.udp_hash_entries = 256       # in child network namespace
Socket lookup IPv4
Now we are finally getting into the meat of things. The actual socket lookup for locally received IPv4 UDP packets is implemented in function __udp4_lib_lookup() and visualized in Figure 6.

It mainly consists of two successive lookups into udp_table.hash2[]. The first lookup calculates a hash based on the network namespace in which the packet has been received and the destination IP address and destination UDP port of the packet. Thereby the correct bucket inside the hash table is determined. Then function udp4_lib_lookup2() is called, which implements looping through all sockets in the doubly linked list of that bucket to find the one socket which matches best. It calls compute_score() for each of those sockets which, as the name suggests, computes a “score” for each one and lets the highest score win. To reach the minimum score which is considered a match, a socket must at least match with the received network packet regarding its network namespace and the IPv4 address and UDP port it binds to. For this comparison, the socket struct member variables skc_net, skc_rcv_saddr and skc_u16hashes[0] are used; see once again Figure 2. If it is a connected socket and the IPv4 address and UDP port of the peer it is connected to match the source IP address and source port of the packet, then it achieves a higher score. Here the socket struct members skc_daddr and skc_dport are compared (this comparison only takes place if those members are not 0, i.e. if the socket actually is connected). There are further socket member variables which are checked here and can make it achieve an even higher score, but those are corner cases, e.g. a socket which binds to a specific network device via socket option SO_BINDTODEVICE, and I won't go into that here. Thereby, the best matching socket is determined and if it actually is a connected socket, then the whole lookup ends here returning that successful match. If a matching socket has been found which is not connected, and thereby only matches the destination IP address and destination port of the packet, then this is also considered a successful match and the lookup ends here, too. However, as you can see in Figure 6, an optional eBPF lookup happens in between those checks (see info box below for details on that).
If all that didn't produce a match, a 2nd lookup is done into udp_table.hash2[]. This time the hash is calculated based on the network namespace in which the packet has been received, the destination UDP port of the packet and the “any” IP address 0.0.0.0, thereby searching for a matching socket which binds to the “any” address (= to all local addresses). The hash determines the bucket and once again function udp4_lib_lookup2() is called, to loop through the sockets of that bucket to find the best (if any) match. For sockets which bind to the any address, their member variable skc_rcv_saddr is obviously set to 0.0.0.0. Function udp4_lib_lookup2() here does not compare this member to the actual destination IP address of the network packet, but instead checks whether it is indeed 0.0.0.0.
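To summarize the scoring logic of the first lookup, here is a simplified sketch; the types, names and score values are made up for illustration, and further criteria of the real compute_score() (bound device, dual-stack handling, ...) are omitted:

/* Simplified sketch of the scoring idea implemented by compute_score()
 * (net/ipv4/udp.c); types, names and score values are illustrative. */
struct usock { void *netns; unsigned int rcv_saddr, daddr;
               unsigned short num, dport; };
struct upkt  { void *netns; unsigned int saddr, daddr;
               unsigned short sport, dport; };

static int compute_score_sketch(const struct usock *sk, const struct upkt *p)
{
    int score = 0;

    /* mandatory: netns, bound port and bound address must match */
    if (sk->netns != p->netns || sk->num != p->dport ||
        sk->rcv_saddr != p->daddr)
        return -1;                       /* not a match at all */
    score += 2;

    /* connected socket: peer address/port must match, too, but then
     * the socket scores higher than an unconnected one */
    if (sk->daddr) {
        if (sk->daddr != p->saddr || sk->dport != p->sport)
            return -1;
        score += 2;
    }
    return score;
}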
Some additional remarks on how to interpret what happens here:
Based on the sequence of steps described here, you might at first glance get the impression that a connected socket here always wins against a not connected socket and that a socket which binds to a specific local IP address always wins against a socket which binds to the any address. Neither is actually the case here. Don't confuse this with TCP. While with TCP an established socket (= a socket which is already connected to a peer) which binds to local address A and port B can coexist with another (“listening”) socket which also binds to local address A and port B, the socket API doesn't allow this for UDP (an exception is socket option SO_REUSEPORT, but in that case the lookup anyway works a little differently than described here; I'll try to cover that in one of the next articles). If both cannot coexist at the same time on the system, then also no one can “win” against the other. The same goes for sockets which bind to the any (0.0.0.0) address and port B. If such a socket exists, no other UDP socket can exist at the same time on the system which binds to a specific local IP address and also to port B. The bind(2) syscall won't allow it. So, it is not relevant that the 2nd lookup which checks for the any address actually happens after the lookup for a specific address. Two separate lookups are merely necessary here due to the nature of the hash table, as the specific IP address (or the any address) is part of the calculated hash. You cannot check for both with just one lookup.
eBPF sk_lookup
As shown in Figure 6, an eBPF program can be loaded into the kernel to be executed at this sk_lookup hook. This can be used to override the regular socket lookup and to select a target socket for the network packet to be received at, based on other criteria than the default ones covered by the regular lookup. The initial lookup based on netns + dstAddr + dstPort is still performed before that eBPF hook; matches to connected UDP sockets thereby cannot be overridden by an eBPF program, while matches to unconnected UDP sockets can. An eBPF program attached to this hook is loaded into a specific network namespace. In other words, each network namespace has its own individual eBPF hook here. I collected some useful links for those who'd like to get deeper into this topic.
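To give you an idea what such a program looks like, here is a minimal sketch modeled on the kernel's selftests; the map name, function name and port number are my own choice, and the user-space part (inserting a socket into the map, attaching the program to a network namespace) is omitted:

/* Minimal sk_lookup eBPF program sketch: steers all UDP lookups for
 * local port 53 to one pre-inserted socket in a sockmap. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_SOCKMAP);
    __uint(max_entries, 1);
    __type(key, __u32);
    __type(value, __u64);
} target_sock SEC(".maps");

SEC("sk_lookup")
int steer_dns(struct bpf_sk_lookup *ctx)
{
    __u32 key = 0;
    struct bpf_sock *sk;
    long err;

    if (ctx->local_port != 53)     /* only override lookups for port 53 */
        return SK_PASS;
    sk = bpf_map_lookup_elem(&target_sock, &key);
    if (!sk)
        return SK_PASS;            /* fall back to the regular lookup */
    err = bpf_sk_assign(ctx, sk, 0);
    bpf_sk_release(sk);
    return err ? SK_DROP : SK_PASS;
}

char _license[] SEC("license") = "GPL";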
Socket lookup IPv6
Despite being implemented separately from the IPv4 lookup described above, the socket lookup for locally received IPv6 UDP packets, visualized in Figure 7, works nearly identically to its IPv4 counterpart.

The entire lookup is implemented in function __udp6_lib_lookup(). It performs the very same two successive lookups into udp_table.hash2, with the optional eBPF socket lookup in between. The looping through the hash table buckets here is implemented in function udp6_lib_lookup2(), which internally uses an IPv6 variant of the compute_score() function to compute the score of matching sockets. The only real difference to the IPv4 implementation is the obvious fact that IPv6 addresses are compared here. So here the destination IPv6 address of the received UDP packet is compared to the socket struct member skc_v6_rcv_saddr and in case of connected sockets, the source IPv6 address of the network packet is compared to member skc_v6_daddr. Additionally, a socket will only be considered a match if it belongs to address family (domain) PF_INET6 (which is equal to AF_INET6 and represents the domain set by the socket(2) syscall). That value is stored in socket member skc_family and clearly categorizes a socket as an IPv6 (or dual-stack) socket. Please compare all this to Figure 3 which shows all these socket member variables.
Dual-stack sockets
When you create an IPv6 UDP socket and bind it to an existing local address and port and not to the any address [::], then this socket is thereby (as you would expect) an IPv6-only socket. Syscall bind(2) in this case sets socket member skc_ipv6only:1 to 1 (take another look at Figure 3), which is equivalent to setting socket option IPV6_V6ONLY to true. However, if you create an IPv6 UDP socket and bind it to the IPv6 any address [::], then you thereby create a dual-stack socket, which is able to handle IPv4 as well as IPv6 traffic. Precondition is sysctl net.ipv6.bindv6only being set to 0 (default). This sysctl represents the default value for socket option IPV6_V6ONLY. Thus, either by setting sysctl net.ipv6.bindv6only=1 before creating a socket or by explicitly setting socket option IPV6_V6ONLY to true for the socket in question before binding it to the IPv6 any address, you can still enforce this socket to be IPv6-only. This works, because the socket(2) syscall initializes socket struct member skc_ipv6only:1 with the current value of net.ipv6.bindv6only and after that you still have the opportunity to change the value of that member by setting socket option IPV6_V6ONLY.
This subtle difference can be made visible with the iproute2 ss tool. Here is an example where ss lists 3 UDP sockets which all bind to the any address:
$ ss -l
Netid  State   ...  Local Address:Port  ...
udp    UNCONN  ...        0.0.0.0:111   ...   # IPv4 socket (AF_INET)
udp    UNCONN  ...           [::]:53    ...   # IPv6-only socket (AF_INET6)
udp    UNCONN  ...              *:80    ...   # dual-stack socket (AF_INET6)
But how does the socket lookup work in case of a dual-stack socket? To answer this, we first need to explain how the socket struct member variables are initialized by the socket(2) and bind(2) syscalls in this case:
socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP) = 3
bind(3, {sa_family=AF_INET6, sin6_port=htons(8080), sin6_flowinfo=htonl(0), \
     inet_pton(AF_INET6, "::", &sin6_addr), sin6_scope_id=0}, 28) = 0
Please compare this to the example shown in Figure 3. I'll only explain here what happens differently in case of a dual-stack socket:
The bind(2) syscall here binds the socket to the IPv6 any address [::] and saves this IP address in member variable skc_v6_rcv_saddr. It additionally sets member skc_rcv_saddr to zero, which means it sets it to the IPv4 any address 0.0.0.0. It doesn't touch member skc_ipv6only:1 and leaves it at its value 0 already set by the socket(2) syscall. As a result we now have a socket of protocol family AF_INET6 (PF_INET6) which binds to both [::] and 0.0.0.0 → a dual-stack socket.
In case an IPv4 UDP packet with a matching destination port number is received, the IPv4 UDP socket lookup is performed as described above and let's say we arrived now at the 2nd hash table lookup which is based on netns + address 0.0.0.0 + the packet's destination port (Figure 6). A minor detail I didn't mention so far is that function udp4_lib_lookup2(), when looping through the sockets in the hash table bucket, actually does not limit its search to IPv4 sockets (sockets of protocol family AF_INET/PF_INET). It also considers IPv6 sockets (sockets of protocol family AF_INET6/PF_INET6) as a match, and for those it checks member skc_ipv6only:1 to be 0 (that member is 0 for IPv4 sockets anyway and its value for IPv6 sockets is handled as described above). The compute_score() function merely gives a PF_INET6 socket a little lower score than a PF_INET socket (= than a “real” IPv4 socket). Thus, an IPv4 UDP packet can indeed find a match here, assuming that everything else, like the destination port number, fits.
In case an IPv6 UDP packet with a matching destination port number is received, the IPv6 UDP socket lookup is performed exactly as described in the section above and its 2nd hash table lookup, which checks for the any address [::], produces a match (Figure 7).
IPv4-mapped IPv6 addresses
In case you are wondering how you can handle receiving IPv4 UDP packets on an AF_INET6 socket: When doing socket programming with IPv4 (AF_INET), IP addresses and ports are handled in form of struct sockaddr_in with its member sin_addr of type struct in_addr, which holds an actual IPv4 address. When doing socket programming with IPv6 (AF_INET6), IP addresses and ports are handled in form of struct sockaddr_in6 with its member sin6_addr of type struct in6_addr, which holds an actual IPv6 address. Dual-stack sockets are AF_INET6 sockets and thereby still always use the latter structs for addresses. IPv4 addresses are handled here in form of IPv4-mapped IPv6 addresses, which embed an IPv4 address 1.2.3.4 into a pseudo IPv6 address ::ffff:1.2.3.4, so it can be stored within an instance of struct in6_addr. Those addresses are specified in RFC 4291:
|                80 bits               | 16 |      32 bits        |
+--------------------------------------+--------------------------+
|0000..............................0000|FFFF|    IPv4 address     |
+--------------------------------------+----+---------------------+
IPv4-mapped IPv6 addresses never actually appear on the wire (= within network packet headers). They are merely meant for special use cases like this one. So if you receive an IPv4 UDP packet on a dual-stack socket with syscall recvfrom(), you will be provided with an instance of struct sockaddr_in6 which holds the source IPv4 address of the received packet in form of an IPv4-mapped IPv6 address.
Context
This article describes the source code and behavior of the Linux Kernel v6.12.
Kernel 6.13
This is a perfect example of how quickly documentation ages. While I was still working on this article, kernel 6.13 got released (2025-01-19), which introduced a significant change to the UDP socket lookup. After reviewing this change, I decided to keep this article as it is. No other significant change to the UDP socket lookup had been introduced in quite a while, so this article correctly describes the implementation and behavior of a whole bunch of kernel versions, up to and including 6.12. Further, 6.12 is an LTS kernel which thereby will continue to be used by many people in the foreseeable future. The change introduced with 6.13 does not replace or heavily modify the implementation of 6.12. Instead it adds a 3rd hash table and yet another lookup, and this change is mostly only relevant for connected sockets in combination with the socket option SO_REUSEPORT. I intend to cover both SO_REUSEPORT and the change of 6.13 in the next article of this series. For now, I'll give you an appetizer of what this change in 6.13 is all about. The mails on the netdev mailing list introducing the patchset describe it quite well:
In addition to the existing two hash tables *hash and *hash2, a third hash table has been added to struct udp_table, named *hash4. It gets the same number of entries as the other two hash tables and its key feature is that it provides a hash lookup based on netns and a 4-tuple, thus srcAddr, srcPort, dstAddr, dstPort of a received UDP packet, implemented in new function udp4_lib_lookup4(). This lookup is added to the existing __udp4_lib_lookup() function before all other hash lookups; compare to Figure 6. Its intent is to improve performance of the overall lookup in cases where a big number of connected sockets exists. That is commonly the case when running a UDP server application which handles a big number of connected sockets and is using socket option SO_REUSEPORT. Sockets are added to the *hash4 table by syscall connect(2), so they can be found quicker during the lookup, based on the 4-tuple. The same new behavior has been added to the IPv6 UDP socket lookup in function __udp6_lib_lookup() by newly added function udp6_lib_lookup4(); compare to Figure 7.
Feedback
Feedback to this article is very welcome! Please be aware that I did not develop or contribute to any of the software components described here. I'm merely some developer who took a look at the source code and did some practical experimenting. If you find something which I might have misunderstood or described incorrectly here, I would be very grateful if you brought it to my attention; of course I'll then fix my content asap accordingly.
published 2025-03-02, last modified 2025-03-02