blog:linux:connection_tracking_1_modules_and_hooks
Differences
This shows you the differences between two versions of the page.
blog:linux:connection_tracking_1_modules_and_hooks [2021-04-21] – added buzzword "conntrack" to header Andrej Stender | blog:linux:connection_tracking_1_modules_and_hooks [2023-08-15] (current) – improved/fixed some statements, typos Andrej Stender | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | {{tag> | + | {{tag> |
====== Connection tracking (conntrack) - Part 1: Modules and Hooks ====== | ====== Connection tracking (conntrack) - Part 1: Modules and Hooks ====== | ||
~~META: | ~~META: | ||
date created = 2021-04-04 | date created = 2021-04-04 | ||
~~ | ~~ | ||
- | |||
- | ~~NOTOC~~ | ||
With this article series I would like to take a closer look at the connection tracking subsystem of the Linux kernel, | With this article series I would like to take a closer look at the connection tracking subsystem of the Linux kernel, | ||
which provides the basis for features like stateful packet filtering and NAT. | which provides the basis for features like stateful packet filtering and NAT. | ||
I refer to it as the "ct system" | I refer to it as the "ct system" | ||
- | It is not my intention to replace or repeat existing documentation. Great articles on the topic already exist, however most of them are a little bit dated; see [[# | + | It is not my intention to replace or repeat existing documentation. Great articles on the topic already exist, however most of them are a little bit dated; see [[# |
===== Articles of the series ===== | ===== Articles of the series ===== | ||
- | * [[connection_tracking_1_modules_and_hooks|Connection tracking - Part 1: Modules and Hooks]] | + | * [[connection_tracking_1_modules_and_hooks|Connection tracking |
- | * [[connection_tracking_2_core_implementation|Connection tracking - Part 2: Core Implementation]] | + | * [[connection_tracking_2_core_implementation|Connection tracking |
- | * Connection tracking - Part 3: Connection States | + | * [[connection_tracking_3_state_and_examples|Connection tracking |
===== Overview ===== | ===== Overview ===== | ||
What is the purpose of connection tracking and what does it do? Once activated, connection tracking (the ct system inside the Linux kernel) examines IPv4 and/or IPv6 network packets and their payload, with the intention of determining which packets are associated with each other, e.g. in the scope of a connection-oriented protocol like TCP. The ct system performs this task as a transparent observer and does not take an active part in the communication between endpoints. It is not relevant to the ct system whether the endpoints of a connection are local or remote. They could be located on remote hosts, in which case the ct system would observe them while running on a host which is merely routing or bridging the packets of a particular connection. Alternatively, | What is the purpose of connection tracking and what does it do? Once activated, connection tracking (the ct system inside the Linux kernel) examines IPv4 and/or IPv6 network packets and their payload, with the intention of determining which packets are associated with each other, e.g. in the scope of a connection-oriented protocol like TCP. The ct system performs this task as a transparent observer and does not take an active part in the communication between endpoints. It is not relevant to the ct system whether the endpoints of a connection are local or remote. They could be located on remote hosts, in which case the ct system would observe them while running on a host which is merely routing or bridging the packets of a particular connection. Alternatively, | ||
- | The ct system maintains an up-to-date (live) list of all tracked connections. Based on that it " | + | The ct system maintains an up-to-date (live) list of all tracked connections. Based on that it " |
The ct system itself does never alter/ | The ct system itself does never alter/ | ||
Line 35: | Line 33: | ||
nft add rule ip filter forward iif eth0 ct state established accept | nft add rule ip filter forward iif eth0 ct state established accept | ||
</ | </ | ||
- | < | + | < |
<code bash> | <code bash> | ||
nft add table ip filter | nft add table ip filter | ||
Line 76: | Line 74: | ||
===== Netfilter hooks ===== | ===== Netfilter hooks ===== | ||
- | Like Iptables and Nftables, the ct system is built on top of the Netfilter framework. It implements | + | Like Iptables and Nftables, the ct system is built on top of the Netfilter framework. It implements |
- | If you are not yet very familiar with Netfilter hooks, better first take a look at my other article [[nftables_packet_flow_netfilter_hooks_detail|Nftables - Packet flow and Netfilter hooks in detail]], before proceeding here. From the bird's eye view, the famous | + | If you are not yet very familiar with Netfilter hooks, better first take a look at my other article [[nftables_packet_flow_netfilter_hooks_detail|Nftables - Packet flow and Netfilter hooks in detail]], before proceeding here. From the bird's eye view, the //Netfilter Packet Flow// image shown in Figure {{ref> |
<figure nfpackflowofficial> | <figure nfpackflowofficial> | ||
Line 84: | Line 82: | ||
</ | </ | ||
- | The blocks named // | + | The blocks named // |
===== Module nf_conntrack ===== | ===== Module nf_conntrack ===== | ||
- | Let's get back to the example above and take a look at the kernel module of the ct system | + | Let's get back to the example above and take a look at the kernel module of the ct system |
- | '' | + | '' |
- | The Nftables rules shown in the example above specify //address family// '' | + | The Nftables rules shown in Figure {{ref> |
<figure nfcthooks1> | <figure nfcthooks1> | ||
{{ : | {{ : | ||
< | < | ||
- | The four conntrack | + | The four conntrack |
</ | </ | ||
</ | </ | ||
- | While function '' | + | While function '' |
- | Both functions internally do reference counting. This means that in the current network namespace, maybe one, maybe several kernel components at some point require connection tracking and thereby call '' | + | Both functions internally do reference counting. This means that in the current network namespace, maybe one, maybe several kernel components at some point require connection tracking and thereby call '' |
- | ==== The main ct hook callbacks | + | ==== The main ct hook functions |
- | The two hook callbacks | + | The two hook functions |
- | // | + | |
- | are the very same // | + | |
- | their placement... the one in the // | + | |
- | on the network while the one in the //Output// hook handles outgoing packets generated on this host. | + | |
- | These two can be considered the " | + | |
- | of what the ct system does with traversing network packets happens inside | + | |
- | them... analyzing and associating packets with tracked connections, | + | |
- | ==== The help+confirm | + | ==== The help+confirm |
- | Another two hook callbacks | + | Another two hook functions |
hook and in the // | hook and in the // | ||
- | MAX means the highest possible unsigned integer value. A callback | + | MAX means the highest possible unsigned integer value. A hook function |
priority will be traversed as the very last one within the Netfilter hook and no | priority will be traversed as the very last one within the Netfilter hook and no | ||
- | other callback | + | other hook function |
here are not shown in Figure {{ref> | here are not shown in Figure {{ref> | ||
some internal thing which is not worth mentioning on the bird's eye view. | some internal thing which is not worth mentioning on the bird's eye view. | ||
Line 123: | Line 114: | ||
both is their placement in the Netfilter hooks, which makes sure that ALL | both is their placement in the Netfilter hooks, which makes sure that ALL | ||
network packets, no matter if incoming/ | network packets, no matter if incoming/ | ||
- | of them as the very last thing after having traversed all other callbacks. | + | of them as the very last thing after having traversed all other hook functions. |
- | I refer to them as the //conntrack " | + | I refer to them as the //conntrack " |
series, hinting that they have two independent purposes. One is to execute | series, hinting that they have two independent purposes. One is to execute | ||
" | " | ||
specific use cases and I won't cover that topic in the scope of this first | specific use cases and I won't cover that topic in the scope of this first | ||
- | article. The second is to " | + | article. The second is to " |
- | the sections below on what that means. | + | I'll elaborate on what that means in the sections below. |
<WRAP round info> | <WRAP round info> | ||
Only in recent kernel versions by the time of writing (here kernel v5.10.19) | Only in recent kernel versions by the time of writing (here kernel v5.10.19) | ||
- | both mentioned features " | + | both mentioned features, " |
- | within the same hook callbacks. Not too long ago both still existed in form of | + | within the same hook functions. Not too long ago both still existed in form of |
- | separate ct callbacks | + | separate ct hook functions |
- | Netfilter hooks: The " | + | Netfilter hooks: The " |
- | 300 and the " | + | 300 and the " |
See e.g. [[https:// | See e.g. [[https:// | ||
[[https:// | [[https:// | ||
Line 145: | Line 136: | ||
===== Modules nf_defrag_ipv4/ | ===== Modules nf_defrag_ipv4/ | ||
- | As shown above, module '' | + | As shown in Figure {{ref> |
<figure nfdefraghooks1> | <figure nfdefraghooks1> | ||
{{ : | {{ : | ||
- | < | + | < |
</ | </ | ||
</ | </ | ||
- | Like the ct system itself, those defrag modules do not become globally active on module load. They export (=provide) functions '' | + | Like the ct system itself, those defrag modules do not become globally active on module load. They export (=provide) functions '' |
- | The ct system' | + | Figure {{ref> |
+ | This function is being registered | ||
+ | The ct system' | ||
===== Hooks Summary ===== | ===== Hooks Summary ===== | ||
Figure {{ref> | Figure {{ref> | ||
- | //contrack// and | + | //conntrack// and |
- | // | + | // |
of Iptables. For completeness I also show the priority values here. This should | of Iptables. For completeness I also show the priority values here. This should | ||
provide for a comfortable comparison to what you see in the official //Netfilter | provide for a comfortable comparison to what you see in the official //Netfilter | ||
Line 166: | Line 159: | ||
<figure nfhooks-complete1> | <figure nfhooks-complete1> | ||
{{ : | {{ : | ||
- | < | + | < |
</ | </ | ||
Line 174: | Line 167: | ||
like that. Thus, showing the old but well known Iptables chains still seemed | like that. Thus, showing the old but well known Iptables chains still seemed | ||
like the most pragmatic thing to do. | like the most pragmatic thing to do. | ||
- | The important thing which Figure {{ref> | + | The important thing which Figure {{ref> |
- | They all first traverse one of the // | + | They all first traverse one of the // |
- | or the //Output// hook. This ensures that these callbacks | + | or the //Output// hook. This ensures that these function(s) |
before the ct system is able to see them. After that, the packets traverse a potential | before the ct system is able to see them. After that, the packets traverse a potential | ||
Iptables chain of the raw table (if existing / in use) and then one of the main // | Iptables chain of the raw table (if existing / in use) and then one of the main // | ||
- | callbacks | + | hook functions |
which are commonly used for packet filtering, are traversed after that. Then, as the very | which are commonly used for packet filtering, are traversed after that. Then, as the very | ||
- | last thing the packets traverse one of the //conntrack " | + | last thing the packets traverse one of the //conntrack " |
===== How it works... ===== | ===== How it works... ===== | ||
- | I know... so far I kept beating around the bush. Now let's finally talk about how the ct system actually operates and what it does to network packets traversing its hook callbacks. Please be aware that what I describe in this section are the basics and does not cover all what the ct system actually does. The ct system maintains the connections which it is tracking in a central table. Each tracked connection is represented by an instance of '' | + | I know... so far I kept beating around the bush. Now let's finally talk about how the ct system actually operates and what it does to network packets traversing its hook functions. Please be aware that what I describe in this section are the basics and does not cover all what the ct system actually does. The ct system maintains the connections which it is tracking in a central table. Each tracked connection is represented by an instance of '' |
- | - It is either | + | - It is either part of or related to one of its tracked connections. |
- | - It is the first packet of a new connection which is not yet tracked. | + | - It is the first seen packet of a connection which is not yet tracked. |
- It is an invalid packet, which is broken or doesn' | - It is an invalid packet, which is broken or doesn' | ||
- It is marked as NOTRACK, which tells the ct system to ignore it. | - It is marked as NOTRACK, which tells the ct system to ignore it. | ||
Line 195: | Line 188: | ||
and //Input// hooks and then is received by a local socket. As pointed out | and //Input// hooks and then is received by a local socket. As pointed out | ||
in the previous section, what the ct system does here also applies to outgoing | in the previous section, what the ct system does here also applies to outgoing | ||
- | or forwarded network packets as well. Thus, no need for separate | + | or forwarded network packets as well. Thus, no need for additional |
<figure nfct-lookup> | <figure nfct-lookup> | ||
{{ : | {{ : | ||
< | < | ||
- | Network packet traversing ct main callback | + | Network packet traversing ct main hook function |
ct table finds that packet belongs to already tracked connection, | ct table finds that packet belongs to already tracked connection, | ||
packet is given pointer to that connection. | packet is given pointer to that connection. | ||
Line 207: | Line 200: | ||
Figure {{ref> | Figure {{ref> | ||
- | being part of an already tracked connection. When that packet traverses the main // | + | being part of an already tracked connection. When that packet traverses the main // |
+ | Further, the OSI layer 4 protocol of the packet is now being analyzed and latest protocol state and details are saved to its tracked connection instance. Then the packet continues on its way through other hook functions | ||
<figure nfct-new> | <figure nfct-new> | ||
{{ : | {{ : | ||
< | < | ||
- | Packet traversing ct main callback | + | Packet traversing ct main hook function |
finds no match, packet is considered first one of new connection, new connection | finds no match, packet is considered first one of new connection, new connection | ||
is created and packet is given pointer to it, new connection is later " | is created and packet is given pointer to it, new connection is later " | ||
- | and added to ct table in " | + | and added to ct table in " |
</ | </ | ||
</ | </ | ||
Line 221: | Line 215: | ||
Figure {{ref> | Figure {{ref> | ||
being the first one representing a new connection which is not yet tracked by the ct system. | being the first one representing a new connection which is not yet tracked by the ct system. | ||
- | When that packet traverses the main // | + | When that packet traverses the main // |
- | it passes the already mentioned validity checks. However, in this case the lookup in the ct table (1) does not find a matching connection. As a result, the ct system considers the packet to be the first one of a new connection. A new instance of '' | + | it passes the already mentioned validity checks. However, in this case the lookup in the ct table (1) does not find a matching connection. As a result, the ct system considers the packet to be the first one of a new connection((To be precise: The first one the ct system has //seen// from that connection. That does not necessarily mean that this always must be the actual very first packet of a new connection, because there might be cases where the ct system for whatever reason did not see the first few packets of an actual connection and kind-of starts tracking in the middle of an already existing connection.)). A new instance of '' |
+ | Further, | ||
Figure {{ref> | Figure {{ref> | ||
- | The very last thing that packet traverses before being received by a local socket is the //conntrack “help+confirm”// | + | The very last thing that packet traverses before being received by a local socket is the //conntrack “help+confirm”// |
+ | But even if a client, who is trying to establish e.g. a TCP connection by sending a TCP SYN packet, is behaving normally, it would still send out several TCP SYN packets as retransmissions if it does not receive any reply from the peer side. Thus, if you have a '' | ||
- | The third possibility is, that the ct system considers a packet as // | + | The third possibility is, that the ct system considers a packet as // |
However, it is not the job of the ct system to drop invalid packets((However, | However, it is not the job of the ct system to drop invalid packets((However, | ||
- | The fourth possibility is a means for other kernel components like Nftables to mark packets with a "do not track" bit((Actually that bit is named '' | + | The fourth possibility is a means for other kernel components like Nftables to mark packets with a "do not track" bit((Actually that bit is named '' |
Line 244: | Line 240: | ||
initialized as described in the sections above, a new connection is first added to | initialized as described in the sections above, a new connection is first added to | ||
the // | the // | ||
- | dropped before reaching the ct system' | + | dropped before reaching the ct system' |
connection is removed from the list and deleted. If the packet however passes | connection is removed from the list and deleted. If the packet however passes | ||
- | the // | + | the // |
- | and is marked as " | + | and is marked as " |
- | will be considered " | + | |
of time strongly depends on network protocol, state and traffic behavior of that | of time strongly depends on network protocol, state and traffic behavior of that | ||
connection. Once " | connection. Once " | ||
Line 280: | Line 275: | ||
* [[http:// | * [[http:// | ||
* [[http:// | * [[http:// | ||
+ | * [[https:// | ||
* [[https:// | * [[https:// | ||
* [[http:// | * [[http:// | ||
===== Continue with next article ===== | ===== Continue with next article ===== | ||
- | [[connection_tracking_2_core_implementation|Connection tracking - Part 2: Core Implementation]] | + | [[connection_tracking_2_core_implementation|Connection tracking |
+ | |||
+ | |||
+ | //published 2021-04-04//, | ||
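
The four packet categories described in the "How it works..." section (part of or related to a tracked connection, first seen packet of a new connection, invalid, NOTRACK) can be acted upon from Nftables. The following is only a rough sketch and not taken from the article itself: the table and chain names, the DNS and SSH ports and the choice of rules are assumptions made for illustration. It relies on the raw priority (-300) being traversed before the main conntrack hook function (-200) in the //Prerouting// hook, and it needs root privileges.

```bash
# Assumption: illustrative table/chain names and ports; run as root.

# Chain in the Prerouting hook at priority -300 (raw), which is traversed
# BEFORE the main conntrack hook function (priority -200).
nft add table ip raw
nft add chain ip raw prerouting '{ type filter hook prerouting priority -300; }'

# Category 4: mark incoming DNS queries as NOTRACK, so the ct system ignores them.
nft add rule ip raw prerouting udp dport 53 notrack

# Filter chain in the Input hook at priority 0, traversed AFTER the main
# conntrack hook function, so "ct state" expressions can be evaluated here.
nft add table ip filter
nft add chain ip filter input '{ type filter hook input priority 0; }'

# Category 3: the ct system only classifies packets as invalid;
# dropping them is left to a rule like this one.
nft add rule ip filter input ct state invalid drop

# Category 1: packets which are part of or related to a tracked connection.
nft add rule ip filter input ct state established,related accept

# Category 2: first seen packet of a not yet tracked connection (here: SSH).
nft add rule ip filter input tcp dport 22 ct state new accept
```

A packet accepted by the last rule still belongs to an unconfirmed connection at that point; the connection is only added to the ct table once the packet reaches the //conntrack "help+confirm"// hook function at the end of the //Input// hook.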
blog/linux/connection_tracking_1_modules_and_hooks.1618986230.txt.gz · Last modified: 2021-04-21 by Andrej Stender