{{tag>linux kernel netfilter conntrack nftables iptables}}
====== Connection tracking (conntrack) - Part 1: Modules and Hooks ======
~~META:
date created = 2021-04-04
~~
  
With this article series I would like to take a closer look at the connection tracking subsystem of the Linux kernel,
  * [[connection_tracking_1_modules_and_hooks|Connection tracking (conntrack) - Part 1: Modules and Hooks]]
  * [[connection_tracking_2_core_implementation|Connection tracking (conntrack) - Part 2: Core Implementation]]
  * [[connection_tracking_3_state_and_examples|Connection tracking (conntrack) - Part 3: State and Examples]]
  
===== Overview =====
What is the purpose of connection tracking and what does it do? Once activated, connection tracking (the ct system inside the Linux kernel) examines IPv4 and/or IPv6 network packets and their payload, with the intention to determine which packets are associated with each other, e.g. in the scope of a connection-oriented protocol like TCP. The ct system performs this task as a transparent observer and does not take an active part in the communication between endpoints. It is not relevant for the ct system whether the endpoints of a connection are local or remote. They could be located on remote hosts, in which case the ct system would observe them while running on a host which is merely routing or bridging the packets of a particular connection. Alternatively, one or even both endpoints could be local sockets on the very same host where the ct system is running. It makes no difference.
The ct system maintains an up-to-date (live) list of all tracked connections. Based on that it "categorizes" network packets while they are traversing the kernel network stack, by supplying each one with a reference (a pointer) to one of its tracked connection instances. As a result, other kernel components can access this connection association and make decisions based on it. The two most prominent candidates which make use of that are the NAT subsystem and the stateful packet filtering / stateful packet inspection (SPI) modules of Iptables and Nftables.
The ct system itself never alters/manipulates packets. It usually also never drops packets; however, that can happen in certain rare cases. When inspecting packet content, its main focus is on OSI layers 3 and 4. It is able to track TCP, UDP, ICMP, ICMPv6, SCTP, DCCP and GRE connections. Obviously, the ct system's definition of a "connection" is not limited to connection-oriented protocols, as several of the protocols just mentioned are not connection-oriented. It e.g. considers and handles an ICMP echo-request plus echo-reply (ping) as a "connection". The ct system provides several helper/extension components, which extend its tracking abilities into the application layer and e.g. track protocols like FTP, TFTP, IRC, PPTP, SIP, … Those are the basis for further use cases like [[wp>Application_Layer_Gateway|Application Layer Gateways]].
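If you want to observe that live list of tracked connections yourself, the ''conntrack'' command-line tool can dump and monitor the ct system's central table. A minimal sketch, assuming the //conntrack-tools// package is installed:

<code bash>
# Dump the current content of the ct system's central table
conntrack -L

# Restrict the dump to tracked TCP connections
conntrack -L -p tcp

# Monitor connection tracking events (new/update/destroy) as they happen
conntrack -E
</code>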
  
<code bash>
nft add rule ip filter forward iif eth0 ct state established accept
</code>
<caption>Example, adding Nftables rules with //CONNTRACK EXPRESSIONS// to a chain named //forward//((Obviously you first would have to create that chain and a table ...
<code bash>
nft add table ip filter
===== Netfilter hooks =====
Like Iptables and Nftables, the ct system is built on top of the Netfilter framework. It implements hook functions to be able to observe network packets and registers those with the Netfilter hooks.
If you are not yet very familiar with Netfilter hooks, you had better first take a look at my other article [[nftables_packet_flow_netfilter_hooks_detail|Nftables - Packet flow and Netfilter hooks in detail]] before proceeding here. From a bird's eye view, the //Netfilter Packet Flow// image shown in Figure {{ref>nfpackflowofficial}}, which was created by Netfilter developers and can thereby be considered official documentation, already gives a good hint of what is going on.
  
<figure nfpackflowofficial>
  
===== Module nf_conntrack =====
Let's get back to the example above and take a look at the kernel module of the ct system itself. When the first Nftables rule containing a //CONNTRACK EXPRESSION// is being added to the ruleset of your current network namespace, the Nftables code (indirectly) triggers loading of kernel module ''nf_conntrack'' as described above, if not already loaded. After that, the Nftables code calls
''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_proto.c#L583|nf_ct_netns_get()]]''. This is a function which is exported (=provided) by the just loaded ''nf_conntrack'' module. When called, it registers the hook functions of the ct system with the Netfilter hooks of the current network namespace.
The Nftables rules shown in Figure {{ref>nftctex1}} specify //address family// ''ip''. Thus, in that case the ct system registers the four hook functions shown in Figure {{ref>nfcthooks1}} with the IPv4 Netfilter hooks. In case of //address family// ''ip6'', the ct system would instead register the same four hook functions with the Netfilter hooks of IPv6. In case of //address family// ''inet'', it would register its hook functions with both the IPv4 and the IPv6 Netfilter hooks.
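As a sketch of those three cases (the table and chain names below are made up for illustration), the //address family// of the table holding your //CONNTRACK EXPRESSION// rule determines which Netfilter hooks the ct system registers its hook functions with:

<code bash>
# family "ip": ct hook functions get registered with the IPv4 hooks only
nft add table ip demo4
nft add chain ip demo4 forward '{ type filter hook forward priority 0; }'
nft add rule ip demo4 forward ct state established accept

# family "ip6": the same hook functions, registered with the IPv6 hooks
nft add table ip6 demo6
nft add chain ip6 demo6 forward '{ type filter hook forward priority 0; }'
nft add rule ip6 demo6 forward ct state established accept

# family "inet": registered with both the IPv4 and the IPv6 hooks
nft add table inet demo46
nft add chain inet demo46 forward '{ type filter hook forward priority 0; }'
nft add rule inet demo46 forward ct state established accept
</code>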
  
<figure nfcthooks1>
  
==== The main ct hook functions ====
The two hook functions which get registered with priority -200 in the //Prerouting// hook and in the //Output// hook in Figure {{ref>nfcthooks1}} are the very same //conntrack// hook functions shown in the official //Netfilter Packet Flow image// in Figure {{ref>nfpackflowofficial}}. Internally, both of them do (nearly) the same thing. Wrapped by some outer functions which do slightly different things, the major function called by both of them is ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1793|nf_conntrack_in()]]''. Thus, the major difference between the two is merely their placement... the one in the //Prerouting// hook handles packets received on the network, while the one in the //Output// hook handles outgoing packets generated on this host. These two can be considered the "main" hook functions of the ct system, because most of what the ct system does with traversing network packets happens inside them... analyzing and associating packets with tracked connections, then supplying those packets with a reference (pointer) to tracked connection instances... . I'll elaborate on that in more detail in the sections below.
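The effect of that priority -200 placement can be made visible from a ruleset. The following sketch (hypothetical table/chain names) registers one base chain before and one after the main //conntrack// hook function in the //Prerouting// hook; only the latter can meaningfully match on ''ct state'', because in the former the ct system has not yet looked at the packets:

<code bash>
nft add table ip demo
# priority -300 (the classic "raw" priority): traversed BEFORE the conntrack
# hook function at -200, so packets carry no connection association here yet
nft add chain ip demo before_ct '{ type filter hook prerouting priority -300; }'
# priority 0: traversed AFTER conntrack, so ct state is meaningful here
nft add chain ip demo after_ct '{ type filter hook prerouting priority 0; }'
nft add rule ip demo after_ct ct state established counter
</code>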
  
  
specific use cases and I won't cover that topic in the scope of this first
article. The second is to "confirm" new tracked connections; see ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1073|__nf_conntrack_confirm()]]''.
I'll elaborate on what that means in the sections below.
  
<WRAP round info>
  
===== Modules nf_defrag_ipv4/6 =====
As shown in Figure {{ref>nft_ct_depends}}, module ''nf_conntrack'' depends on the modules ''nf_defrag_ipv4'' and ''nf_defrag_ipv6''. What is important to know here is that those take care of re-assembling (=defragmenting) IPv4 and IPv6 fragments respectively, if those occur. Usually, defragmentation is supposed to happen at the receiving communication endpoint and not along the way through the hops between both endpoints. However, in this case it is necessary. Connection tracking can only do its job if ALL packets of a connection can be identified and no packet can slip through the fingers of the ct system. The problem with fragments is that not all of them contain the necessary protocol header information required to identify and associate them with a tracked connection.
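If you like, you can see this in practice by provoking IPv4 fragments and observing that the ct system still tracks just a single "connection". A rough sketch (the address is a placeholder, assuming a typical 1500 byte MTU and the //conntrack-tools// package):

<code bash>
# On some other host: send an ICMP echo request which does not fit into a
# 1500 byte MTU, so it arrives here as two IPv4 fragments
# (-M dont permits fragmentation; 192.0.2.1 stands for this machine)
ping -c 1 -M dont -s 2000 192.0.2.1

# On this machine: the defrag hook function re-assembled the fragments
# before the conntrack hook functions saw them, so a single icmp
# "connection" is being tracked
conntrack -L -p icmp
</code>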
  
<figure nfdefraghooks1>
  
Like the ct system itself, those defrag modules do not become globally active on module load. They export (=provide) functions ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/ipv4/netfilter/nf_defrag_ipv4.c#L130|nf_defrag_ipv4_enable()]]'' and ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c#L131|nf_defrag_ipv6_enable()]]'' respectively, which register their own hook function with the Netfilter hooks.
Figure {{ref>nfdefraghooks1}} shows this for module ''nf_defrag_ipv4'' and the IPv4 Netfilter hooks: Internally this module provides function ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/ipv4/netfilter/nf_defrag_ipv4.c#L61|ipv4_conntrack_defrag()]]'' to handle defragmentation of traversing network packets.
This function is registered as a hook function with the Netfilter //Prerouting// hook and also with the Netfilter //Output// hook. In both places it is registered with priority -400, which ensures that packets traverse it BEFORE traversing the //conntrack// hook functions, which are registered with priority -200.
The ct system's function ''nf_ct_netns_get()'' mentioned in the section above calls ''nf_defrag_ipv4_enable()'' and/or its IPv6 counterpart respectively, before registering the ct system's hook functions. Thus, the //defrag// hook functions get registered together with the //conntrack// hook functions. However, no reference counting is implemented here, which means that once these hook functions are registered, they stay registered (until someone explicitly removes/unloads the kernel module).
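You can observe that behavior on a live system; a small sketch, assuming the modules were auto-loaded as described above:

<code bash>
# Both defrag modules have been pulled in as dependencies of nf_conntrack
lsmod | grep -E 'nf_conntrack|nf_defrag'

# Because no reference counting is done, the defrag hook functions stay
# registered until the module is explicitly removed (this fails as long
# as nf_conntrack is still loaded and using the module)
sudo modprobe -r nf_defrag_ipv4
</code>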
  
===== Hooks Summary =====
===== How it works... =====
I know... so far I kept beating around the bush. Now let's finally talk about how the ct system actually operates and what it does to network packets traversing its hook functions. Please be aware that what I describe in this section are the basics and does not cover everything the ct system actually does. The ct system maintains the connections which it is tracking in a central table. Each tracked connection is represented by an instance of ''[[https://elixir.bootlin.com/linux/v5.10.19/source/include/net/netfilter/nf_conntrack.h#L58|struct nf_conn]]''. That structure contains all necessary details the ct system learns about the connection over time while tracking it. From the ct system's point of view, every network packet which traverses one of its main hook functions (those with priority -200) is one of four possible things (each can be matched from a ruleset, as sketched after this list):
  - It is either part of or related to one of its tracked connections.
  - It is the first seen packet of a connection which is not yet tracked.
  - It is an invalid packet, which is broken or doesn't fit in somehow.
  - It is marked as NOTRACK, which tells the ct system to ignore it.
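Each of these four cases can be matched from an Nftables ruleset, sketched here under the assumption that the ''filter'' table and ''forward'' chain from the example above exist (the //notrack// rule needs to sit in a chain with a priority below -200, so that it is traversed before the main //conntrack// hook function):

<code bash>
# case 1: packets which are part of or related to a tracked connection
nft add rule ip filter forward ct state established,related accept

# case 2: first seen packet of a connection which is not yet tracked
nft add rule ip filter forward ct state new accept

# case 3: packets which the ct system considers invalid
nft add rule ip filter forward ct state invalid drop

# case 4: exempt packets from tracking BEFORE the ct system sees them
# (hypothetical chain at the "raw" priority -300 in the Prerouting hook)
nft add chain ip filter rawpre '{ type filter hook prerouting priority -300; }'
nft add rule ip filter rawpre tcp dport 80 notrack
</code>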
  
Figure {{ref>nfct-lookup}} shows an example of the first possibility, an incoming packet
being part of an already tracked connection. When that packet traverses the main //conntrack// hook function (the one with priority -200), the ct system first performs some initial validity checks on it. If the packet passes those, the ct system then does a lookup into its central table to find the potentially matching connection. In this case a match is found and the packet is provided with a pointer to the matching tracked connection instance. For this purpose, the ''[[https://elixir.bootlin.com/linux/v5.10.19/source/include/linux/skbuff.h#L713|skb]]''((Network packets are represented as instances of ''struct sk_buff'' within the Linux kernel network stack. This struct is often referred to as "socket buffer" or "skb".)) of each packet possesses member variable ''[[https://elixir.bootlin.com/linux/v5.10.19/source/include/linux/skbuff.h#L759|_nfct]]''((I intentionally omit a tiny detail here: The data type of ''_nfct'' actually is not ''struct nf_conn *'', but instead is ''unsigned long''. Actually the 3 least significant bits of that integer are used in a special way (used for ''ctinfo'') and are not used as a pointer. The remaining bits are used as a pointer to ''struct nf_conn''. This is just a messy implementation detail, which you can ignore for now. I'll get back to it in a later article.)). This means the network packet is thereby kind-of being "marked" or "categorized" by the ct system.
Further, the OSI layer 4 protocol of the packet is now analyzed and the latest protocol state and details are saved to its tracked connection instance. Then the packet continues on its way through other hook functions and the network stack. Other kernel components, like Nftables with //CONNTRACK EXPRESSION// rules, can now obtain connection information about the packet without a further ct table lookup, by simply dereferencing the ''%%skb->_nfct%%'' pointer. This is shown in Figure {{ref>nfct-lookup}} in the form of an example Nftables chain with priority 0 in the //Prerouting// hook. If you would place a rule with expression ''ct state established'' in that chain, that rule would match. The very last thing the packet traverses before being received by a local socket is the //conntrack "help+confirm"// hook function in the //Input// hook. Nothing happens to the packet here. That hook function is targeted at other cases.
  
being the first one representing a new connection which is not yet tracked by the ct system.
When that packet traverses the main //conntrack// hook function (the one with priority -200), let's assume that
it passes the already mentioned validity checks. However, in this case the lookup in the ct table (1) does not find a matching connection. As a result, the ct system considers the packet to be the first one of a new connection((To be precise: The first one the ct system has //seen// from that connection. That does not necessarily mean that this always must be the actual very first packet of a new connection, because there might be cases where the ct system for whatever reason did not see the first few packets of an actual connection and kind-of starts tracking in the middle of an already existing connection.)). A new instance of ''struct nf_conn'' is created (2) and member ''%%skb->_nfct%%'' of the packet is initialized to point to that instance. The ct system considers the new connection as "unconfirmed" at this point. Thus, the new connection instance is not yet added to the central table. It is temporarily parked on the so-called //unconfirmed list//.
Further, the OSI layer 4 protocol of the packet is now analyzed and protocol state and details are saved to its tracked connection instance. Then the packet continues on its way through other hook functions and the network stack.
Figure {{ref>nfct-new}} also shows an example Nftables chain with priority 0 in the //Prerouting// hook. If you would place a rule with expression ''ct state new'' in that chain, it would match.
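To watch new tracked connections appear, you can follow the ct system's event feed while triggering a connection. A small sketch, assuming the //conntrack-tools// package is installed:

<code bash>
# Terminal 1: show only events about newly added tracked connections
conntrack -E -e NEW

# Terminal 2: trigger a new connection (placeholder destination)
curl http://example.com/
</code>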
But even if a client, who is trying to establish e.g. a TCP connection by sending a TCP SYN packet, is behaving normally, it would still send out several TCP SYN packets as retransmissions if it does not receive any reply from the peer side. Thus, if you have a ''ct state new drop'' rule in place, this mechanism ensures that the ct system intentionally does not remember this (denied!) connection and thereby treats all succeeding TCP SYN packets (retransmissions) again as new packets, which then will be dropped by the same ''ct state new drop'' rule.
  
The third possibility is that the ct system considers a packet as //invalid//. This e.g. happens when a packet does not pass the mentioned initial validity checks of the main conntrack hook function in the //Prerouting// hook in Figure {{ref>nfct-lookup}} or {{ref>nfct-new}}, e.g. because of a broken or incomplete protocol header which cannot be parsed. This can further happen when a packet fails the detailed analysis of its OSI layer 4 protocol. E.g. in case of TCP the ct system observes receive window and sequence numbers, and a packet which does not match regarding its sequence numbers would be considered //invalid//.
However, it is not the job of the ct system to drop invalid packets((However, there are a few rare cases, like an overflow of the ct table, where it indeed drops packets.)). The ct system leaves that decision to other parts of the kernel network stack. If it considers a packet as //invalid//, the ct system simply leaves ''%%skb->_nfct=NULL%%''. If you would place an Nftables rule with expression ''ct state invalid'' in the example chain in Figure {{ref>nfct-lookup}} or {{ref>nfct-new}}, then that rule would match.
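In practice that decision is commonly made with a packet filter rule like the one sketched below. The sysctl shown is a debugging aid in case you need to find out //why// the ct system flags packets as //invalid//; it takes a protocol number (6 = TCP, 0 disables the logging again):

<code bash>
# Let the packet filter (not the ct system) drop invalid packets
nft add rule ip filter forward ct state invalid counter drop

# Debug aid: log the reason why TCP packets are considered invalid
sudo sysctl -w net.netfilter.nf_conntrack_log_invalid=6
</code>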
  
  * [[http://www.netfilter.org/documentation/HOWTO/netfilter-hacking-HOWTO.html|Linux netfilter Hacking HOWTO (Rusty Russell and Harald Welte, 2002)]]
  * [[http://people.netfilter.org/pablo/docs/login.pdf|Netfilter’s connection tracking system (Pablo Neira Ayuso, 2006)]]
  * [[https://www.frozentux.net/iptables-tutorial/iptables-tutorial.html#STATEMACHINE|Iptables tutorial 1.2.2: Chapter 7. The state machine (Oskar Andreasson, 2006)]]
  * [[https://wiki.aalto.fi/download/attachments/70789072/netfilter-paper-final.pdf|Netfilter Connection Tracking and NAT Implementation (Magnus Boye, 2012)]]
  * [[http://arthurchiao.art/blog/conntrack-design-and-implementation/|Connection Tracking: Design and Implementation Inside Linux Kernel (Arthur Chiao, 2020)]]
  
  
//published 2021-04-04//, //last modified 2023-08-15//
  