Thermalcircle.de

blog:linux:connection_tracking_2_core_implementation

Differences

This shows you the differences between two versions of the page.

blog:linux:connection_tracking_2_core_implementation [2021-06-29] – moved publishing date to end of article Andrej Stender
blog:linux:connection_tracking_2_core_implementation [2022-08-07] (current) – activated TOC Andrej Stender
Line 1: Line 1:
-{{tag>linux netfilter conntrack nftables iptables}}+{{tag>linux kernel netfilter conntrack nftables iptables}}
 ====== Connection tracking (conntrack) - Part 2: Core Implementation ====== ====== Connection tracking (conntrack) - Part 2: Core Implementation ======
 ~~META: ~~META:
 date created = 2021-04-11  date created = 2021-04-11 
 ~~ ~~
- 
-~~NOTOC~~ 
- 
  
 With this article series I'd like to take a closer look at the connection tracking subsystem of the Linux kernel, which provides the basis for features like stateful packet filtering and NAT. With this article series I'd like to take a closer look at the connection tracking subsystem of the Linux kernel, which provides the basis for features like stateful packet filtering and NAT.
Line 18: Line 15:
   * [[connection_tracking_1_modules_and_hooks|Connection tracking (conntrack) - Part 1: Modules and Hooks]]   * [[connection_tracking_1_modules_and_hooks|Connection tracking (conntrack) - Part 1: Modules and Hooks]]
   * [[connection_tracking_2_core_implementation|Connection tracking (conntrack) - Part 2: Core Implementation]]   * [[connection_tracking_2_core_implementation|Connection tracking (conntrack) - Part 2: Core Implementation]]
-  * Connection tracking (conntrack) - Part 3: Connection States and Examples (coming soon)+  * [[connection_tracking_3_state_and_examples|Connection tracking (conntrack) - Part 3: State and Examples]]
  
 ===== The ct table ===== ===== The ct table =====
Line 75: Line 72:
  
 ===== Lookup existing connection ===== ===== Lookup existing connection =====
-Let's walk through the connection lookup in detail. In Figure {{ref>nfctlookup1}} a TCP packet is traversing the ct callback in the Netfilter //Prerouting// hook((or the //Output// hook, it doesn't matter right now, because the same code is executed in both hooks)). Most of the interesting part of that hook callback is done in function ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1793|nf_conntrack_in()]]''. Among other functions, it calls  ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1649|resolve_normal_ct()]]'' and this is where the lookup happens. Figure {{ref>nfctlookup1}} shows that in detail. +Let's walk through the connection lookup in detail. In Figure {{ref>nfctlookup1}} a TCP packet is traversing the ct hook function in the Netfilter //Prerouting// hook((or the one in the //Output// hook, it doesn't matter right now, because the same code is executed in both hooks)). Most of the interesting part of that hook function is done in function ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1793|nf_conntrack_in()]]''. Among other functions, it calls  ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1649|resolve_normal_ct()]]'' and this is where the lookup happens. Figure {{ref>nfctlookup1}} shows that in detail. 
 In this example I assume that the connection the TCP packet belongs to is already known and tracked by the ct system at this point. In other words, I assume that this is not the first packet of that connection which the ct system is seeing. In this example I assume that the connection the TCP packet belongs to is already known and tracked by the ct system at this point. In other words, I assume that this is not the first packet of that connection which the ct system is seeing.
  
Line 98: Line 95:
 As you can see, it is ''tuplehash[0]'' (green) which matches in step (5). As you can see, it is ''tuplehash[0]'' (green) which matches in step (5).
 This means the TCP packet in question is part of the //original// direction of this tracked connection. This means the TCP packet in question is part of the //original// direction of this tracked connection.
-In step (6) function ''[[https://elixir.bootlin.com/linux/v5.10.19/source/include/net/netfilter/nf_conntrack.h#L329|nf_ct_set()]]'' is called to initialize ''%%skb->_nfct%%'' of the network packet to point to the just found matching instance of ''struct nf_conn''((Also ''ctinfo'' is set here, but that's a topic I'll cover in another article.)). Finally now the OSI layer 4 protocol of the packet (in our example TCP) is being examined((That is done in function ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1748|nf_conntrack_handle_packet()]]'' and it is a topic which deserves explanation in much more detail. If I find the time, I'll cover it in a future article.)), before the packet finishes traversing the ct hook callback.+In step (6) function ''[[https://elixir.bootlin.com/linux/v5.10.19/source/include/net/netfilter/nf_conntrack.h#L329|nf_ct_set()]]'' is called to initialize ''%%skb->_nfct%%'' of the network packet to point to the just found matching instance of ''struct nf_conn''((Also ''ctinfo'' is set here, but that's a topic I'll cover in another article.)). Finally now the OSI layer 4 protocol of the packet (in our example TCP) is being examined((That is done in function ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1748|nf_conntrack_handle_packet()]]'' and it is a topic which deserves explanation in much more detail. If I find the time, I'll cover it in a future article.)), before the packet finishes traversing the ct hook function.
  
 ===== Adding a new connection ===== ===== Adding a new connection =====
Line 129: Line 126:
 Step (5) is the exact same thing as step (6) in Figure {{ref>nfctlookup1}} during the lookup, Step (5) is the exact same thing as step (6) in Figure {{ref>nfctlookup1}} during the lookup,
 initializing ''%%skb->_nfct%%'' of the network packet to point to the new connection instance. initializing ''%%skb->_nfct%%'' of the network packet to point to the new connection instance.
-Finally now the OSI layer 4 protocol of the packet (in our example TCP) is being examined((That is done in function ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1748|nf_conntrack_handle_packet()]]'' and it is a topic which deserves explanation in much more detail. If I find the time, I'll cover it in a future article.)), before the packet finishes traversing the ct hook callback. In this example here I assume that the network packet is not being dropped while it continues on its way through the kernel network stack and through one or more potential Nftables chains and rules. Finally, the packet traverses one of the //conntrack "help+confirm"// callbacks (the ones with priority MAX). Inside it, the new connection will get "confirmed" and be added to the actual ct hash table. This is shown in detail in Figure {{ref>nfctadd2}}.+Finally now the OSI layer 4 protocol of the packet (in our example TCP) is being examined((That is done in function ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/netfilter/nf_conntrack_core.c#L1748|nf_conntrack_handle_packet()]]'' and it is a topic which deserves explanation in much more detail. If I find the time, I'll cover it in a future article.)), before the packet finishes traversing the ct hook function. In this example here I assume that the network packet is not being dropped while it continues on its way through the kernel network stack and through one or more potential Nftables chains and rules. Finally, the packet traverses one of the //conntrack "help+confirm"// functions (the ones with priority MAX). Inside it, the new connection will get "confirmed" and be added to the actual ct hash table. This is shown in detail in Figure {{ref>nfctadd2}}.
  
 <figure nfctadd2> <figure nfctadd2>
Line 222: Line 219:
 deletion can occur if the network packet which triggered deletion can occur if the network packet which triggered
 its creation is dropped before it reaches the //conntrack its creation is dropped before it reaches the //conntrack
-"help+confirm"// hook callback. Dropping the packet means+"help+confirm"// hook function. Dropping the packet means
 the ''skb'' is being deleted/freed. Function  ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/core/skbuff.c#L655|skb_release_head_state()]]'' is part of this the ''skb'' is being deleted/freed. Function  ''[[https://elixir.bootlin.com/linux/v5.10.19/source/net/core/skbuff.c#L655|skb_release_head_state()]]'' is part of this
 deletion and it calls ''nf_conntrack_put()'', which decrements deletion and it calls ''nf_conntrack_put()'', which decrements
Line 238: Line 235:
 is required for "unconfirmed" connections, because creation of those is triggered is required for "unconfirmed" connections, because creation of those is triggered
 by a network packet and they either become "confirmed" while that same packet is by a network packet and they either become "confirmed" while that same packet is
-still traversing the kernel network stack or they die together that same packet+still traversing the kernel network stack or they die together with that same packet
 when it is being dropped.)). This means, usually each further network packet when it is being dropped.)). This means, usually each further network packet
-traversing the main ct hook callbacks which is identified to belong to a tracked+traversing the main ct hook functions which is identified to belong to a tracked
 connection (=for which the lookup in the ct table finds a match), connection (=for which the lookup in the ct table finds a match),
 will cause the timeout of that connection to be reset/restarted. will cause the timeout of that connection to be reset/restarted.
Line 290: Line 287:
 But when and how often does the ct system actually check each tracked But when and how often does the ct system actually check each tracked
 connection for expiration? Nearly everything I described so far happens within connection for expiration? Nearly everything I described so far happens within
-the ct system's hook callbacks when network packets traverse those. The idea+the ct system's hook functions when network packets traverse those. The idea
 of the timeout however is to make a tracked connection expire, if no further of the timeout however is to make a tracked connection expire, if no further
 traffic is detected for some time. Obviously that expiration checking traffic is detected for some time. Obviously that expiration checking
-cannot be done in the hook callbacks.+cannot be done in the hook functions.
 The ct system uses the //workqueue// mechanism of the kernel The ct system uses the //workqueue// mechanism of the kernel
 to run the garbage collecting function to run the garbage collecting function
Line 339: Line 336:
 [[:feedback|Feedback]] to this article is very welcome! Please be aware that I'm not one of the developers of the ct system. I'm merely some developer who took a look at the source code and did some practical experimenting. If you find something which I might have misunderstood or described incorrectly here, then I would be very grateful, [[:feedback|Feedback]] to this article is very welcome! Please be aware that I'm not one of the developers of the ct system. I'm merely some developer who took a look at the source code and did some practical experimenting. If you find something which I might have misunderstood or described incorrectly here, then I would be very grateful,
 if you bring this to my attention and of course I'll then fix my content asap accordingly. if you bring this to my attention and of course I'll then fix my content asap accordingly.
 +
 +===== References =====
 +  * [[https://www.netfilter.org/documentation/HOWTO/netfilter-hacking-HOWTO-4.html#ss4.3|Linux netfilter Hacking HOWTO: 4.3ff (Rusty Russell and Harald Welte, 2002)]]
 +  * [[http://people.netfilter.org/pablo/docs/login.pdf|Netfilter’s connection tracking system (Pablo Neira Ayuso, 2006)]]
 +  * [[https://wiki.aalto.fi/download/attachments/70789072/netfilter-paper-final.pdf|Netfilter Connection Tracking and NAT Implementation (Magnus Boye, 2012)]]
 +  * [[http://arthurchiao.art/blog/conntrack-design-and-implementation/|Connection Tracking: Design and Implementation Inside Linux Kernel (Arthur Chiao, 2020)]]
  
 ===== Continue with next article ===== ===== Continue with next article =====
-A third article is currently in the works. I'll place a link here once its finished. +[[connection_tracking_3_state_and_examples|Connection tracking (conntrack) - Part 3: State and Examples]]
-In that article, I plan to take a look at the set of states a tracked connection lives through during its life cycle and in which way Nftables rules make use of that. I'll further present practical examples which show the life cycle and state changes of tracked connections of common protocols like ICMP, TCP and UDP. +
  
  
-//published 2021-04-11//+//published 2021-04-11//, //last modified 2022-08-07//
  
blog/linux/connection_tracking_2_core_implementation.1624917763.txt.gz · Last modified: 2021-06-29 by Andrej Stender