Thermalcircle.de

climbing the thermals


blog:linux:nftables_ipsec_packet_flow

Differences

This shows you the differences between two versions of the page.

blog:linux:nftables_ipsec_packet_flow [2022-01-30] – Added link to new follow-up article. Andrej Stender
blog:linux:nftables_ipsec_packet_flow [2022-08-14] (current) – added details about xfrm bundle Andrej Stender
Line 1: Line 1:
-{{tag>linux netfilter nftables ipsec strongswan charon swanctl xfrm}}+{{tag>linux kernel netfilter nftables ipsec strongswan charon swanctl xfrm}}
 ====== Nftables - Netfilter and VPN/IPsec packet flow ====== ====== Nftables - Netfilter and VPN/IPsec packet flow ======
 ~~META: ~~META:
 date created = 2020-05-30 date created = 2020-05-30
 ~~ ~~
- 
-~~NOTOC~~ 
  
 In this article I would like to explain how the packet flow through  In this article I would like to explain how the packet flow through 
Line 147: Line 145:
  
 | <WRAP>{{:linux:routing-step.png?nolink |}} The routing lookup is performed for incoming as well as local outgoing packets, see Figure {{ref>nfhooksxfrm1}}. Function ''[[https://elixir.bootlin.com/linux/v5.10.46/source/include/net/ip_fib.h#L363|fib_lookup()]]'' performs the actual lookup into the policy routing rules and routing tables. The routing decision resulting from this lookup is attached to the traversing network packet (skb). | <WRAP>{{:linux:routing-step.png?nolink |}} The routing lookup is performed for incoming as well as local outgoing packets, see Figure {{ref>nfhooksxfrm1}}. Function ''[[https://elixir.bootlin.com/linux/v5.10.46/source/include/net/ip_fib.h#L363|fib_lookup()]]'' performs the actual lookup into the policy routing rules and routing tables. The routing decision resulting from this lookup is attached to the traversing network packet (skb).
-It is an instance of two combined structs, the outer ''[[https://elixir.bootlin.com/linux/v5.10.46/source/include/net/route.h#L49|struct rtable]]'' and the inner ''[[https://elixir.bootlin.com/linux/v5.10.46/source/include/net/dst.h#L24|struct dst_entry]]''. Together, they contain information like the output network interface, the IP address of the next hop gateway (if any), function pointers which determine the path this packet takes through the remaining part of the kernel network stack, and more. At first glance, routing has nothing to do with the Xfrm framework, but it is relevant in this context as you will see below.</WRAP>+It is an instance of two combined structs, the outer ''[[https://elixir.bootlin.com/linux/v5.10.46/source/include/net/route.h#L49|struct rtable]]'' and the inner ''[[https://elixir.bootlin.com/linux/v5.10.46/source/include/net/dst.h#L24|struct dst_entry]]''. Together, they contain information like the output network interface, the IP address of the next hop gateway (if any), function pointers which determine the path this packet takes through the remaining part of the kernel network stack, and more. My [[routing_decisions_in_the_linux_kernel_1_lookup_packet_flow|article series on routing]] explains that in detail. At first glance, routing has nothing to do with the Xfrm framework, but it is relevant in this context as you will see below.</WRAP>
-| <WRAP>{{:linux:xfrm-action-policy-out.png?nolink |}} This action is performed for forwarded as well as for local outgoing packets after the routing lookup, see Figure {{ref>nfhooksxfrm1}} and function ''[[https://elixir.bootlin.com/linux/v5.10.46/source/net/xfrm/xfrm_policy.c#L3183|xfrm_lookup()]]''. The Xfrm framework performs a lookup into the IPsec SPD, searching for a matching output policy (''dir out'' SP). If no matching policy is found, the network packet stays unchanged and simply continues on its way. If a matching policy is found, a lookup into the SAD is performed to resolve an SA which corresponds to the matching SP (shown as an attached magenta box named //Xfrm lookup state//). If the resolved SA specifies tunnel-mode, then yet another routing lookup is performed, this time for the (future) outer IPv4 packet which will later encapsulate the current packet. The actual packet transformation does not yet happen at this point. Instead, a "bundle" of transformation instructions for this packet is assembled. The term "bundle" stems from the kernel source code and refers to a bunch of struct instances pointing to each other. Among those are the original routing decision of this packet, the SP, the SA((there can be more than one SA being applied to a packet, but that is a less common case)), the routing decision for the future outer IP packet and more. Those are usually assembled around one or several instances of ''[[https://elixir.bootlin.com/linux/v5.10.46/source/include/net/xfrm.h#L925|struct xfrm_dst]]''. The bundle is attached to the network packet (skb), replacing the originally attached routing decision. 
Function pointers within the bundle ensure, that the packet takes a different path through the remaining part of the kernel network stack.</WRAP> |+| <WRAP>{{:linux:xfrm-action-policy-out.png?nolink |}} This action is performed for forwarded as well as for local outgoing packets after the routing lookup, see Figure {{ref>nfhooksxfrm1}} and function ''[[https://elixir.bootlin.com/linux/v5.10.46/source/net/xfrm/xfrm_policy.c#L3183|xfrm_lookup()]]''. The Xfrm framework performs a lookup into the IPsec SPD, searching for a matching output policy (''dir out'' SP). If no matching policy is found, the network packet stays unchanged and simply continues on its way. If a matching policy is found, a lookup into the SAD is performed to resolve an SA which corresponds to the matching SP (shown as an attached magenta box named //Xfrm lookup state//). If the resolved SA specifies tunnel-mode, then yet another routing lookup is performed, this time for the (future) outer IPv4 packet which will later encapsulate the current packet. The actual packet transformation does not yet happen at this point. Instead, a "bundle" of transformation instructions for this packet is assembled. The term "bundle" stems from the kernel source code and refers to a bunch of struct instances pointing to each other. Among those are the original routing decision of this packet, the SP, the SA((there can be more than one SA being applied to a packet, but that is a less common case)), the routing decision for the future outer IP packet and more. Those are usually assembled around one or several instances of ''[[https://elixir.bootlin.com/linux/v5.10.46/source/include/net/xfrm.h#L925|struct xfrm_dst]]''. The bundle is attached to the network packet (skb), replacing the originally attached routing decision. Function pointers within the bundle ensure, that the packet takes a different path through the remaining part of the kernel network stack. 
Figure {{ref>xfrm_dst}} shows what a bundle actually looks like. 
 +</WRAP> |
 | <WRAP>{{:linux:xfrm-action-encode.png?nolink |}} This is where packets which shall travel through the VPN tunnel are being encrypted and encapsulated. The Xfrm framework transforms a packet according to the instructions in the "bundle" attached to it. A function pointer within the bundle makes sure, that the packet takes a "detour" into this transformation code after traversing the Netfilter //Postrouting// hook. For IPv4 packets, the entry function which leads the packet on this path is ''[[https://elixir.bootlin.com/linux/v5.10.46/source/net/ipv4/xfrm4_output.c#L31|xfrm4_output()]]''. In case of tunnel-mode this transformation means encapsulating the IP packet into a new outer IP packet and then encapsulating the inner IP packet into ESP protocol and encrypting it and its payload. After completion of the transformation, the xfrm components/instructions are being removed from the "bundle", leaving only the routing decision for the outer IP packet attached to the packet.</WRAP> | | <WRAP>{{:linux:xfrm-action-encode.png?nolink |}} This is where packets which shall travel through the VPN tunnel are being encrypted and encapsulated. The Xfrm framework transforms a packet according to the instructions in the "bundle" attached to it. A function pointer within the bundle makes sure, that the packet takes a "detour" into this transformation code after traversing the Netfilter //Postrouting// hook. For IPv4 packets, the entry function which leads the packet on this path is ''[[https://elixir.bootlin.com/linux/v5.10.46/source/net/ipv4/xfrm4_output.c#L31|xfrm4_output()]]''. In case of tunnel-mode this transformation means encapsulating the IP packet into a new outer IP packet and then encapsulating the inner IP packet into ESP protocol and encrypting it and its payload. After completion of the transformation, the xfrm components/instructions are being removed from the "bundle", leaving only the routing decision for the outer IP packet attached to the packet.</WRAP> |
 | <WRAP>{{:linux:xfrm-action-decode.png?nolink |}} This is where packets which have been received through the VPN tunnel are being decrypted and decapsulated. If an IP packet on the local input path contains an ESP packet, then the Xfrm framework performs a lookup into the SAD (//Xfrm lookup state//), based on the SPI((SPI is an integer value in the unencrypted part of the ESP header)) and the destination IP address. If no matching SA is found, the packet is dropped. If a matching SA is found, the ESP packet is decrypted and decapsulated. In case of tunnel-mode the outer IP packet is decapsulated. The remaining inner IP packet then is re-inserted into the receive path on OSI layer 2. Packets remember which SA has been used on them((in an skb extension named ''sec_path'')). That becomes relevant when they later traverse the //Xfrm lookup in policy// or //Xfrm lookup fwd policy// action.\\ \\ In case of //Nat-traversal// mode, both IKE and ESP packets arrive on the local input path being encapsulated in UDP on port 4500 and the kernel must distinguish between both. This is done based on the so-called //Non-ESP Marker//((4 zero bytes ''0x00000000'' at beginning of UDP payload, defined in [[https://datatracker.ietf.org/doc/html/rfc3948|RFC3948]])). IKE packets are given to a UDP socket where userspace application Strongswan is listening ((There is more to that. Strongswan is required to set a special socket option called ''UDP_ENCAP'' or else it won't receive any IKE packets on port 4500. But that is an implementation detail.)), while ESP packets are decrypted and decapsulated as described above.</WRAP> | | <WRAP>{{:linux:xfrm-action-decode.png?nolink |}} This is where packets which have been received through the VPN tunnel are being decrypted and decapsulated. 
If an IP packet on the local input path contains an ESP packet, then the Xfrm framework performs a lookup into the SAD (//Xfrm lookup state//), based on the SPI((SPI is an integer value in the unencrypted part of the ESP header)) and the destination IP address. If no matching SA is found, the packet is dropped. If a matching SA is found, the ESP packet is decrypted and decapsulated. In case of tunnel-mode the outer IP packet is decapsulated. The remaining inner IP packet then is re-inserted into the receive path on OSI layer 2. Packets remember which SA has been used on them((in an skb extension named ''sec_path'')). That becomes relevant when they later traverse the //Xfrm lookup in policy// or //Xfrm lookup fwd policy// action.\\ \\ In case of //Nat-traversal// mode, both IKE and ESP packets arrive on the local input path being encapsulated in UDP on port 4500 and the kernel must distinguish between both. This is done based on the so-called //Non-ESP Marker//((4 zero bytes ''0x00000000'' at beginning of UDP payload, defined in [[https://datatracker.ietf.org/doc/html/rfc3948|RFC3948]])). IKE packets are given to a UDP socket where userspace application Strongswan is listening ((There is more to that. Strongswan is required to set a special socket option called ''UDP_ENCAP'' or else it won't receive any IKE packets on port 4500. But that is an implementation detail.)), while ESP packets are decrypted and decapsulated as described above.</WRAP> |
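The distinction between IKE and ESP packets on UDP port 4500 can be illustrated with a small sketch. This is not the kernel's code; it is a minimal Python model which only assumes what is stated above: a UDP payload starting with the 4 zero bytes of the //Non-ESP Marker// is IKE, anything else is treated as ESP, whose first 4 bytes are the SPI. The function name ''classify_udp4500_payload()'' and the returned labels are hypothetical and chosen for illustration only.

```python
from typing import Optional, Tuple

# Non-ESP Marker as described above: 4 zero bytes at the
# beginning of the UDP payload (RFC 3948).
NON_ESP_MARKER = b"\x00\x00\x00\x00"


def classify_udp4500_payload(payload: bytes) -> Tuple[str, Optional[int]]:
    """Classify a UDP payload received on port 4500 (hypothetical helper).

    Returns ("ike", None) for IKE packets, ("esp", spi) for
    UDP-encapsulated ESP packets and ("too-short", None) if the
    payload cannot hold a marker or an SPI at all.
    """
    if len(payload) < 4:
        return ("too-short", None)
    if payload[:4] == NON_ESP_MARKER:
        # Would be handed to the UDP socket of the IKE daemon.
        return ("ike", None)
    # Otherwise the first 4 bytes are the SPI of the ESP header,
    # which is used for the lookup into the SAD.
    spi = int.from_bytes(payload[:4], "big")
    return ("esp", spi)
```

The SPI returned for ESP packets corresponds to the value which, together with the destination IP address, is used for the //Xfrm lookup state// step.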
Line 166: Line 165:
 are optional to use and never became the default. The Strongswan documentation calls VPN setups based on those virtual network interfaces [[https://wiki.strongswan.org/projects/strongswan/wiki/RouteBasedVPN|"Route-based VPNs"]]. Essentially, it seems that two types of virtual interfaces have been introduced in this context over the years: The older ''vti'' interfaces and the newer ''xfrm'' interfaces((Additional kinds of virtual network interfaces exist in this context, e.g. the ''gre'' interfaces, but they represent an entirely different concept and protocol (''GRE'' protocol) which is e.g. used to build DMVPN setups. That is an advanced topic which works a little differently from the "normal" IPsec VPN which I describe here.)). In the remaining part of this article I will describe what the IPsec-based VPN looks like from the Netfilter point of view in the "normal" case where NO virtual network interfaces are used. are optional to use and never became the default. The Strongswan documentation calls VPN setups based on those virtual network interfaces [[https://wiki.strongswan.org/projects/strongswan/wiki/RouteBasedVPN|"Route-based VPNs"]]. Essentially, it seems that two types of virtual interfaces have been introduced in this context over the years: The older ''vti'' interfaces and the newer ''xfrm'' interfaces((Additional kinds of virtual network interfaces exist in this context, e.g. the ''gre'' interfaces, but they represent an entirely different concept and protocol (''GRE'' protocol) which is e.g. used to build DMVPN setups. That is an advanced topic which works a little differently from the "normal" IPsec VPN which I describe here.)). In the remaining part of this article I will describe what the IPsec-based VPN looks like from the Netfilter point of view in the "normal" case where NO virtual network interfaces are used.
  
 +<figure xfrm_dst>
 +{{:linux:xfrm_dst.png?direct&700|}}
 +<caption>Simplified illustration of an Xfrm bundle, attached to a network packet
 +(click to enlarge). In IPsec tunnel-mode, the bundle contains two //routing decisions//,
 +references to IPsec SA and SP and function pointers to lead the packet
 +on the Xfrm encrypt+encapsulate path. Compare it to a normal
 +//routing decision// object, which I described in my
 +[[routing_decisions_in_the_linux_kernel_1_lookup_packet_flow#the_routing_decision_object|article series on routing]].
 +</caption>
 +</figure>
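The idea of the bundle can also be modeled in a few lines of code. The following is a deliberately simplified Python sketch, not a mirror of ''struct xfrm_dst'': all class, field and function names (''RoutingDecision'', ''XfrmBundle'', ''xfrm_output()'', ''send()'' etc.) are hypothetical, and the "encryption" is just a placeholder. It only demonstrates the mechanism described in this article: the bundle replaces the routing decision attached to the packet, its function pointer leads the packet onto the transformation path, and after the transformation only the routing decision for the outer IP packet remains attached.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class RoutingDecision:
    out_interface: str
    next_hop: Optional[str]
    output: Callable[["Packet"], str]  # "function pointer" for the onward path


@dataclass
class XfrmBundle:
    policy: str                        # the matching "dir out" SP
    state: str                         # the SA resolved for that SP
    inner_route: RoutingDecision       # original routing decision of the packet
    outer_route: RoutingDecision       # routing decision for the outer IP packet
    output: Callable[["Packet"], str]  # leads onto the transformation path


@dataclass
class Packet:
    payload: bytes
    dst: Union[RoutingDecision, XfrmBundle]  # whatever is attached right now


def plain_output(pkt: Packet) -> str:
    # Normal path: just send the packet on the chosen interface.
    return f"sent via {pkt.dst.out_interface}"


def xfrm_output(pkt: Packet) -> str:
    # "Detour": transform the packet, then continue with the outer route only.
    bundle = pkt.dst
    pkt.payload = b"ESP(" + pkt.payload + b")"  # placeholder for encrypt+encapsulate
    pkt.dst = bundle.outer_route
    return pkt.dst.output(pkt)


def send(pkt: Packet) -> str:
    # The caller does not know which path is taken; the attached object decides.
    return pkt.dst.output(pkt)
```

A packet whose attached object is an ''XfrmBundle'' ends up transformed and sent according to the outer routing decision, while a packet carrying a plain ''RoutingDecision'' is sent unchanged by the very same ''send()'' call.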
  
 ===== Example Site-to-site VPN ===== ===== Example Site-to-site VPN =====
Line 575: Line 584:
   * [[https://ramirose.wixsite.com/ramirosen|Linux Kernel Networking - Implementation and Theory (Rami Rosen, Apress, 2014)]]   * [[https://ramirose.wixsite.com/ramirosen|Linux Kernel Networking - Implementation and Theory (Rami Rosen, Apress, 2014)]]
  
-//published 2020-05-30//, //last modified 2021-12-12//+//published 2020-05-30//, //last modified 2022-08-14//
  
blog/linux/nftables_ipsec_packet_flow.1643556437.txt.gz · Last modified: 2022-01-30 by Andrej Stender