Nftables - Netfilter and VPN/IPsec packet flow
In this article I'd like to explain what the packet flow through Netfilter hooks looks like on a host which works as a VPN gateway based on IPsec (Strongswan, tunnel mode, IKEv2, ESP). I'll focus on Nftables rather than the older Iptables, and regarding Strongswan I'll focus on the newer vici interface (using swanctl) rather than the older stroke interface.
See also my other article which covers packet flow through Netfilter hooks in general.
IPsec recap
A comprehensive recap of IPsec would fill a whole book. I'll merely provide a very short recap here to put my actual topic into context.
IPsec implementation in Linux
The IPsec implementation in Linux consists of a userspace part and a kernel part. Nowadays the userspace part is represented by the Strongswan suite (there have been predecessors) and the kernel part is represented by the Xfrm framework, which is sometimes called the Netkey stack and has been present in the kernel since v2.6. With the following image I'd like to show these components and how they interact in a simple block diagram style.
Strongswan
The essential part of Strongswan is the userspace daemon charon, which implements IKEv1/IKEv2 and acts as the central “orchestrator” of IPsec-based VPNs (the main active component) on the system.
It provides an interface to the user/administrator to configure IPsec on the system.
Actually, more precisely, it provides two different interfaces to do that:
One is the so-called Stroke interface. It provides means to configure IPsec via two main config files, /etc/ipsec.conf and /etc/ipsec.secrets. This is the older of the two interfaces and it can be considered deprecated (however, it is still supported).
The other and newer one is the so-called Vici interface. It is an IPC mechanism, which means the charon daemon listens on a Unix domain socket and client tools (like Strongswan's own command-line tool swanctl, but also other tools, e.g. the NHRP daemon of the FRR routing protocol engine, which is used in DMVPN setups) can connect to it to configure IPsec. This way of configuration is more powerful than the Stroke interface, because it makes it easier for other tools to provide and adjust configuration dynamically and event-driven at any time.
However, in many common IPsec setups the configuration is still simply supplied via config files. When using Vici, the only difference is that the config file(s) (mainly /etc/swanctl/swanctl.conf) are not interpreted by the charon daemon directly, but by the command-line tool swanctl, which then feeds this config into the charon daemon via the Vici IPC interface.
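To make the Vici IPC mechanism a bit more tangible, here is a minimal Python sketch of how a client could frame a request on that socket. It assumes the framing described in the vici plugin's protocol documentation (a 32-bit length in network byte order, followed by a type byte and, for a command request, a length-prefixed command name) and the socket path /var/run/charon.vici, which may differ on your system. The helper names encode_named_request and send_request are made up for this illustration; in practice you would rather use swanctl or an existing vici client library.

```python
import socket
import struct

# Packet type for a command request, as listed in the vici protocol description.
CMD_REQUEST = 0

# Assumed default socket path; a given installation may use another one.
VICI_SOCKET = "/var/run/charon.vici"


def encode_named_request(name: str) -> bytes:
    """Frame a vici command request that carries no message arguments."""
    raw = name.encode("ascii")
    if len(raw) > 255:
        raise ValueError("command name must fit into one length byte")
    # Packet body: type byte, name length byte, name.
    body = struct.pack("!BB", CMD_REQUEST, len(raw)) + raw
    # Transport framing: 32-bit length in network byte order, then the body.
    return struct.pack("!I", len(body)) + body


def _recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read exactly count bytes or fail if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("vici socket closed unexpectedly")
        data += chunk
    return data


def send_request(name: str, path: str = VICI_SOCKET) -> bytes:
    """Send one request and return the raw, still undecoded response body."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(encode_named_request(name))
        (length,) = struct.unpack("!I", _recv_exact(sock, 4))
        return _recv_exact(sock, length)


if __name__ == "__main__":
    # Only show the framing; talking to charon needs a running daemon
    # and sufficient privileges on the socket.
    print(encode_named_request("version").hex())
```

The sketch deliberately stops at the framing level. Decoding the response message (sections, key/value pairs, lists) is left out, since that is exactly the work swanctl does for you.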
The charon daemon uses a Netlink socket as a communication channel into the kernel.
The xfrm framework
The so-called xfrm framework is a component within the Linux kernel. While the userspace part (Strongswan) handles the overall IPsec orchestration and runs the IKEv1/IKEv2 protocol to build up and tear down VPNs, the kernel part handles everything that can be considered the “VPN payload”. It implements the Security Association Database (SAD) and the Security Policy Database (SPD). This means the userspace daemon charon feeds the actual IPsec Security Association (SA) instances and Security Policy (SP) instances, which result from configuration and from the IKEv1/IKEv2 handshake, into the kernel, and the kernel maintains and uses those to encrypt and decrypt the actual “payload” network packets of the VPN.
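The relationship between SPs and SAs can be sketched as a small Python model. This is not how the kernel implements it; it only illustrates the idea that an SP selects traffic and points to a tunnel template, and that the SA to apply is then found via that template. The /24 prefixes, the SPI and the reqid are assumed values chosen for illustration; the class and function names are hypothetical as well.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional, Sequence


@dataclass(frozen=True)
class Policy:
    """A simplified SP: traffic selector, direction and tunnel template."""
    src: str        # selector source prefix
    dst: str        # selector destination prefix
    direction: str  # "in", "out" or "fwd"
    tmpl_src: str   # outer source address of the tunnel
    tmpl_dst: str   # outer destination address of the tunnel
    reqid: int      # links the policy to its SA


@dataclass(frozen=True)
class State:
    """A simplified SA: tunnel endpoints, SPI and reqid (keys omitted)."""
    src: str
    dst: str
    spi: int
    reqid: int


def lookup(spd: Sequence[Policy], sad: Sequence[State],
           saddr: str, daddr: str, direction: str) -> Optional[State]:
    """Return the SA for a packet, or None if no policy or no SA matches."""
    for sp in spd:
        if (sp.direction == direction
                and ip_address(saddr) in ip_network(sp.src)
                and ip_address(daddr) in ip_network(sp.dst)):
            # The first matching policy decides; find the SA it refers to.
            wanted = (sp.tmpl_src, sp.tmpl_dst, sp.reqid)
            return next(
                (sa for sa in sad if (sa.src, sa.dst, sa.reqid) == wanted),
                None,
            )
    return None


# Assumed example databases (illustrative values, not taken from a real host).
SPD = [Policy("10.0.1.0/24", "10.0.2.0/24", "out", "3.0.0.1", "5.0.0.1", 1)]
SAD = [State("3.0.0.1", "5.0.0.1", 0x1000, 1)]

if __name__ == "__main__":
    print(lookup(SPD, SAD, "10.0.1.100", "10.0.2.100", "out"))
```

A packet that matches no policy simply yields None here, which corresponds to traffic that is not subject to IPsec processing in this simplified view.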
You can use the iproute2 tool ip as a low-level admin tool to show the SAs and SPs which are currently configured in the databases inside the kernel:
    # list SAs which are currently configured in the kernel
    ip xfrm state
    # list SPs which are currently configured in the kernel
    ip xfrm policy
The ip tool uses the same means as charon (a Netlink socket) to communicate with the kernel. You could also use it as a low-level config tool to create/edit/delete SAs and SPs in the kernel; however, in practice you leave those duties to Strongswan.
hooks
Table
ICMP echo-request h1 → h2, r1 traversal
step | netfilter hook / xfrm | encapsulation | iif | oif | ip saddr | ip daddr |
---|---|---|---|---|---|---|
1 | prerouting | [eth][ip][icmp] | eth0 | | 10.0.1.100 | 10.0.2.100 |
2 | forward | [eth][ip][icmp] | eth0 | eth1 | 10.0.1.100 | 10.0.2.100 |
3 | postrouting | [eth][ip][icmp] | | eth1 | 10.0.1.100 | 10.0.2.100 |
4 | xfrm lookup | [eth][ip][icmp] | | | | |
5 | xfrm encode | [eth][......][ip][icmp] | | | | |
6 | output | [eth][ip][esp][ip][icmp] | | eth1 | 3.0.0.1 | 5.0.0.1 |
7 | postrouting | [eth][ip][esp][ip][icmp] | | eth1 | 3.0.0.1 | 5.0.0.1 |
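Steps 4 to 6 of this traversal can be sketched in Python as a transformation of the header stack. This is only a model under simplifying assumptions: there is no real encryption, ESP trailer or padding, and the Packet class and the function name esp_tunnel_encap are invented for this illustration. It merely shows why the hooks after the xfrm encode step see the tunnel endpoint addresses instead of the host addresses.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Packet:
    """Header stack plus the addresses of the outermost IP header."""
    headers: Tuple[str, ...]
    saddr: str
    daddr: str


def esp_tunnel_encap(pkt: Packet, tunnel_src: str, tunnel_dst: str) -> Packet:
    """Model of ESP tunnel-mode encapsulation (no real crypto involved)."""
    link, inner = pkt.headers[0], pkt.headers[1:]
    # The original IP packet becomes the ESP payload, and a new outer IP
    # header carrying the tunnel endpoint addresses is put in front of it.
    return Packet((link, "ip", "esp") + inner, tunnel_src, tunnel_dst)


if __name__ == "__main__":
    plain = Packet(("eth", "ip", "icmp"), "10.0.1.100", "10.0.2.100")
    print(esp_tunnel_encap(plain, "3.0.0.1", "5.0.0.1"))
```

The inner addresses 10.0.1.100 and 10.0.2.100 are still present in the encapsulated packet, but a match on ip saddr or ip daddr in the output and second postrouting traversal only sees the outer header.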
ICMP echo-reply h2 → h1, r1 traversal
step | netfilter hook / xfrm | encapsulation | iif | oif | ip saddr | ip daddr |
---|---|---|---|---|---|---|
1 | prerouting | [eth][ip][esp][ip][icmp] | eth1 | | 5.0.0.1 | 3.0.0.1 |
2 | input | [eth][ip][esp][ip][icmp] | eth1 | | 5.0.0.1 | 3.0.0.1 |
3 | xfrm/socket lookup | [eth][ip][esp][ip][icmp] | | | | |
4 | xfrm decode | [eth][......][ip][icmp] | | | | |
5 | prerouting | [eth][ip][icmp] | eth1 | | 10.0.2.100 | 10.0.1.100 |
6 | forward | [eth][ip][icmp] | eth1 | eth0 | 10.0.2.100 | 10.0.1.100 |
7 | postrouting | [eth][ip][icmp] | | eth0 | 10.0.2.100 | 10.0.1.100 |
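A practical consequence of this traversal is that some hooks see the same packet more than once, in different forms. The following Python sketch simply encodes the Netfilter steps of the echo-reply table as data and groups them per hook; the function name forms_per_hook is made up for this illustration and the xfrm steps are left out because no Netfilter hook is involved there.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Step = Tuple[int, str, Tuple[str, ...]]

# Netfilter steps of the echo-reply traversal (h2 -> h1) on r1:
# (step number, hook, header stack).
TRAVERSAL: List[Step] = [
    (1, "prerouting", ("eth", "ip", "esp", "ip", "icmp")),
    (2, "input", ("eth", "ip", "esp", "ip", "icmp")),
    (5, "prerouting", ("eth", "ip", "icmp")),
    (6, "forward", ("eth", "ip", "icmp")),
    (7, "postrouting", ("eth", "ip", "icmp")),
]


def forms_per_hook(traversal: Sequence[Step]) -> Dict[str, List[str]]:
    """Group the traversal by hook and note whether ESP was still present."""
    seen: Dict[str, List[str]] = defaultdict(list)
    for _step, hook, headers in traversal:
        seen[hook].append("esp" if "esp" in headers else "plain")
    return dict(seen)


if __name__ == "__main__":
    for hook, forms in forms_per_hook(TRAVERSAL).items():
        print(hook, forms)
```

In this direction the prerouting hook is traversed twice, first with the ESP packet and then with the decapsulated packet, so a rule placed there has to be written with both forms in mind.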
Context
The described behavior and implementation were observed on a Debian 10 (buster) system using Debian backports on amd64 architecture.
- kernel: 5.4.19-1~bpo10+1
- nftables: 0.9.3-2~bpo10+1
- libnftnl: 1.1.5-1~bpo10+1
- strongswan: 5.7.2-1
Feedback
Feedback to this article is very welcome!