
I am trying to figure out where a connection is being dropped in an SDN environment that combines nftables rules with an Open vSwitch bridge carrying a complex set of flow rules.

I have a connection originating from 111.222.73.199 and targeting 222.333.61.241 (neither is a real address). The destination address is accessible through a VLAN interface on the target host:

# ip addr show bond0.2180
9: bond0.2180@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 10:7d:1a:9c:7c:1d brd ff:ff:ff:ff:ff:ff
    inet 222.333.61.23/24 scope global bond0.2180
       valid_lft forever preferred_lft forever

The default route on that system is not out the public address; the main routing table looks like:

default via 10.30.6.1 dev bond0 proto dhcp src 10.30.6.23 metric 300
10.30.6.0/23 dev bond0 proto kernel scope link src 10.30.6.23 metric 300
10.30.10.0/23 dev bond0.2173 proto kernel scope link src 10.30.10.23 metric 402
10.88.0.0/16 dev cni-podman0 proto kernel scope link src 10.88.0.1 linkdown
10.128.0.0/14 dev tun0 scope link
10.255.116.0/23 via 10.30.10.1 dev bond0.2173 proto dhcp src 10.30.10.23 metric 402
172.30.0.0/16 dev tun0
222.333.61.0/24 dev bond0.2180 proto kernel scope link src 222.333.61.23

We have some policy-based routing rules in place to handle traffic over the public interface:

# ip rule show
0:      from all lookup local
32764:  from 222.333.61.0/24 lookup main suppress_prefixlength 0
32765:  from 222.333.61.0/24 lookup 200
32766:  from all lookup main
32767:  from all lookup default

Where routing table 200 has:

default via 222.333.61.1 dev bond0.2180
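
To make sure I understand the suppress_prefixlength 0 rule, here is a toy model (not kernel code, just my reading of the rule order) of how the lookups should proceed. Since 222.333.61.0/24 is deliberately not a real network, I am using 203.0.113.0/24 as a stand-in for the public /24 and 203.0.113.1 for the table-200 gateway:

```python
from ipaddress import ip_address, ip_network

# Stand-ins: 203.0.113.0/24 for the public /24, 203.0.113.1 for its gateway.
MAIN = {
    ip_network("0.0.0.0/0"): "via 10.30.6.1 dev bond0",
    ip_network("203.0.113.0/24"): "dev bond0.2180 (connected)",
}
TABLE_200 = {
    ip_network("0.0.0.0/0"): "via 203.0.113.1 dev bond0.2180",
}

def lpm(table, dst):
    """Longest-prefix match; returns (prefixlen, nexthop) or None."""
    hits = [(net.prefixlen, hop) for net, hop in table.items()
            if ip_address(dst) in net]
    return max(hits) if hits else None

def route(src, dst):
    """Mimic the ip-rule order: main with suppress_prefixlength 0 for
    public-sourced traffic, then table 200, then main for everyone else."""
    if ip_address(src) in ip_network("203.0.113.0/24"):
        hit = lpm(MAIN, dst)
        if hit and hit[0] > 0:  # suppress_prefixlength 0 discards /0 results
            return hit[1]
        hit = lpm(TABLE_200, dst)
        if hit:
            return hit[1]
    return lpm(MAIN, dst)[1]
```

In other words: traffic sourced from the public /24 stays on the connected segment when the destination is local, and otherwise goes out via the public gateway in table 200 rather than the main default route.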

With nftrace enabled, we can see the inbound packet enter the PREROUTING chain of the nat table and get as far as a dnat rule (this all looks fine):

trace id 7a66a648 ip nat PREROUTING packet: iif "bond0.2180" ether saddr 00:09:0f:09:00:22 ether daddr 10:7d:1a:9c:7c:1d ip saddr 111.222.73.199 ip daddr 222.333.61.241 ip dscp af21 ip ecn not-ect ip ttl 49 ip id 8129 ip length 60 tcp sport 47392 tcp dport 80 tcp flags == syn tcp window 64240
[...]
trace id 7a66a648 ip nat KUBE-SEP-CLHTNA52WCATND65 rule meta l4proto tcp   counter packets 0 bytes 0 dnat to 10.129.4.95:9991 (verdict accept)

Because this rule is reached through the PREROUTING chain, the dnat rewrite happens before the routing decision, so the subsequent route lookup uses the new destination address:

# ip route get 10.129.4.95
10.129.4.95 dev tun0 src 10.131.2.1 uid 0
    cache
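
That result is consistent with the 10.128.0.0/14 route in the main table, which also covers tun0's own 10.131.2.1 address; a quick check with Python's ipaddress module:

```python
from ipaddress import ip_address, ip_network

# The main-table route that should carry the DNATed destination.
tun0_route = ip_network("10.128.0.0/14")
covers_dst = ip_address("10.129.4.95") in tun0_route   # DNAT target
covers_src = ip_address("10.131.2.1") in tun0_route    # tun0's own address
```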

Where tun0 is an Open vSwitch internal interface:

# ip -d addr show tun0
14: tun0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 22:10:ac:4b:ca:3c brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 65535
    openvswitch numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535
    inet 10.131.2.1/23 brd 10.131.3.255 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 fe80::2010:acff:fe4b:ca3c/64 scope link
       valid_lft forever preferred_lft forever

Attached to the OVS bridge br0:

# ovs-vsctl show
02f8a53c-c970-419f-9c42-0b0be382638f
    Bridge br0
        fail_mode: secure
        [...]
        Port vxlan0
            Interface vxlan0
                type: vxlan
                options: {dst_port="4789", key=flow, remote_ip=flow}
        [...]
        Port br0
            Interface br0
                type: internal
        [...]
        Port tun0
            Interface tun0
                type: internal
        [...]
    ovs_version: "2.17.3"

I believe that at the point the packet is accepted by the dnat rule, we have:

  • source address: 111.222.73.199:47392
  • destination address: 10.129.4.95:9991

If we plug these values into ovs-appctl ofproto/trace, we get the following:

# ovs-appctl ofproto/trace br0 in_port=tun0,tcp,nw_src=111.222.73.199,nw_dst=10.129.4.95,tcp_src=47392,tcp_dst=9991
Flow: tcp,in_port=2,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,nw_src=111.222.73.199,nw_dst=10.129.4.95,nw_tos=0,nw_ecn=0,nw_ttl=0,tp_src=47392,tp_dst=9991,tcp_flags=0

bridge("br0")
-------------
 0. ct_state=-trk,ip, priority 1000
    ct(table=0)
    drop
     -> A clone of the packet is forked to recirculate. The forked pipeline will be resumed at table 0.
     -> Sets the packet to an untracked state, and clears all the conntrack fields.

Final flow: unchanged
Megaflow: recirc_id=0,ct_state=-trk,eth,ip,in_port=2,nw_frag=no
Datapath actions: ct,recirc(0x428ac)

===============================================================================
recirc(0x428ac) - resume conntrack with default ct_state=trk|new (use --ct-next to customize)
===============================================================================

Flow: recirc_id=0x428ac,ct_state=new|trk,eth,tcp,in_port=2,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,nw_src=111.222.73.199,nw_dst=10.129.4.95,nw_tos=0,nw_ecn=0,nw_ttl=0,tp_src=47392,tp_dst=9991,tcp_flags=0

bridge("br0")
-------------
    thaw
        Resuming from table 0
 0. ip,in_port=2, priority 200
    goto_table:30
30. priority 0
    goto_table:31
31. ip,nw_dst=10.128.0.0/14, priority 100
    goto_table:90
90. ip,nw_dst=10.129.4.0/23, priority 100, cookie 0x1173adfa
    move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31]
     -> NXM_NX_TUN_ID[0..31] is now 0
    set_field:10.30.6.19->tun_dst
    output:1
     -> output to kernel tunnel

Final flow: recirc_id=0x428ac,ct_state=new|trk,eth,tcp,tun_src=0.0.0.0,tun_dst=10.30.6.19,tun_ipv6_src=::,tun_ipv6_dst=::,tun_gbp_id=0,tun_gbp_flags=0,tun_tos=0,tun_ttl=0,tun_erspan_ver=0,gtpu_flags=0,gtpu_msgtype=0,tun_flags=0,in_port=2,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,nw_src=111.222.73.199,nw_dst=10.129.4.95,nw_tos=0,nw_ecn=0,nw_ttl=0,tp_src=47392,tp_dst=9991,tcp_flags=0
Megaflow: recirc_id=0x428ac,ct_state=-rpl+trk,eth,ip,tun_id=0/0xffffffff,tun_dst=0.0.0.0,in_port=2,nw_src=64.0.0.0/2,nw_dst=10.129.4.0/23,nw_ecn=0,nw_frag=no
Datapath actions: set(tunnel(tun_id=0x0,dst=10.30.6.19,ttl=64,tp_dst=4789,flags(df|key))),2

According to the above, the packet should be emitted over the vxlan0 tunnel to host 10.30.6.19...but we never see that traffic on the network.
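For what it's worth, the Megaflow reported by the trace does cover this packet (so a real packet with this 5-tuple should hit the same datapath flow), at least as I read the masks:

```python
from ipaddress import ip_address, ip_network

# Masks copied from the Megaflow line of the ofproto/trace output.
src_match = ip_address("111.222.73.199") in ip_network("64.0.0.0/2")   # nw_src=64.0.0.0/2
dst_match = ip_address("10.129.4.95") in ip_network("10.129.4.0/23")   # nw_dst=10.129.4.0/23
```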

Additionally, if I enable debug logging for the OVS dpif facility, like this:

ovs-appctl vlog/set file:dpif:dbg

I never see the source address (111.222.73.199), the destination address (10.129.4.95), or the destination port (9991) in the logs.

I am looking for any suggestions to help figure out where this connection is going (or even to verify that it is entering OVS as I expect).
