
We have a very specific requirement that I want to solve with Open vSwitch. It already partly works; can you show me what I'm missing?

Requirement: a Docker container connected to a macvlan interface exposes services on a specific port (it needs to broadcast on the local network). We need the services to be available on a different port, and there is no way to configure the port on which the service runs. We already tried different approaches (reverse proxy, Docker port-publishing directives, etc.), which didn't work for various reasons, mostly because we still have to stick to the IP of the macvlan interface.

That base setup is more or less fixed. My primary goal is to get it working that way, and I think it should be possible.

Environment: Arch Linux with Kernel core/linux 5.10.9 and packages community/openvswitch 2.14.1-1, community/docker 1:20.10.2-4

Enter Open vSwitch: we created the macvlan interface on an OVS bridge and want to use OpenFlow directives to change the port.

# ovs-vsctl show
... output omitted
    Bridge br1
        Port br1.200
            tag: 200
            Interface br1.200      <<< our container is connected here
                type: internal
        Port br1
            Interface br1
                type: internal
        Port patch-br0
            Interface patch-br0    <<< uplink to OVS bridge with physical interface
                type: patch
                options: {peer=patch-br1}

Using nginx for demonstration; this should work with any container.

# docker network create -d macvlan --subnet=172.16.0.0/20 --ip-range=172.16.13.0/29 --gateway=172.16.0.1 -o parent=br1.200 mv.200
# docker run -d --name web --network mv.200 nginx

So far, so good: curl http://172.16.13.0 (which is the container web in this case) returns the 'Welcome to nginx!' default page. Now we're trying the following OpenFlow configurations to make the container's service accessible on port 9080.

Variant 1:

# ovs-ofctl dump-flows br1
 cookie=0x0, duration=1647.225s, table=0, n_packets=16, n_bytes=1435, priority=50,ct_state=-trk,tcp,nw_dst=172.16.13.0,tp_dst=9080 actions=ct(table=0)
 cookie=0x0, duration=1647.223s, table=0, n_packets=3, n_bytes=234, priority=50,ct_state=+new+trk,tcp,nw_dst=172.16.13.0,tp_dst=9080 actions=ct(commit,nat(dst=172.16.13.0:80)),NORMAL
 cookie=0x0, duration=1647.221s, table=0, n_packets=11, n_bytes=956, priority=50,ct_state=+est+trk,tcp,nw_dst=172.16.13.0,tp_dst=9080 actions=ct(nat),NORMAL
 cookie=0x0, duration=1647.219s, table=0, n_packets=0, n_bytes=0, priority=50,ct_state=-trk,tcp,nw_src=172.16.13.0,tp_src=80 actions=ct(table=0)
 cookie=0x0, duration=1647.217s, table=0, n_packets=12, n_bytes=2514, priority=50,ct_state=+trk,tcp,nw_src=172.16.13.0,tp_src=80 actions=ct(nat),NORMAL
 cookie=0x0, duration=84061.461s, table=0, n_packets=309364, n_bytes=36251324, priority=0 actions=NORMAL

Outcome variant 1:

Now curl http://172.16.13.0:9080 only works if there is already an active flow; the first attempt fails (tcpdump -i br1.200 on the server).

Client > Server : 172.16.1.51:46056 > 172.16.13.0:80 SYN
Server > Client : 172.16.13.0:80 > 172.16.1.51:46056 SYN ACK
Client > Server : 172.16.1.51:46056 > 172.16.13.0:9080 ACK      (destination port not translated)
Server > Client : 172.16.13.0:9080 > 172.16.1.51:46056 RST      (unknown to server)
Server > Client : 172.16.13.0:80 > 172.16.1.51:46056 SYN ACK
Client > Server : 172.16.1.51:46056 > 172.16.13.0:80 RST        (already ACK'ed)

Client > Server : 172.16.1.51:46058 > 172.16.13.0:80 SYN        (second curl)
Server > Client : 172.16.13.0:80 > 172.16.1.51:46058 SYN ACK
Client > Server : 172.16.1.51:46058 > 172.16.13.0:80 ACK        (now with correct port 80)
... (normal TCP connection from here)

Packet #3 should be covered by flow #3, but apparently that is not working the way I thought.

# ovs-appctl dpctl/dump-conntrack | grep 172.16.13.0
tcp,orig=(src=172.16.1.51,dst=172.16.13.0,sport=46056,dport=9080),reply=(src=172.16.13.0,dst=172.16.1.51,sport=80,dport=46056),protoinfo=(state=CLOSING)
tcp,orig=(src=172.16.1.51,dst=172.16.13.0,sport=46058,dport=9080),reply=(src=172.16.13.0,dst=172.16.1.51,sport=80,dport=46058),protoinfo=(state=TIME_WAIT)

Can you help me understand why the ct(nat) action for the +trk+est flow is not working for the first connection (but does work for the second one)?
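One way to check which flow a packet would hit is ovs-appctl ofproto/trace with its --ct-next option, which lets you state the ct_state that the packet should carry after the ct() recirculation. Below is a rough sketch; trace_cmd is a made-up helper name, and the in_port and address values are assumptions taken from the ovs-vsctl output and the tcpdump above. If your OVS version does not accept a port name for in_port, use the OpenFlow port number instead.

```shell
#!/bin/sh
# Hypothetical helper: print an ofproto/trace command for a client packet to
# port 9080, with the ct_state that conntrack is assumed to report after ct().
# $1 = client source port, $2 = flags for --ct-next (e.g. "trk,est").
trace_cmd() {
    flow="in_port=patch-br0,tcp,nw_src=172.16.1.51,nw_dst=172.16.13.0,tp_src=$1,tp_dst=9080"
    printf "ovs-appctl ofproto/trace br1 '%s' --ct-next '%s'\n" "$flow" "$2"
}

# Packet #1 (SYN, new connection) and packet #3 (ACK, established).
trace_cmd 46056 trk,new
trace_cmd 46056 trk,est
```

The helper only prints the commands; on the OVS host they can be run with, for example, trace_cmd 46056 trk,est | sh. Keep in mind that the trace is a simulation of the OpenFlow tables: the ct_state comes from --ct-next, not from the real conntrack table, so it shows which flow matches but not what NAT state the datapath actually has.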

Variant 2: (add mod_tp_dst to flow #2)

# ovs-ofctl dump-flows br1
 cookie=0x0, duration=6182.935s, table=0, n_packets=0, n_bytes=0, priority=50,ct_state=-trk,tcp,nw_dst=172.16.13.0,tp_dst=9080 actions=ct(table=0)
 cookie=0x0, duration=6182.931s, table=0, n_packets=0, n_bytes=0, priority=50,ct_state=+new+trk,tcp,nw_dst=172.16.13.0,tp_dst=9080 actions=mod_tp_dst:80,ct(commit,nat(dst=172.16.13.0:80)),NORMAL
 cookie=0x0, duration=6182.928s, table=0, n_packets=0, n_bytes=0, priority=50,ct_state=+est+trk,tcp,nw_dst=172.16.13.0,tp_dst=9080 actions=ct(nat),NORMAL
 cookie=0x0, duration=6182.925s, table=0, n_packets=0, n_bytes=0, priority=50,ct_state=-trk,tcp,nw_src=172.16.13.0,tp_src=80 actions=ct(table=0)
 cookie=0x0, duration=6182.923s, table=0, n_packets=0, n_bytes=0, priority=50,ct_state=+trk,tcp,nw_src=172.16.13.0,tp_src=80 actions=ct(nat),NORMAL
 cookie=0x0, duration=81462.938s, table=0, n_packets=302990, n_bytes=35637543, priority=0 actions=NORMAL

Outcome variant 2:

When running curl http://172.16.13.0:9080, the situation is a bit better than in variant 1 (tcpdump -i eth0 on the client).

Client > Server : 172.16.1.51:45974 > 172.16.13.0:9080 SYN
Server > Client : 172.16.13.0:80 > 172.16.1.51:45974 SYN ACK     (response source port not translated)
Client > Server : 172.16.1.51:45974 > 172.16.13.0:80 RST         (unknown to client)
Client > Server : 172.16.1.51:45974 > 172.16.13.0:9080 SYN       (retransmission)
Server > Client : 172.16.13.0:9080 > 172.16.1.51:45974 SYN ACK   (now with correct port 9080)
Client > Server : 172.16.1.51:45974 > 172.16.13.0:9080 ACK

That way the connection always works, but it also adds the SYN retransmission timeout to the session setup delay.

# ovs-appctl dpctl/dump-conntrack | grep 172.16.13.0
tcp,orig=(src=172.16.1.51,dst=172.16.13.0,sport=45974,dport=80),reply=(src=172.16.13.0,dst=172.16.1.51,sport=80,dport=45974),protoinfo=(state=SYN_SENT)
tcp,orig=(src=172.16.1.51,dst=172.16.13.0,sport=45974,dport=9080),reply=(src=172.16.13.0,dst=172.16.1.51,sport=80,dport=1355),protoinfo=(state=TIME_WAIT)
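
To make dumps like this easier to compare, here is a small sketch that condenses each TCP entry to the ports that matter. ct_summary and its output labels are made up for this example; it only assumes the line format shown above. It flags whether the destination port was rewritten (dnat) and whether the reply tuple still points at the client's original source port (sport_kept).

```shell
#!/bin/sh
# Hypothetical helper (not part of OVS): summarise the TCP entries printed by
# `ovs-appctl dpctl/dump-conntrack`, read from stdin.
ct_summary() {
    awk '
    # Return the value of key (e.g. "dport") from a tuple like "orig=(a=1,b=2)".
    function field(tuple, key,    n, parts, i, kv) {
        gsub(/^[a-z]+=\(|\)$/, "", tuple)
        n = split(tuple, parts, ",")
        for (i = 1; i <= n; i++) {
            split(parts[i], kv, "=")
            if (kv[1] == key) return kv[2]
        }
        return ""
    }
    /^tcp,/ {
        match($0, /orig=\([^)]*\)/);  orig  = substr($0, RSTART, RLENGTH)
        match($0, /reply=\([^)]*\)/); reply = substr($0, RSTART, RLENGTH)
        match($0, /state=[A-Z_]+/);   state = substr($0, RSTART + 6, RLENGTH - 6)
        od = field(orig, "dport");  os = field(orig, "sport")
        rs = field(reply, "sport"); rd = field(reply, "dport")
        # dnat: destination port was rewritten; sport_kept: source port was not.
        printf "dport=%s served_from=%s dnat=%s sport_kept=%s state=%s\n",
               od, rs, (od != rs ? "yes" : "no"), (os == rd ? "yes" : "no"), state
    }'
}

# Sample input: the two entries from the dump above.
ct_summary <<'EOF'
tcp,orig=(src=172.16.1.51,dst=172.16.13.0,sport=45974,dport=80),reply=(src=172.16.13.0,dst=172.16.1.51,sport=80,dport=45974),protoinfo=(state=SYN_SENT)
tcp,orig=(src=172.16.1.51,dst=172.16.13.0,sport=45974,dport=9080),reply=(src=172.16.13.0,dst=172.16.1.51,sport=80,dport=1355),protoinfo=(state=TIME_WAIT)
EOF
```

On a live system the input would come from ovs-appctl dpctl/dump-conntrack | grep 172.16.13.0 | ct_summary. For the two entries above, the first one (dport=80) shows no DNAT at all, and the second one shows DNAT plus a changed source port (1355 instead of 45974), which may mean that conntrack had to pick another port because the first entry already occupied the same reply tuple.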

Can you help me understand why the first SYN ACK is received untranslated? Flow #5, with ct_state=+trk and actions=ct(nat), should have covered that.

Thanks for reading this long post. I'm thankful for any hints!

1 Answer


Found a way to get it working, though I'm still not sure why variant 1 did not.

The key seems to be that flow #3 in variant 1 didn't match correctly, or conntrack didn't have the NAT information available.

Here is the flow dump that works for me:

# ovs-ofctl dump-flows br1
 cookie=0x0, duration=160.870s, table=0, n_packets=14, n_bytes=1156, priority=50,tcp,nw_dst=172.16.13.0,tp_dst=9080 actions=ct(table=1)
 cookie=0x0, duration=160.867s, table=0, n_packets=11, n_bytes=2430, priority=50,tcp,nw_src=172.16.13.0,tp_src=80 actions=ct(table=1)
 cookie=0x0, duration=184012.978s, table=0, n_packets=558802, n_bytes=60818179, priority=0 actions=NORMAL
 cookie=0x0, duration=160.865s, table=1, n_packets=2, n_bytes=156, priority=50,ct_state=+new,tcp,nw_dst=172.16.13.0,tp_dst=9080 actions=ct(commit,nat(dst=172.16.13.0:80)),NORMAL
 cookie=0x0, duration=160.862s, table=1, n_packets=12, n_bytes=1000, priority=50,tcp,nw_dst=172.16.13.0,tp_dst=9080 actions=mod_tp_dst:80,NORMAL
 cookie=0x0, duration=160.860s, table=1, n_packets=11, n_bytes=2430, priority=50,tcp,nw_src=172.16.13.0,tp_src=80 actions=ct(nat),NORMAL
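
For anyone who wants to reproduce this, here is a minimal sketch of how these entries could be installed. working_flows is a made-up helper name; the flow entries are copied from the dump above with cookies and counters dropped, and the priority=0 NORMAL flow is left out because it was already present on the bridge. The script only prints the entries, so nothing changes on the switch until the output is fed to ovs-ofctl.

```shell
#!/bin/sh
# Hypothetical helper: print the flow entries from the dump above in the
# one-flow-per-line syntax that ovs-ofctl reads.
working_flows() {
    cat <<'EOF'
table=0,priority=50,tcp,nw_dst=172.16.13.0,tp_dst=9080,actions=ct(table=1)
table=0,priority=50,tcp,nw_src=172.16.13.0,tp_src=80,actions=ct(table=1)
table=1,priority=50,ct_state=+new,tcp,nw_dst=172.16.13.0,tp_dst=9080,actions=ct(commit,nat(dst=172.16.13.0:80)),NORMAL
table=1,priority=50,tcp,nw_dst=172.16.13.0,tp_dst=9080,actions=mod_tp_dst:80,NORMAL
table=1,priority=50,tcp,nw_src=172.16.13.0,tp_src=80,actions=ct(nat),NORMAL
EOF
}

# Dry run: show what would be installed.
working_flows

# To apply on the OVS host (assumes bridge br1 as in the question):
#   working_flows | ovs-ofctl add-flow br1 -
```

One thing to watch: in table 1, the ct_state=+new entry and the plain tp_dst=9080 entry overlap at the same priority, and OpenFlow does not define which of two overlapping equal-priority flows wins. The packet counters above suggest it behaves as intended here, but giving the +new entry a higher priority would probably be the safer choice.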
