Routing VXLANs

Now that we have configured a VXLAN (see the previous post: https://latebits.com/2020/01/31/vxlans-on-cumulus/ ), the next step is to configure a second one and route traffic between the two.

For routing VXLANs on Cumulus switches there are 3 options:

  1. Centralized routing: Specific VTEPs act as designated layer 3 gateways and perform routing between subnets; other VTEPs just perform bridging.
  2. Distributed asymmetric routing: Every VTEP participates in routing, but all routing is done at the ingress VTEP; the egress VTEP only performs bridging.
  3. Distributed symmetric routing: Every VTEP participates in routing and routing is done at both the ingress VTEP and the egress VTEP.

For my topology, I’ve chosen distributed asymmetric routing. The Cumulus documentation describes it as follows:
“In distributed asymmetric routing, each VTEP acts as a layer 3 gateway, performing routing for its attached hosts. The routing is called asymmetric because only the ingress VTEP performs routing, the egress VTEP only performs the bridging. Asymmetric routing is easy to deploy as it can be achieved with only host routing and does not involve any interconnecting VNIs. However, each VTEP must be provisioned with all VLANs/VNIs – the subnets between which communication can take place; this is required even if there are no locally-attached hosts for a particular VLAN.” (source: https://docs.cumulusnetworks.com/version/cumulus-linux-37/Network-Virtualization/Ethernet-Virtual-Private-Network-EVPN/ )

Since one VLAN and VXLAN are already configured from the previous post, we only need to add a second VLAN and VXLAN. The SVI for each VLAN needs to be configured with an anycast IP/MAC address. VRR is already configured for VLAN 100 on switches Cumulus 1 and 2, so for VLAN 100 we only have to add it on Cumulus 5.
VLAN 200 is new, so we need to add the anycast IP/MAC on all 3 switches.

For example, VLAN 100 needs to have this on switches Cumulus 1, 2 and 5 (the switches with VTEPs):
net add vlan 100 ip address-virtual 00:00:5e:00:01:00 192.168.1.254/24

net commands
============

switch1# net add vlan 200 ip address 192.168.2.252/24
switch1# net add vlan 200 ip address-virtual 00:00:5e:00:01:02 192.168.2.254/24

switch2# net add vlan 200 ip address 192.168.2.253/24
switch2# net add vlan 200 ip address-virtual 00:00:5e:00:01:02 192.168.2.254/24

switch3# net add interface swp6 bridge access 100
switch3# net add interface swp5 bridge access 200

switch5# net add interface swp6 bridge access 100
switch5# net add interface swp5 bridge access 200
switch5# net add vlan 200 ip address 192.168.2.251/24
switch5# net add vlan 100 ip address-virtual 00:00:5e:00:01:00 192.168.1.254/24
switch5# net add vlan 200 ip address-virtual 00:00:5e:00:01:02 192.168.2.254/24

Next, we need to create the VXLAN for VLAN 200 on all switches with VTEPs:

switch1# net add vxlan vni200 vxlan id 200
switch1# net add vxlan vni200 bridge access 200
switch1# net add vxlan vni200 bridge learning off
switch1# net add vxlan vni200 stp bpduguard
switch1# net add vxlan vni200 stp portbpdufilter
switch1# net add vxlan vni200 vxlan local-tunnelip 10.1.1.1
switch1# net add vxlan vni200 bridge arp-nd-suppress on

switch2# net add vxlan vni200 vxlan id 200
switch2# net add vxlan vni200 bridge access 200
switch2# net add vxlan vni200 bridge learning off
switch2# net add vxlan vni200 stp bpduguard
switch2# net add vxlan vni200 stp portbpdufilter
switch2# net add vxlan vni200 vxlan local-tunnelip 10.1.1.2
switch2# net add vxlan vni200 bridge arp-nd-suppress on

switch5# net add vxlan vni200 vxlan id 200
switch5# net add vxlan vni200 bridge access 200
switch5# net add vxlan vni200 bridge learning off
switch5# net add vxlan vni200 stp bpduguard
switch5# net add vxlan vni200 stp portbpdufilter
switch5# net add vxlan vni200 vxlan local-tunnelip 10.1.1.5
switch5# net add vxlan vni200 bridge arp-nd-suppress on
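
The three blocks above differ only in the local tunnel IP, so they can be generated instead of typed. Below is a minimal sketch, assuming a POSIX shell on any workstation; `vni_cmds` is a hypothetical helper name of my own, not an NCLU command, and it only prints the same lines used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print the NCLU commands for one VNI on one VTEP.
# Usage: vni_cmds <vni-id> <vlan-id> <local-tunnel-ip>
vni_cmds() {
  vni="vni$1"
  printf 'net add vxlan %s vxlan id %s\n' "$vni" "$1"
  printf 'net add vxlan %s bridge access %s\n' "$vni" "$2"
  printf 'net add vxlan %s bridge learning off\n' "$vni"
  printf 'net add vxlan %s stp bpduguard\n' "$vni"
  printf 'net add vxlan %s stp portbpdufilter\n' "$vni"
  printf 'net add vxlan %s vxlan local-tunnelip %s\n' "$vni" "$3"
  printf 'net add vxlan %s bridge arp-nd-suppress on\n' "$vni"
}

# Print the VNI 200 commands for Cumulus 5 (tunnel IP taken from this post).
vni_cmds 200 200 10.1.1.5
```

The printed lines can be pasted on the matching switch. Keep in mind that NCLU only stages `net add` changes: review them with `net pending` and apply them with `net commit` on each switch.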

Verification
============
net show evpn vni
net show bgp l2vpn evpn vni 100
net show bgp l2vpn evpn route
net show evpn mac vni 100
net show evpn mac vni 200
net show evpn arp-cache vni 100
net show evpn arp-cache vni 200

And that’s it. Each switch that terminates VXLANs can now route between them. This works because every switch with a VTEP has all VXLANs configured and maintains a MAC table and an ARP table per VNI, so it knows which VXLAN each IP belongs to and whether the host is local to the switch or behind a remote VTEP. The ingress switch only has to route the packet locally into the destination VXLAN and send it over that VXLAN; the remote switch just bridges the traffic to the destination.

Checking the ARP table for each VNI, you can see which of the learned IPs and MACs are local and which are remote (learned from another VTEP).
The example below is taken from the Cumulus 1 switch:

cumulus@cumulus:~$ net show evpn arp-cache vni 100
Number of ARPs (local and remote) known for this VNI: 6
IP                       Type   State    MAC               Remote VTEP           Seq #'s
fe80::200:5eff:fe00:100  local  active   00:00:5e:00:01:00                       0/0
192.168.1.20             remote active   00:50:79:66:68:01 10.1.1.5              0/0
192.168.1.252            local  active   0c:ce:df:31:43:06                       0/0
fe80::ece:dfff:fe31:4306 local  active   0c:ce:df:31:43:06                       0/0
192.168.1.254            local  active   00:00:5e:00:01:00                       0/0
192.168.1.10             local  active   00:50:79:66:68:00                       0/0
cumulus@cumulus:~$ net show evpn arp-cache vni 200
Number of ARPs (local and remote) known for this VNI: 6
IP                       Type   State    MAC               Remote VTEP           Seq #'s
fe80::ece:dfff:fe31:4306 local  active   0c:ce:df:31:43:06                       0/0
192.168.2.254            local  active   00:00:5e:00:01:02                       0/0
192.168.2.10             local  active   00:50:79:66:68:02                       0/0
192.168.2.20             remote active   00:50:79:66:68:03 10.1.1.5              0/0
192.168.2.252            local  active   0c:ce:df:31:43:06                       0/0
fe80::200:5eff:fe00:102  local  active   00:00:5e:00:01:02                       0/0

For example, you can see that 192.168.2.10 is local to this switch and 192.168.2.20 is remote (learned from VTEP 10.1.1.5, which is the Cumulus 5 switch).
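
On a larger table it is easier to filter than to read. A small sketch, assuming the column layout shown above (Type in the second column, Remote VTEP in the fifth for remote rows); `remote_entries` is a hypothetical helper name, not part of Cumulus.

```shell
#!/bin/sh
# Hypothetical helper: read "net show evpn arp-cache vni <id>" output on stdin
# and print "<ip> <remote-vtep>" for every entry learned from another VTEP.
remote_entries() {
  awk '$2 == "remote" { print $1, $5 }'
}

# Sample rows copied from the VNI 200 output above.
remote_entries <<'EOF'
IP                       Type   State    MAC               Remote VTEP           Seq #'s
192.168.2.254            local  active   00:00:5e:00:01:02                       0/0
192.168.2.10             local  active   00:50:79:66:68:02                       0/0
192.168.2.20             remote active   00:50:79:66:68:03 10.1.1.5              0/0
EOF
# prints: 192.168.2.20 10.1.1.5
```

On the switch itself the same function could be fed directly, for example `net show evpn arp-cache vni 200 | remote_entries`.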
Now let’s simulate a migration of these VMs: move 192.168.2.10 (PC3) to switch Cumulus 5 and 192.168.2.20 (PC4) to Cumulus 3. The two IPs have swapped places: 192.168.2.10 is now remote and 192.168.2.20 is local.

cumulus@cumulus:~$ net show evpn arp-cache vni 200
Number of ARPs (local and remote) known for this VNI: 6
IP                       Type   State    MAC               Remote VTEP           Seq #'s
fe80::ece:dfff:fe31:4306 local  active   0c:ce:df:31:43:06                       0/0
192.168.2.254            local  active   00:00:5e:00:01:02                       0/0
192.168.2.10             remote active   00:50:79:66:68:02 10.1.1.5              0/1
192.168.2.20             local  active   00:50:79:66:68:03                       0/0
192.168.2.252            local  active   0c:ce:df:31:43:06                       0/0
fe80::200:5eff:fe00:102  local  active   00:00:5e:00:01:02                       0/0
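
To spot such moves without comparing the tables by eye, the output can be saved before and after the migration and diffed on the Type column. This is only a sketch under the same column-layout assumption as the tables above; `moved_hosts` and the file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved "net show evpn arp-cache vni <id>"
# outputs and print every IP whose Type column changed between them.
# Usage: moved_hosts <before-file> <after-file>
moved_hosts() {
  awk '
    # Only data rows have "local" or "remote" in the second column.
    $2 != "local" && $2 != "remote" { next }
    # First file: remember the type seen for each IP.
    FNR == NR { before[$1] = $2; next }
    # Second file: report IPs whose type differs from the first file.
    ($1 in before) && before[$1] != $2 { print $1, before[$1], "->", $2 }
  ' "$1" "$2"
}

# Demo with the two moved hosts from the VNI 200 tables in this post.
tmp=$(mktemp -d)
cat > "$tmp/before.txt" <<'EOF'
192.168.2.10             local  active   00:50:79:66:68:02                       0/0
192.168.2.20             remote active   00:50:79:66:68:03 10.1.1.5              0/0
EOF
cat > "$tmp/after.txt" <<'EOF'
192.168.2.10             remote active   00:50:79:66:68:02 10.1.1.5              0/1
192.168.2.20             local  active   00:50:79:66:68:03                       0/0
EOF
moved_hosts "$tmp/before.txt" "$tmp/after.txt"
rm -r "$tmp"
```

For the tables in this post, the demo reports 192.168.2.10 going from local to remote and 192.168.2.20 going from remote to local. In the output above, the Seq #'s column for 192.168.2.10 also changes from 0/0 to 0/1 after the move.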

Documentation:
https://community.mellanox.com/s/article/evpn-symmetric-routing-with-mellanox-switches

About the author

Mihai is a Senior Network Engineer with more than 15 years of experience