Selecting multiple paths lets BGP load-balance traffic across several links. Equal-cost multipath (ECMP) is a routing strategy in which a router installs several paths of equal cost to the same destination and spreads traffic across them, typically by hashing each flow (that is, traffic with the same source and destination) onto one of those paths.
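To make the idea concrete, here is a minimal sketch of per-flow hashing in Python. It is only an illustration: real routers hash in hardware with their own functions and fields, SHA-256 is a stand-in, and the flows are hypothetical. The next-hop addresses are borrowed from the lab output shown later in this post.

```python
import hashlib

def pick_next_hop(flow, next_hops):
    """Map a flow 5-tuple onto one of the equal-cost next hops."""
    # Build a stable key from the flow fields and hash it.
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 4 bytes of the digest as an index into the next-hop list.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

# Illustrative values only (source IPs and ports are hypothetical).
next_hops = ["6.4.5.21", "6.4.5.30", "6.4.5.32"]
flow_a = ("198.51.100.7", 40001, "10.254.254.241", 80, "tcp")
flow_b = ("198.51.100.8", 40002, "10.254.254.241", 80, "tcp")

print(pick_next_hop(flow_a, next_hops))
print(pick_next_hop(flow_b, next_hops))
```

Packets of the same flow always hash to the same next hop, so they are not reordered, while different flows can land on different links.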
This post shows how we enabled ECMP for a MetalLB LoadBalancer service used by some applications, and what we configured on the SRL leaf routers to make it work. The lab setup is described in “CALICO AND METALLB WORKING TOGETHER WITH BGP”.
The following picture shows the topology of the lab: MetalLB and ECMP with SRLinux

Nokia SRLinux ECMP settings
These are the settings I am using on the SRLinux leaf router connected to all k8s nodes:
--{ + running }--[ network-instance ip-vrf1 protocols ]--
A:leaf2# info
    bgp-evpn {
        bgp-instance 1 {
            admin-state enable
            vxlan-interface vxlan1.4
            evi 4
            ecmp 4
        }
    }
    bgp {
        admin-state enable
        autonomous-system 65320
        router-id 6.5.3.2
        dynamic-neighbors {
            accept {
                match 6.4.5.0/26 {
                    peer-group metallb-bgp
                    allowed-peer-as [
                        65201
                    ]
                }
                match 192.168.101.0/24 {
                    peer-group calico-bgp
                    allowed-peer-as [
                        64512
                    ]
                }
            }
        }
        ebgp-default-policy {
            import-reject-all false
            export-reject-all false
        }
        group calico-bgp {
            admin-state enable
            export-policy export-calico
            import-policy import-all
            timers {
                minimum-advertisement-interval 1
            }
            transport {
                local-address 6.5.3.2
            }
        }
        group metallb-bgp {
            admin-state enable
            export-policy export-all
            import-policy import-all
            timers {
                minimum-advertisement-interval 1
            }
            transport {
                local-address 6.5.3.2
            }
        }
        ipv4-unicast {
            multipath {
                allow-multiple-as true
                max-paths-level-1 64
                max-paths-level-2 64
            }
        }
    }
    bgp-vpn {
        bgp-instance 1 {
            route-target {
                export-rt target:65123:4
                import-rt target:65123:4
            }
        }
    }
As you can see, the k8s nodes are connected through an EVPN Layer 2 domain. To use ECMP, you have to enable multipath in BGP as follows:
    ipv4-unicast {
        multipath {
            allow-multiple-as true
            max-paths-level-1 64
            max-paths-level-2 64
        }
    }
For MetalLB and ECMP with SRLinux you can use a value lower than 64. I am using the maximum value for testing purposes.
Don’t forget to enable ecmp in bgp-evpn as well (ecmp 4 in the configuration above).
Kubernetes MetalLB service settings
Since we are using BGP with ECMP, we want to avoid the extra hop that kube-proxy can add when it forwards traffic to a pod on another node. When announcing over BGP, MetalLB respects the service’s externalTrafficPolicy option, and implements two different announcement modes depending on which policy you select. If you’re familiar with Google Cloud’s Kubernetes load balancers, then you know what we are talking about here: MetalLB’s behaviors and tradeoffs are identical.
“Local” traffic policy
With the Local traffic policy, nodes will only attract traffic if they are running one or more of the service’s pods locally. The BGP routers will load-balance incoming traffic only across those nodes that are currently hosting the service. On each node, kube-proxy forwards the traffic only to local pods, so there is no “horizontal” traffic flow between nodes.
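The difference between the two announcement modes can be sketched in a few lines of Python. This is a simplified model, not MetalLB’s code: the function name, node names and pod counts are hypothetical, and the only assumption about the other mode (“Cluster”) is that every node announces the service IP.

```python
def advertising_nodes(all_nodes, pods_by_node, policy):
    """Return the nodes that would announce the service IP over BGP."""
    if policy == "Cluster":
        # Every node announces the IP, whether or not it hosts a pod.
        return sorted(all_nodes)
    if policy == "Local":
        # Only nodes running at least one pod of the service announce it.
        return sorted(n for n in all_nodes if pods_by_node.get(n, 0) > 0)
    raise ValueError(f"unknown externalTrafficPolicy: {policy}")

# Hypothetical cluster: three nodes, service pods on two of them.
nodes = ["node-a", "node-b", "node-c"]
pods = {"node-a": 2, "node-c": 1}

print(advertising_nodes(nodes, pods, "Local"))    # ['node-a', 'node-c']
print(advertising_nodes(nodes, pods, "Cluster"))  # ['node-a', 'node-b', 'node-c']
```

With Local, the leaf router’s ECMP set shrinks or grows as pods move between nodes, because each node adds or withdraws its announcement.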
Based on this, we’ll define the following LoadBalancer service for the ingress controller we set up in our last post, “Ingress and MetalLB”:
apiVersion: v1
kind: Service
metadata:
  name: ingress-ctl-lb
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  # externalTrafficPolicy is a field of spec, not an annotation, and its value is case-sensitive
  externalTrafficPolicy: Local
  ports:
    - name: http
      port: 80
      targetPort: 80
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller
Once this service is created, you will see it listed like this:
[root@ctl-a1 ~]# kubectl get svc -n ingress-nginx
NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)                      AGE
ingress-ctl-lb                       LoadBalancer   10.111.14.140   10.254.254.241   80:31541/TCP                 21h
ingress-nginx-controller             NodePort       10.96.20.11     <none>           80:31053/TCP,443:31764/TCP   22h
ingress-nginx-controller-admission   ClusterIP      10.100.46.67    <none>           443/TCP                      13d
Final results
On leaf router ‘leaf1’ you will see the following for the routes received from the k8s nodes and the other leaf nodes:
A:leaf1# /show network-instance ip-vrf1 protocols bgp routes ipv4 prefix 10.254.254.241/32 detail
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Show report for the BGP routes to network "10.254.254.241/32" network-instance "ip-vrf1"
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Network: 10.254.254.241/32
Received Paths: 4
Path 1: <Best,Valid,Used,>
Route source : neighbor 0.0.0.0
Route Preference: MED is -, LocalPref is 100
BGP next-hop : 0.0.0.0
Path : ?
Communities : None
RR Attributes : No Originator-ID, Cluster-List is [ - ]
Aggregation : Not an aggregate route
Unknown Attr : None
Invalid Reason : None
Tie Break Reason: none
Path 2: <Valid,>
Route source : neighbor 6.4.5.20
Route Preference: MED is -, LocalPref is 100
BGP next-hop : 6.4.5.20
Path : ? [65201]
Communities : None
RR Attributes : No Originator-ID, Cluster-List is [ - ]
Aggregation : Not an aggregate route
Unknown Attr : None
Invalid Reason : None
Tie Break Reason: as-path-length
Path 3: <Valid,>
Route source : neighbor 6.4.5.22
Route Preference: MED is -, LocalPref is 100
BGP next-hop : 6.4.5.22
Path : ? [65201]
Communities : None
RR Attributes : No Originator-ID, Cluster-List is [ - ]
Aggregation : Not an aggregate route
Unknown Attr : None
Invalid Reason : None
Tie Break Reason: peer-router-id
Path 4: <Valid,>
Route source : neighbor 6.4.5.31
Route Preference: MED is -, LocalPref is 100
BGP next-hop : 6.4.5.31
Path : ? [65201]
Communities : None
RR Attributes : No Originator-ID, Cluster-List is [ - ]
Aggregation : Not an aggregate route
Unknown Attr : None
Invalid Reason : None
Tie Break Reason: peer-router-id
Path 1 was advertised to:
[ 6.4.5.20, 6.4.5.22, 6.4.5.31 ]
Route Preference: MED is -, LocalPref is 100
Path : ? [65310]
Communities : None
RR Attributes : No Originator-ID, Cluster-List is [ - ]
Aggregation : Not an aggregate route
Unknown Attr : None
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--{ + running }--[ ]--
On leaf router ‘leaf2’ you will see the following for the routes received from the k8s nodes and the other leaf nodes:
A:leaf2# /show network-instance ip-vrf1 protocols bgp routes ipv4 prefix 10.254.254.241/32 detail
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Show report for the BGP routes to network "10.254.254.241/32" network-instance "ip-vrf1"
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Network: 10.254.254.241/32
Received Paths: 3
Path 1: <Best,Valid,Used,>
Route source : neighbor 6.4.5.21
Route Preference: MED is -, LocalPref is 100
BGP next-hop : 6.4.5.21
Path : ? [65201]
Communities : None
RR Attributes : No Originator-ID, Cluster-List is [ - ]
Aggregation : Not an aggregate route
Unknown Attr : None
Invalid Reason : None
Tie Break Reason: none
Path 2: <Best,Valid,Used,>
Route source : neighbor 6.4.5.30
Route Preference: MED is -, LocalPref is 100
BGP next-hop : 6.4.5.30
Path : ? [65201]
Communities : None
RR Attributes : No Originator-ID, Cluster-List is [ - ]
Aggregation : Not an aggregate route
Unknown Attr : None
Invalid Reason : None
Tie Break Reason: peer-router-id
Path 3: <Best,Valid,Used,>
Route source : neighbor 6.4.5.32
Route Preference: MED is -, LocalPref is 100
BGP next-hop : 6.4.5.32
Path : ? [65201]
Communities : None
RR Attributes : No Originator-ID, Cluster-List is [ - ]
Aggregation : Not an aggregate route
Unknown Attr : None
Invalid Reason : None
Tie Break Reason: peer-router-id
Path 3 was advertised to:
[ 6.4.5.30, 6.4.5.32 ]
Route Preference: MED is -, LocalPref is 100
Path : ? [65320, 65201]
Communities : None
RR Attributes : No Originator-ID, Cluster-List is [ - ]
Aggregation : Not an aggregate route
Unknown Attr : None
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--{ + running }--[ ]--
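To compare the two reports quickly, you can tally the flags on each “Path N:” line. The following is a small sketch, assuming the output format shown above; the helper name used_paths is made up for this example, and the sample strings are just the flag lines copied from the two reports.

```python
import re

# Matches lines such as "Path 2: <Valid,>" but not "Path 1 was advertised to:".
PATH_LINE = re.compile(r"^Path (\d+): <([^>]*)>")

def used_paths(report):
    """Return the path numbers whose flag list contains 'Used'."""
    used = []
    for line in report.splitlines():
        match = PATH_LINE.match(line.strip())
        if match:
            flags = [f for f in match.group(2).split(",") if f]
            if "Used" in flags:
                used.append(int(match.group(1)))
    return used

leaf1 = """Path 1: <Best,Valid,Used,>
Path 2: <Valid,>
Path 3: <Valid,>
Path 4: <Valid,>
Path 1 was advertised to:"""

leaf2 = """Path 1: <Best,Valid,Used,>
Path 2: <Best,Valid,Used,>
Path 3: <Best,Valid,Used,>
Path 3 was advertised to:"""

print(used_paths(leaf1))  # [1]
print(used_paths(leaf2))  # [1, 2, 3]
```

leaf1 uses a single path while leaf2 uses all three, which matches the route tables of both routers.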
Then, the route on leaf1 for this prefix will be:
A:leaf1# /show network-instance ip-vrf1 route-table ipv4-unicast prefix 10.254.254.241/32 detail
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
IPv4 Unicast route table of network instance ip-vrf1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Destination : 10.254.254.241/32
ID : 0
Route Type : bgp-evpn
Route Owner : bgp_evpn_mgr
Metric : 0
Preference : 170
Best : true
Last change : 2021-09-28T18:05:23.902Z
Resilient hash: false
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Next hops: 1 entries
1.1.1.2 (indirect) resolved by 1.1.1.2/32 (vxlan)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
And these are the routes on leaf2 for the same prefix:
A:leaf2# /show network-instance ip-vrf1 route-table ipv4-unicast prefix 10.254.254.241/32 detail
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
IPv4 Unicast route table of network instance ip-vrf1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Destination : 10.254.254.241/32
ID : 0
Route Type : bgp
Route Owner : bgp_mgr
Metric : 0
Preference : 170
Best : true
Last change : 2021-09-28T18:05:21.922Z
Resilient hash: false
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Next hops: 3 entries
6.4.5.21 (indirect) resolved by 6.4.5.21/32 (static)
via 192.168.101.21 (indirect) resolved by 192.168.101.0/24 (local)
via 192.168.101.1 (direct) via [irb0.0]
6.4.5.30 (indirect) resolved by 6.4.5.30/32 (static)
via 192.168.101.30 (indirect) resolved by 192.168.101.0/24 (local)
via 192.168.101.1 (direct) via [irb0.0]
6.4.5.32 (indirect) resolved by 6.4.5.32/32 (static)
via 192.168.101.32 (indirect) resolved by 192.168.101.0/24 (local)
via 192.168.101.1 (direct) via [irb0.0]
As you can see, leaf2 has multiple routes to different nodes in the Kubernetes cluster. leaf1, on the other hand, only uses the route towards leaf2 (learned through EVPN and resolved over VXLAN), and leaves leaf2 to manage all the routes to the different k8s nodes.
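Why one router ends up with a single next hop and the other with three can be illustrated with a simplified model of BGP multipath selection. This is a sketch under assumptions, not SR Linux’s actual decision process: it only compares LocalPref and AS-path length, and the Path class and multipath_set function are hypothetical names. The input values are taken from the BGP reports above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    neighbor: str
    local_pref: int
    as_path: tuple

def multipath_set(paths, max_paths):
    """Pick the best path, then keep the paths that tie with it, up to max_paths."""
    if not paths:
        return []

    def rank(p):
        # Higher LocalPref wins, then the shorter AS path (simplified).
        return (-p.local_pref, len(p.as_path))

    ordered = sorted(paths, key=rank)  # stable sort keeps the input order on ties
    best = ordered[0]
    tied = [p for p in ordered if rank(p) == rank(best)]
    return tied[:max_paths]

leaf1_paths = [
    Path("0.0.0.0", 100, ()),          # empty AS path in the leaf1 report
    Path("6.4.5.20", 100, (65201,)),
    Path("6.4.5.22", 100, (65201,)),
    Path("6.4.5.31", 100, (65201,)),
]
leaf2_paths = [
    Path("6.4.5.21", 100, (65201,)),
    Path("6.4.5.30", 100, (65201,)),
    Path("6.4.5.32", 100, (65201,)),
]

print([p.neighbor for p in multipath_set(leaf1_paths, 64)])  # ['0.0.0.0']
print([p.neighbor for p in multipath_set(leaf2_paths, 64)])  # ['6.4.5.21', '6.4.5.30', '6.4.5.32']
```

On leaf1 the path with the empty AS path beats the three node paths on as-path-length, the tie-break reason shown in its report, so it is the only one installed. On leaf2 the three node paths tie and all of them fit within the configured maximum.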
See ya and don’t forget to comment