BGP, Cilium, and FRR: Top of Rack For All!

I recently came across a LinkedIn post talking about the concepts above and how trivial they are to set up. The goal: use Cilium's BGP capabilities to either expose a service or export the pod CIDR and advertise its range to a peer. We are all on different chapters of our life's book, so I wanted to explain the setup a little more in order to possibly help someone out there add a feather to their hat!

Why would you want to expose a pod network directly using BGP? The concepts are relatively simple. ToR, or top of rack, refers to a data center design where a rack of servers all connect to a switch at the top of that rack, which in turn uplinks to an aggregation layer. In this scenario we have no load balancers in between, as Kubernetes is keen to use, nor do we need to expose NodePorts; just straight connections via advertised routes directly to the applications. Why set this up at home? It's likely that any services you run are one-offs serving things like Plex, Pi-hole, etc., and this makes it incredibly easy to connect to those applications directly.
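
Concretely, the end state is equivalent to pinning a static route on the router, except BGP learns and withdraws it automatically as nodes come and go. Using the addresses from my setup later in this post, the hand-rolled version would be:

ip route add 10.0.0.0/24 via 192.168.120.11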

The Setup

In my setup I will be using FRR on a UDM-SE and a Raspberry Pi running K3s and Cilium. Feel free to use a standalone *nix box to run FRR, but know that you will also need to add some static routes to it. On my UDM, the FRR package was already installed! Under the hood, Ubiquiti uses it for its Magic VPN feature. I don't use it, so it was straightforward to enable the systemd service with my own custom configuration, which I will show below. For more details, Chris's blog can show you everything you need to do.
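
Enabling it boils down to turning on the BGP daemon and dropping in a config. A minimal sketch, assuming the stock FRR package layout of /etc/frr/daemons and /etc/frr/frr.conf (paths may differ on your firmware):

# Turn on bgpd, install the config shown below, then start FRR
sed -i 's/^bgpd=no/bgpd=yes/' /etc/frr/daemons
vi /etc/frr/frr.conf
systemctl enable --now frr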

hostname UDM-SE
frr defaults datacenter
log stdout
service integrated-vtysh-config
!
!
router bgp 65001
 bgp router-id 192.168.120.254
 ! 192.168.120.11 is the Raspberry Pi running single-node K3s
 neighbor V4 peer-group
 neighbor 192.168.120.11 remote-as 65000
 neighbor 192.168.120.11 peer-group V4
 neighbor 192.168.120.11 default-originate
 !
 address-family ipv4 unicast
  redistribute connected
  redistribute kernel
  neighbor V4 soft-reconfiguration inbound
  neighbor V4 route-map ALLOW-ALL in
  neighbor V4 route-map ALLOW-ALL out
 exit-address-family
 !
route-map ALLOW-ALL permit 10
!
line vty
!

The above is my configuration for FRR. Straightforward, apart from the comment marking where the Raspberry Pi single-node K3s cluster lives. The default-originate line hands the Pi a default route, and the ALLOW-ALL route-map, having no match clauses, permits everything in both directions.
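
After writing the config, restart the service and confirm FRR actually parsed it; vtysh ships with the package:

systemctl restart frr
vtysh -c 'show running-config'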

Now it's on to Cilium and K3s. Note that you will need to disable flannel, servicelb, and network-policy. You can do this with a fresh install, by setting environment variables, or by editing the systemd service, as shown below. If you are running this on an existing installation, you will likely also need to remove the leftover flannel VXLAN interface. Run ip link show to verify its presence if you encounter a crash loop with Cilium.

ExecStart=/usr/local/bin/k3s \
    server \
        '--flannel-backend=none' \
        '--disable-network-policy' \
        '--disable=servicelb'
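
If you are starting fresh instead of editing the unit, the same flags can be passed through the K3s install script, and on an existing node the leftover flannel device has to go. A sketch, assuming flannel's default VXLAN device name of flannel.1:

# Fresh install: pass the flags via the install script
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --flannel-backend=none --disable-network-policy --disable=servicelb" sh -

# Existing install: reload the edited unit, then remove flannel's VXLAN device
sudo systemctl daemon-reload
sudo systemctl restart k3s
ip link show flannel.1 && sudo ip link delete flannel.1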

Installing Cilium with the binary takes a single flag for this use case: cilium install --set bgpControlPlane.enabled=true. After a successful installation, it's time to create the CiliumBGPPeeringPolicy. Below is my example with notes.

apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: custom-policy
spec:
  virtualRouters:
  - exportPodCIDR: true # allows the pod CIDR to be advertised
    localASN: 65000
    neighbors:
    - connectRetryTimeSeconds: 120
      eBGPMultihopTTL: 1
      holdTimeSeconds: 90
      keepAliveTimeSeconds: 30
      peerASN: 65001 # FRR ASN
      peerAddress: 192.168.120.1/32 # FRR address
      peerPort: 179
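
Apply it like any other resource and give the session a moment to establish. The filename is mine, and cilium status is a handy sanity check that the agent itself came up healthy:

kubectl apply -f cilium-bgp-policy.yaml
kubectl get ciliumbgppeeringpolicies
cilium status --wait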

Validation

On to validation. From the FRR side I run vtysh -c 'show ip bgp' and receive:

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.0.0/24      192.168.120.11
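
If the pod CIDR is missing here, check the session itself before anything else:

vtysh -c 'show ip bgp summary'

A neighbor stuck in Active or Connect usually means an address or ASN mismatch; a session that is up but shows zero prefixes received points at the Cilium side.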

From the machine where my Cilium binary is installed, with access to my K3s cluster, I run cilium bgp peers and receive:

Node     Local AS   Peer AS   Peer Address    Session State   Uptime     Family         Received   Advertised
pi       65000      65001     192.168.120.1   established     12h49m3s   ipv4/unicast   7          1
                                                                         ipv6/unicast   0          0

From here, if you had flannel installed previously, you will likely need to restart your pods so they pick up addresses from the new Cilium CNI range. Run a curl, or whatever you see fit, and verify it's working!
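
A quick way to do both, recreating the pods and finding an address to hit (the namespace is just an example):

kubectl -n default rollout restart deployment
kubectl get pods -A -o wide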

$ curl http://10.0.0.83:8080
Hello World!