Where and how do you even begin implementing BGP in an established production environment? First of all, I’ve been brainwashed by Ivan Pepelnjak’s blog into using BGP. If you search through his posts on BGP, it comes across as the clear choice (over competing protocols like OSPF or IS-IS) for any data center where scalability and flexibility are a concern.
So after reading Ivan’s relevant blog posts, and assuming you are convinced that BGP is the way to go, how do you convince your colleagues to consider it for an ever-expanding DC network when they work in the same enterprise environment as you and have little to no experience implementing BGP in production? My solution was to build a close replica of our routing environment in GNS3, with 24 routers and at an even higher scale (more interfaces, a larger routing table, etc.) than what we have seen in our production environment so far.
Download this GNS3 Topology here (also see full config at the bottom of this post)
As you saw in my last blog post, I am a HUGE fan of GNS3. As in the last post, I used GNS3 again this time for a proof of concept. This topology, however, is on a significantly larger scale (24 routers vs 13) and carries a significantly larger number of routes (~1000 vs ~50), with 900 loopbacks representing client networks in different VRFs, to replicate routing between three Data Centers. DC1 and DC2 are paired for high availability, while DC3 is meant for staging and development.
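To give a feel for what one of those simulated client networks looks like, here is a minimal sketch of a VRF with a single loopback in it. The VRF name 2-MZ is borrowed from the traceroutes further down; the route distinguisher, loopback number and addressing are placeholders I made up for illustration, not values from the actual topology.

```
! Hypothetical VRF definition; the RD value is a placeholder
ip vrf 2-MZ
 rd 65001:2
!
! One loopback standing in for a client network inside that VRF.
! "ip vrf forwarding" comes before "ip address" because assigning
! the VRF clears any address already configured on the interface.
interface Loopback101
 description Simulated client network (placeholder addressing)
 ip vrf forwarding 2-MZ
 ip address 10.101.1.1 255.255.255.0
```

Repeat the loopback stanza with different numbers and subnets and you get to hundreds of "client networks" without adding a single extra device.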
To give a complete picture of the routing between the three DCs, they are also connected via an MPLS cloud that simulates an MPLS service provider. As noted above, the main purpose of this particular GNS3 topology was to show how BGP can be used to exchange routes between DCs, both directly via a Data Center Interconnect (DCI) solution and over an MPLS cloud using your favourite MPLS service provider(s).
This topology serves as a proof of concept (PoC), and once BGP is implemented in the production environment it can continue to serve as a worry-free test bed for testing and future routing-related PoCs.
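As a rough idea of what exchanging routes over the DCI involves, here is a sketch of a per-VRF BGP session on a core router. It assumes each DC runs its own private AS and peers with eBGP across the DCI, which may not match the actual design in the downloadable configs; the AS numbers, neighbor address and prefix are all placeholders.

```
! Placeholder local AS 65001 peering with placeholder remote AS 65002
router bgp 65001
 address-family ipv4 vrf 2-MZ
  ! Hypothetical neighbor address on the DCI link
  neighbor 192.0.2.2 remote-as 65002
  neighbor 192.0.2.2 activate
  ! Advertise one placeholder client prefix from this VRF
  network 10.101.1.0 mask 255.255.255.0
 exit-address-family
```

The session towards the MPLS provider would look much the same from the DC side, just with a different neighbor and remote AS.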
Here is a traceroute from DC2-AGG-A (top-left) to a loopback address on DC3-CORE-A (top-right) going through the MPLS cloud:
DC2-AGG-A#traceroute vrf 22-OZ 184.108.40.206
  1 220.127.116.11 56 msec 28 msec 20 msec
  2 18.104.22.168 96 msec 76 msec 68 msec
  3 22.214.171.124 96 msec 136 msec 96 msec
  4 172.16.21.1 [MPLS: Labels 27/205 Exp 0] 112 msec 120 msec 96 msec
  5 172.16.16.2 [MPLS: Labels 29/205 Exp 0] 76 msec 128 msec 112 msec
  6 126.96.36.199 [MPLS: Label 205 Exp 0] 96 msec 84 msec 72 msec
  7 188.8.131.52 84 msec 92 msec 116 msec
  8 184.108.40.206 144 msec 144 msec 116 msec
  9 220.127.116.11 148 msec 128 msec 140 msec
This traceroute shows traffic between DC1 and DC2 over the DCI:
DC1-CORE-A#traceroute vrf 2-MZ 18.104.22.168
  1 22.214.171.124 100 msec 120 msec 84 msec
  2 126.96.36.199 84 msec 68 msec 116 msec
  3 188.8.131.52 164 msec 164 msec 152 msec
  4 184.108.40.206 156 msec 276 msec 200 msec
And this traceroute shows traffic between DC1 and DC2 over the MPLS cloud when the DCI links between them are shut down:
DC1-CORE-A#traceroute vrf 2-MZ 220.127.116.11
  1 18.104.22.168 104 msec 52 msec 4 msec
  2 22.214.171.124 40 msec 68 msec 84 msec
  3 172.16.11.1 [MPLS: Labels 21/37 Exp 0] 112 msec 120 msec 80 msec
  4 126.96.36.199 [MPLS: Label 37 Exp 0] 104 msec 84 msec 68 msec
  5 188.8.131.52 80 msec 88 msec 104 msec
  6 184.108.40.206 152 msec 140 msec 172 msec
  7 220.127.116.11 136 msec 156 msec 176 msec
Next steps? Manipulating BGP attributes to steer traffic as required: sending certain subnets over the DCI and others over the MPLS cloud, blocking certain subnets altogether, and other similar testing and customization.
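I haven't built this part yet, but one common way to do it is an inbound route-map on the DCI neighbor that filters some prefixes and raises local preference on others. The sketch below is only that: the prefix ranges, names, AS number and neighbor address are placeholders, and it assumes the per-VRF DCI neighbor is already configured.

```
! Placeholder prefix ranges: one set to prefer via the DCI, one to block
ip prefix-list VIA-DCI seq 5 permit 10.101.0.0/16 le 24
ip prefix-list BLOCKED seq 5 permit 10.199.0.0/16 le 24
!
! Drop the blocked subnets
route-map FROM-DCI deny 10
 match ip address prefix-list BLOCKED
! Prefer the DCI path for selected subnets (default local preference is 100)
route-map FROM-DCI permit 20
 match ip address prefix-list VIA-DCI
 set local-preference 200
! Accept everything else unchanged
route-map FROM-DCI permit 30
!
router bgp 65001
 address-family ipv4 vrf 2-MZ
  neighbor 192.0.2.2 route-map FROM-DCI in
 exit-address-family
```

A mirror-image route-map on the MPLS-facing neighbor, raising local preference for a different prefix-list, would push the remaining subnets over the MPLS cloud, with each path still acting as a backup for the other.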
Also, in this topology I used the same IOS image as in the previous topology, but reduced every device’s RAM from the default 256 MB to 128 MB. That let me run this massive 24-router topology with under 4 GB of RAM in use on my host PC (24 × 128 MB is 3 GB of allocated router memory). I would, however, recommend a relatively recent i5 or i7 (or equivalent) processor, as CPU usage fluctuates quite a bit and can run high at times.
If you found this post useful or informative, please leave me a comment! 🙂
Update (2016-11-21) – Adding config below for easy access to individual router configs: