How to build a Distributed Service Delivery Network


The internet has inherit problems in its physical connectivity and routing topologies a wide range of issues may occur that impact global availability of a service. Some examples include denial of service attacks that effect a segment of the internet or routing issues within a peer internet service provider. In any case the effect simple a percentage of computers are unable to reach the service destination. It is possible to solve this problem by constructing a DSDN (Distributed Service Delivery Network).

Here is an example for building a Distributed Service Delivery Network where the majority  of the users are in North America using datacenter service providers for both co-location and virtual machines on managed shared infrastructure.

Peering Relationships

When selecting a facility provider it is very important to evaluate the internet peering relationships that exist there the their distance to the core of the internet topology, CAIDA has done research in this area and is a great resource as shown below.

It is worth noting the following Autonomous System (AS) Numbers are dominant networks on the internet and you will want to measure the distance to these networks:

  • AS3356 Level 3 Communications
  • AS6939 Hurricane Electric
  • AS3549 Global Crossing Ltd
  • AS6461 Metromedia Fiber net
  • AS3257 Tinet SpA
  • AS1239 Sprint
  • AS2914 NTT America, Inc
  • AS174 Cogent/PSI
  • AS1299 TeliaNet Global Network
  • AS7018 AT&T Service Inc

The must have connections are AS3356, AS6939, AS174 and AS7018 and I would recommend they be first peer connections. There are multiple ways to measure these relationships such as trace routes or by route view looking glass services, I prefer to use BGPlay.

Two Datacenter Topology

It’s possible to achieve good IP Availability between two datacenters sites the common challenge in any design is ensuring consistent connectivity between datacenters. A dedicated Multiprotocol Label Switching (MPLS) private routing network with packet prioritization will meet this requirement. It is very important to select a single backbone carrier for your MPLS network when possible as such this should also be factored into picking a datacenter provider. You will also need to obtain your own AS Number as well as a single block of IP Addresses.

With quality internet peering connections in two datacenters and a private MPLS connection between sites, and AS-Number and IP Block its time to put it all together. In the example shown below you will use your AS-Number to advertise you IP block as a route to both datacenters which means both datacenters will respond to the same IP addresses, this is key. Next you will build identical application server farms that process requests in the same manner with one difference. Each application server farm will use connection strings for database servers that are locally present in each site. The last component is each database server to be configured for peer-to-peer merge replication.

image

As illustrated above each datacenter is advertising the same network and service then over the private MPLS network Peer-to-Peer transactional merge replication is used to keep the databases in each site in sync and then using database mirroring for availability in each datacenter. It is worth noting advertising the same network in both datacenters is not load balancing and you should be prepared in this design for all traffic to be routed to one datacenter. In fact by removing an advertisement to a datacenter you can maintenance the entire datacenter.

Another effect of multiple route advertisements for the public IP block is a symptom similar to geographical location services whereby users would connect to the closest datacenter, in this case they would end up routing to the closest datacenter by way of shortest path on the internet rather then by physical geographical locations.

Multiple Datacenter Topology

Extending on the two datacenter topology in the multiple datacenter topology add a concept of leaf nodes as shown below.

image

Leaf Nodes which has a similar in physical connectivity to a datacenter only provide have the application service and depend on the private MPLS network for connectivity to either datacenter. While your datacenter may be a private cloud solution your leaf nodes would be a ideal candidate to exist within shared managed virtual infrastructure where you only need to manage your configuration on your virtual machine instances. With leaf node is possible to improve services in broad geographical regions as shown in the example below.

image

Time Warner Internet Outage


For the past week Time Warner Cable (Road Runner, Adelphia) had two internet issues. The first of which and also the root cause was related to bad firmware update that also effected Time Warner Cable destabilizing their Boarder Gateway Protocol (BGP) mesh. http://www.eweek.com/c/a/IT-Infrastructure/Bug-in-Juniper-Router-Firmware-Update-Causes-Massive-Internet-Outage-709180/ The second of which was how Time Warner handled this  BGP issue. They deleted and recreated entire Autonomous System (AS) numbers which represent large segments of the internet, which is one way to fix this but it is kind of like turning of the power to an entire city to turn off the power to a single street light. As a result Time Warner users could route to destination networks on the internet but networks were unable to establish a return route in-short because Time Warner deleted and recreated those parts of the internet. Lots of major internet providers had the same root issue and they were able to recover and restore services an a timely manner, Time Warner on the other hand just couldn’t seem to get their act together.