How to build a Distributed Service Delivery Network


The internet has inherent problems in its physical connectivity and routing topologies, and a wide range of issues can impact the global availability of a service. Examples include denial-of-service attacks that affect a segment of the internet and routing issues within a peer internet service provider. In either case the effect is simple: a percentage of computers are unable to reach the service destination. It is possible to solve this problem by constructing a Distributed Service Delivery Network (DSDN).

Here is an example of building a Distributed Service Delivery Network where the majority of the users are in North America, using datacenter service providers for both co-location and virtual machines on managed shared infrastructure.

Peering Relationships

When selecting a facility provider, it is very important to evaluate the internet peering relationships that exist there and their distance to the core of the internet topology. CAIDA has done research in this area and is a great resource, as shown below.

It is worth noting that the following Autonomous System (AS) numbers belong to dominant networks on the internet, and you will want to measure the distance to these networks:

  • AS3356 Level 3 Communications
  • AS6939 Hurricane Electric
  • AS3549 Global Crossing Ltd
  • AS6461 Metromedia Fiber net
  • AS3257 Tinet SpA
  • AS1239 Sprint
  • AS2914 NTT America, Inc
  • AS174 Cogent/PSI
  • AS1299 TeliaNet Global Network
  • AS7018 AT&T Service Inc

The must-have connections are AS3356, AS6939, AS174 and AS7018, and I would recommend they be your first peer connections. There are multiple ways to measure these relationships, such as traceroutes or route-view looking glass services; I prefer to use BGPlay.
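As a rough illustration of measuring distance to these networks, here is a minimal Python sketch. It assumes you have already collected AS paths (for example from traceroutes or a looking glass) as ordered lists of AS numbers starting at the facility's own AS. The sample paths, the helper names, and AS64500/AS64501 (documentation-reserved numbers) are all hypothetical and are not taken from any real measurement.

```python
# Must-have networks named in the text above.
MUST_HAVE = {3356, 6939, 174, 7018}


def as_distance(as_path, target):
    """Return the number of AS hops from the start of as_path to target, or None."""
    # Collapse AS-path prepending (consecutive repeats of one ASN) first.
    collapsed = [asn for i, asn in enumerate(as_path)
                 if i == 0 or asn != as_path[i - 1]]
    return collapsed.index(target) if target in collapsed else None


def best_distances(as_paths, targets=MUST_HAVE):
    """For each target ASN, keep the shortest distance seen across all paths."""
    best = {}
    for path in as_paths:
        for target in targets:
            d = as_distance(path, target)
            if d is not None and (target not in best or d < best[target]):
                best[target] = d
    return best


# Hypothetical observations: each path starts at the facility's own AS
# (64500, a documentation-reserved ASN) and lists the ASNs traversed in order.
observed = [
    [64500, 3356, 7018],
    [64500, 64500, 6939],
    [64500, 64501, 174],
]

for asn, hops in sorted(best_distances(observed).items()):
    print(f"AS{asn}: {hops} AS hop(s) away")
```

A distance of 1 means the facility peers directly with that network; larger numbers mean traffic crosses intermediate networks first.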

Two Datacenter Topology

It’s possible to achieve good IP availability between two datacenter sites. The common challenge in any design is ensuring consistent connectivity between datacenters. A dedicated Multiprotocol Label Switching (MPLS) private routing network with packet prioritization will meet this requirement. It is very important to select a single backbone carrier for your MPLS network when possible, so this should also be factored into picking a datacenter provider. You will also need to obtain your own AS number as well as a single block of IP addresses.

With quality internet peering connections in two datacenters, a private MPLS connection between sites, an AS number and an IP block, it’s time to put it all together. In the example shown below you will use your AS number to advertise your IP block as a route to both datacenters, which means both datacenters will respond to the same IP addresses; this is key. Next you will build identical application server farms that process requests in the same manner, with one difference: each application server farm will use connection strings for database servers that are locally present in its own site. The last component is to configure each database server for peer-to-peer merge replication.

image

As illustrated above, each datacenter advertises the same network and service. Over the private MPLS network, peer-to-peer transactional merge replication keeps the databases in each site in sync, and database mirroring provides availability within each datacenter. It is worth noting that advertising the same network in both datacenters is not load balancing, and in this design you should be prepared for all traffic to be routed to one datacenter. In fact, by removing a datacenter’s advertisement you can take the entire datacenter out of service for maintenance.

Another effect of multiple route advertisements for the public IP block is behavior similar to geographical location services, whereby users connect to the closest datacenter. In this case they are routed to the closest datacenter by way of the shortest path on the internet rather than by physical geographical location.
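To make the routing behavior concrete, here is a small Python sketch that models "closest by shortest path" as the fewest AS hops on a toy graph, and shows what happens when one datacenter stops advertising. This is a simplification under stated assumptions: real BGP path selection also weighs local preference, policy and other attributes, and every node name in the graph is hypothetical.

```python
from collections import deque


def hops_from(graph, source):
    """Breadth-first search: AS-hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def serving_datacenter(graph, client, advertising):
    """Pick the advertising datacenter with the fewest hops from the client."""
    dist = hops_from(graph, client)
    reachable = [dc for dc in advertising if dc in dist]
    # Ties are broken by name only to keep the sketch deterministic.
    return min(reachable, key=lambda dc: (dist[dc], dc)) if reachable else None


# Hypothetical undirected adjacency; names are illustrative only.
links = [("client_west", "isp_a"), ("isp_a", "dc_west"),
         ("client_east", "isp_b"), ("isp_b", "dc_east"),
         ("isp_a", "isp_b")]
graph = {}
for a, b in links:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

both = {"dc_west", "dc_east"}
print(serving_datacenter(graph, "client_west", both))
print(serving_datacenter(graph, "client_east", both))
# Withdraw dc_west's advertisement for maintenance: all traffic shifts east.
print(serving_datacenter(graph, "client_west", {"dc_east"}))
```

The last call illustrates why each site must be sized to carry all traffic: once one advertisement is withdrawn, every client lands on the remaining datacenter.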

Multiple Datacenter Topology

Extending the two datacenter topology, the multiple datacenter topology adds the concept of leaf nodes, as shown below.

image

Leaf nodes are similar in physical connectivity to a datacenter but only provide the application service, and they depend on the private MPLS network for connectivity to either datacenter. While your datacenter may be a private cloud solution, your leaf nodes would be ideal candidates to exist within shared managed virtual infrastructure, where you only need to manage the configuration of your virtual machine instances. With leaf nodes it is possible to improve services in broad geographical regions, as shown in the example below.

image

Evolving Monitoring for Cloud Network Fabrics


With the rapid global growth in the consumerization of information technology, monitoring your application or service has grown increasingly complex. Simply monitoring your solution from one or a few locations doesn’t provide effective visibility into global availability, nor does it identify where connectivity issues may be occurring within the internet.

I evaluated options such as services that monitor your application from multiple locations, like AlertSite, and solutions that generate statistical analysis, like Google Analytics, but they couldn’t monitor internet fabrics. The real-time web monitor at Akamai was closer but didn’t provide enough granularity. I needed to report on and filter my monitoring data by the following data dimensions, so I built an ETL data warehouse solution.

  • Date time
  • Client public IP address
  • Number of workstations behind each public IP address
  • Client internet service provider
  • Client location represented as geospatial longitude and latitude
  • Client location represented as City, Region, Country
  • Client public IP reverse DNS lookup
  • Client average response time
  • Correlated relationships between client ISP networks (ARIN)
  • ARIN network identity information
  • A lifecycle that dynamically adds and removes data
  • Identify BGP prefixes within AS numbers and dynamically generate routing topologies and monitor for changes
  • Collect performance and availability times for all internet fabric paths and dynamically represent that data
  • Collect real internet fabric network topologies by identifying the actual routes that packets follow
  • Granularity to the point that a single client internet connection can be identified and analyzed all the way back to the service endpoint.
  • Broad reporting to alert on and identify broad issues with peering relationships and the internet in general.
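As a sketch of how the per-client dimensions in the list above could be held and filtered, here is a small Python example. It is not the ETL warehouse itself: the field names, the sample values (documentation-reserved IP addresses, made-up ISPs and response times) and the `average_response_by` helper are all assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass(frozen=True)
class ClientSample:
    """One monitoring record covering the per-client dimensions listed above."""
    observed_at: datetime
    public_ip: str
    workstation_count: int
    isp: str
    latitude: float
    longitude: float
    city: str
    region: str
    country: str
    reverse_dns: str
    avg_response_ms: float


def average_response_by(samples, dimension):
    """Group samples by one attribute name and average response time per group."""
    groups = defaultdict(list)
    for sample in samples:
        groups[getattr(sample, dimension)].append(sample.avg_response_ms)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical records; every value here is invented for the example.
when = datetime(2011, 1, 1, 12, 0)
samples = [
    ClientSample(when, "192.0.2.10", 5, "ISP A", 49.28, -123.12,
                 "Vancouver", "BC", "CA", "host10.example.net", 80.0),
    ClientSample(when, "192.0.2.11", 2, "ISP A", 50.45, -104.61,
                 "Regina", "SK", "CA", "host11.example.net", 120.0),
    ClientSample(when, "198.51.100.7", 9, "ISP B", 47.61, -122.33,
                 "Seattle", "WA", "US", "host7.example.org", 200.0),
]

print(average_response_by(samples, "isp"))
print(average_response_by(samples, "country"))
```

Grouping by a different attribute name, such as `city` or `region`, gives the same report along another dimension without changing the record type.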

Here is a screenshot of what the beta of this technology looks like. The example below shows clients connecting to services at iQmetrix.

clip_image002