Oct 18, 2013

In a previous post, I talked about documenting for co-workers how to set up the CS1000E. What prompted that documentation was the excessive amount of time I spent cleaning up after a field person who refused to install the system properly, and the subsequent complaints from customers about why the phone system went down during maintenance windows.

Having the ability to add system redundancy (or resiliency) does not necessarily mean that a customer requires it, and sometimes the lack of redundancy simply is not a factor in the customer’s thinking. Sometimes, knowing you can prevent something (if you pay the associated costs) is not worth the money or time.

That is a different matter from arguing that the system doesn’t work a particular way. Today I ran into exactly this problem with a customer who did not have the necessary redundant network connections and experienced an outage as a result.

In this particular case, either the cable or the data switch port (or both) went bad. Had the customer installed the necessary redundancy, the failure of a single port would not have been noticed and the system would have kept on trucking.

As part of the post-event discussion, I walked them through how the architecture supports additional redundancy and the extent to which that redundancy can be expanded. I decided to work up a diagram to more fully explain what I was talking about.

This diagram shows a CPPM CS (Call Processor Pentium Mobile – Call Server) connected to the passthrough port on the MGC. The passthrough port permits you to simulate increased CS redundancy to the ELAN network by passing through to either active MGC ELAN interface.

The downside of this connection is that if the MGC undergoes maintenance, or the cable goes bad, you still have a single point of failure.
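To make that single point of failure concrete, here is a rough sketch that models the passthrough layout as a small graph and checks which single component, if it fails, cuts the CS off from the ELAN. The component names, the choice to treat cables and links as nodes, and the assumption that the faceplate and rear-chassis ELAN links land on two different data switches are all mine for illustration; this is not output from, or an API of, the CS1000 itself.

```python
from collections import deque


def reachable(links, start, goal, failed=None):
    """Breadth-first search from start to goal, skipping the failed component."""
    if failed in (start, goal):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in links.get(node, ()):
            if nxt != failed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(edges, start, goal):
    """Return every component whose failure alone disconnects start from goal."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    candidates = sorted(n for n in links if n not in (start, goal))
    return [n for n in candidates if not reachable(links, start, goal, failed=n)]


# Hypothetical model of the passthrough layout (names are illustrative only).
# Cables and links are nodes too, so a bad cable can be "failed" like a device.
PASSTHROUGH = [
    ("CS", "CS-MGC cable"),
    ("CS-MGC cable", "MGC"),
    ("MGC", "faceplate ELAN link"),
    ("MGC", "rear ELAN link"),
    ("faceplate ELAN link", "switch A"),  # assumption: each link goes to its own switch
    ("rear ELAN link", "switch B"),
    ("switch A", "ELAN"),
    ("switch B", "ELAN"),
]

spof = single_points_of_failure(PASSTHROUGH, "CS", "ELAN")
print(spof)  # ['CS-MGC cable', 'MGC']
```

Under those assumptions, losing either ELAN link or either data switch is survivable, but the MGC and the cable between the CS and the MGC each take the CS off the ELAN by themselves. Deleting the two "rear ELAN link" entries from the list shows what the single-homed case looks like: the faceplate link and its switch join the list.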

I would do this only when the environment is also going to deploy redundant TLAN/ELAN data switches for increased network resiliency. Otherwise, connecting the CS directly to the ELAN network makes more architectural sense to me. (That way, if you’re installing loadware on the MGC associated with the CS, you don’t cause outages to the entire system when the MGC is rebooted. There are architectural decisions that can work around some of that as well, but we’re not going to cover every possible scenario in this article. Please feel free to comment below if you have questions or want to share your observations.)

The diagram also shows the redundant connections from the MGC (faceplate & rear-chassis) connected to a redundant data network. NOTE: I do not show the data switch connectivity with the rest of the network. That’s sort of beyond the scope of the CS1000 resiliency feature. You can, I’m sure, get the gist of it from this article.