Updated my Google doc table of IP Phone firmware:
When IP Phones enter a reboot loop: they attempt to “upgrade”, fail (sometimes with “FW authentication failure”), then reboot again.
UNIStim 5.0 or earlier
Avaya IP Phone 1100, Avaya IP Phone 1200
UNIStim firmware is digitally signed.
Signature has an expiration date.
UNIStim versions prior to 5.0 had shorter expiration dates.
New IP Phone hardware will not load firmware with expired signatures.
Use UNIStim 5.1 or later firmware.
Avaya has applied a digital signature with a 10 year expiration date to UNIStim 5.1 and later.
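As a rough illustration of the rule above, here is a minimal Python sketch that flags a UNIStim release as at risk when it is older than 5.1. The function names and the dotted release-string input (for example "5.0") are assumptions made for this example; they are not taken from any Avaya tool.

```python
from typing import Tuple

# First UNIStim release that, per the notes above, carries the 10 year signature.
MIN_SAFE_RELEASE = (5, 1)


def parse_release(release: str) -> Tuple[int, ...]:
    """Turn a dotted release string such as "5.1" into a comparable tuple."""
    return tuple(int(part) for part in release.strip().split("."))


def signature_at_risk(release: str) -> bool:
    """Return True when the release predates UNIStim 5.1 (hypothetical helper)."""
    return parse_release(release) < MIN_SAFE_RELEASE


if __name__ == "__main__":
    for release in ("4.3", "5.0", "5.1", "5.4"):
        print(release, "at risk" if signature_at_risk(release) else "ok")
```

Comparing tuples rather than strings keeps a release such as "5.10" from sorting before "5.2".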
I updated my Google drive table of UNIStim firmware releases.
In a previous post, I talked about an experience I had in documenting for co-workers how to set up the CS1000E. The root cause of that documentation was the excessive amount of time I spent cleaning up after a field person who refused to install this properly and the subsequent complaints from customers on why the phone system went down during maintenance windows.
Having the ability to add system redundancy (or resiliency) does not necessarily mean that a customer requires said redundancy, but sometimes the lack of the redundancy is not a factor in the customer’s thinking. Sometimes, knowing you can prevent something (if you pay the associated costs) is not worth the money or time.
This is a different decision from arguing that something doesn’t work a particular way– and today I ran into this problem with a customer who did not have the necessary redundant network connections and experienced an outage as a result.
In this particular case, either the cable or the data switch port (or both) went bad. Had the customer installed the necessary redundancy, the failure of a single port would not have been noticed and the system would have kept on trucking.
As part of the post event discussion, I walked them through how the architecture supported additional redundancy and the extent to which that redundancy can be expanded. I decided to work up a diagram to more fully explain what I was talking about.
This diagram shows a CPPM CS (Call Processor Pentium Mobile – Call Server) connected to the passthrough port on the MGC. The passthrough port permits you to simulate increased CS redundancy to the ELAN network by passing through to either active MGC ELAN interface.
The downside of this connection is that if the MGC undergoes maintenance, or the cable goes bad, you still have a single-point-of-failure.
I would do this primarily only when the environment is also going to deploy redundant TLAN/ELAN data switches for increased network resiliency. Otherwise, connecting the CS directly to the ELAN network makes more architectural sense to me. (That way if you’re installing loadware on the MGC associated with the CS, you don’t cause outages to the entire system when the MGC is rebooted– although there are architectural decisions that can be made to work around some of that as well but we’re not going to cover every possible scenario in this article. Please feel free to comment below to engage in a discussion if you have questions or want to share your observations.)
The diagram also shows the redundant connections from the MGC (faceplate & rear-chassis) connected to a redundant data network. NOTE: I do not show the data switch connectivity with the rest of the network. That’s sort of beyond the scope of the CS1000 resiliency feature. You can, I’m sure, get the gist of it from this article.
A co-worker was recently tasked with providing cross-training on T1 troubleshooting and alarm clearing for a product I do not have much experience with. After taking this class, I decided it would be fun to put together something similar (in blog format) for CS1000 T1 alarm clearing.
There are a variety of PRI alarms, but we’ll take one of them as an example:
Some systems have part of the alarm lookup database on-system, which can be accessed via the Overlay Loader (OVL000 and a > prompt) using the ERR command.
If the alarm library is loaded with that alarm, then you’ll get the help text. If not, you’ll get an error:
OVL441 Help text not found for error code: [code]
In the documentation, every alarm number is four digits long after the three-letter alarm group code. DTA is the digital trunking alarm group, and 021 is the specific alarm, so you can find it in the documentation by searching for DTA0021. From the documentation we get the alarm text:
Frame alignment alarm persisted for 3 seconds
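The padding rule described above (three-letter group code, number padded to four digits) can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this example, and the pattern assumes the three-letter group format stated above.

```python
import re

# Three letters for the alarm group, then one to four digits for the alarm number.
_ALARM_RE = re.compile(r"^([A-Za-z]{3})(\d{1,4})$")


def doc_search_key(alarm: str) -> str:
    """Convert an on-screen alarm such as "DTA021" into the "DTA0021" form
    used when searching the documentation (hypothetical helper)."""
    match = _ALARM_RE.match(alarm.strip())
    if match is None:
        raise ValueError(f"unrecognised alarm code: {alarm!r}")
    group, number = match.groups()
    # Zero-pad the numeric part to four digits.
    return f"{group.upper()}{int(number):04d}"


if __name__ == "__main__":
    print(doc_search_key("DTA021"))
```

Feeding it an already padded code such as "DTA0021" returns the same string, so it is safe to run on either form.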
Let’s talk briefly about some of the different tools available for troubleshooting:
DTI and PRI diagnostics (LD 60) cover a variety of tasks: you can enable or disable loops, clocking, and individual bearer channels (B-channels or BCH), and print or clear counters. For a full list of commands, see the Software Input/Output Reference – Maintenance (NN43001-711).
|STAT [loop channel]||Show status of all loops or loop specified. Loop status includes loop state and BCH state.|
|SSCK [loop shelf]||Show system clock. Includes which circuit is being used for primary clocking and clock state.|
|SWCK||Swap clock from current active to current standby|
|TRCK [Source]||Set clock controller tracking to PCK/Primary Clock, SCK/Secondary Clock, FRUN/Free run-no clock.|
|ENLL [loop]||Enable loop|
|ENCH [loop channel]||Enable B-channel|
|DISL [loop]||Disable loop|
|DISI [loop]||Disable loop when idle. Disables any IDLE channel then waits till other channels are disabled. Loops until all B-channels are disabled then disables loop.|
|DSCH [loop channel]||Disable B-channel|
DCH diagnostics (LD 96) cover enabling and disabling D-channels, running D-channel monitors, and working with MSDL or TMDI cards. On larger legacy 1000M systems, the Multipurpose Serial Data Link (MSDL) card is used to provide D-channel functionality. On smaller 1000M systems and newer 1000E systems, the D-channel functionality is built into the TMDI (T1 Multipurpose Digital Interface) card. In this article, we will not be discussing D-channel diagnostics for MSDL cards on larger 1000M systems.
|STAT DCH [dch]||Show status of all/specific DCH.|
|ENL DCH [dch]||Enable DCH|
|DIS DCH [dch]||Disable DCH|
|STAT TMDI [card]||Show status of TMDI. (CS1000M small system)|
|STAT TMDI [loop shelf card]||Show status of TMDI. (CS1000E)|
|DIS TMDI [card]||Disable TMDI. (CS1000M small system)|
|DIS TMDI [loop shelf card]||Disable TMDI. (CS1000E)|
|SLFT TMDI [card]||Selftest TMDI. (CS1000M small system) Performs multiple hardware tests to verify TMDI is functional.|
|SLFT TMDI [loop shelf card]||Selftest TMDI. (CS1000E) Performs multiple hardware tests to verify TMDI is functional.|
|ENL TMDI [card [fdl]]||Enable TMDI. (CS1000M small system) Optional FDL/Full Download of TMDI EPROM.|
|ENL TMDI [loop shelf card [fdl]]||Enable TMDI. (CS1000E) Optional FDL/Full Download of TMDI EPROM.|
|PLOG DCH [dch]|
The command architecture for the CS1000 is built on the older Meridian-1 systems, which in turn is built upon the even older SL-1 systems. When the SL-1 hardware architecture was replaced or improved, Nortel introduced new commands or Overlays as needed, all while keeping the essential command structure introduced with the first SL-1 system in the mid-1970s.
Print Routine 1 covers peripheral programming, including the bearer channel (BCH) configuration for a T1. From the Terminal Number configuration of a BCH, it is possible to identify the route membership for a particular channel, and by extension the T1. (While it is technically possible to configure different channels within a T1 to belong to multiple routes, I’ve never seen this and excepting MUXed circuits I am not aware of any reason why it might be done.)
Print Routine 2 covers customer datablock configurations, including route datablock (RDB) settings. By using the List Trunk Members command, it is possible to identify all of the BCH (and by extension all the T1s) that belong to a particular route (i.e., trunk group).
Print Routine 3 covers hardware and system configuration data, such as the Common Equipment (CEQU) datablock and Action Device and Number (ADAN) datablock, the latter of which is used to store information about D-channel configuration.
When building a PRI/DTI in a CS1000 system for the first time, the Clock Controller and Alarm Threshold values must be set. For systems in the USA, the DDB (digital data block) configuration record contains the relevant configuration settings.
LD 60 and 96 are used primarily for diagnostics. Overlays 20-22 and 73 would be used for configuration review to assist with diagnostics. For the purposes of this article, we will assume that a configuration issue is not at fault. Perhaps in some future article I might cover PRI configuration in more detail.
NN43001-611 Software Input/Output Reference – Administration
NN43001-711 Software Input/Output Reference – Maintenance
NN43001-712 Software Input/Output Reference – System Messages
NN43001-301 ISDN Primary Rate Interface Installation and Commissioning
Increasing boot efficiency is one of those things I’m working on: my personal or work PC, my IP Phone, the systems I manage. The less time I spend sitting around waiting for something to boot up, the more time I have for doing something productive. On the PC, that involves looking at your startup folder and your registry Run keys, and removing any unnecessary services from automatic startup.
For Avaya CS1000 IP Phones, that involves looking at the config and determining which features can be added or removed to achieve an optimal boot up sequence.
Although my 4th post is not live yet (when it is, it will be here), in it I cover Link Layer Discovery Protocol (LLDP) and how it applies to Avaya CS1000 IP Phone deployment. One of the biggest inefficiencies I’ve found in CS1000 IP Phone deployments is where customers leave LLDP enabled but don’t use it.
Waiting for LLDP-MED (Link Layer Discovery Protocol, Media Endpoint Discovery) can add as much as 30 seconds delay to the boot process… So disable it if you’re not using it!
With stickiness, you can configure the Phone to not use LLDP on bootup, or you can disable it manually at each phone by turning it off.
On the other hand, if you use LLDP you might increase boot efficiency by distributing the configuration of the IP Phones and reducing dependency upon DHCP. If you want to configure the Voice VLAN but don’t use LLDP, your options are to manually configure each IP Phone or use the VLAN-A option to assign a Voice VLAN ID.
If you use DHCP though, you’re going to be querying the DHCP server (or multiple servers) multiple times.
It’s certainly faster than waiting for LLDP-MED to time out, but using LLDP-MED is faster than multiple DHCP queries (although we’re talking about a fraction of the delay caused by leaving LLDP-MED enabled and unused).
It’s also a good idea to reduce the number of retries to allow the IP Phone to failover to an alternate signaling server (i.e. Connect Server) more quickly.
Recently found myself troubleshooting Untagged and Unregistered frame filtering on an Avaya Ethernet Routing Switch. This is a quick tutorial for future discussions with other engineers about how filtering works on an Avaya Ethernet Switch.
A quick description of tagged vs registered:
In the above diagram, if the packet egressing from the first data switch is tagged with VLAN 10 (PVID or Port VLAN ID 10), then the packet is both tagged and registered when it ingresses on the second data switch. However, if the packet is tagged with VLAN 20, the packet is tagged but unregistered when it ingresses on the second data switch.
Untagged or unregistered filtering is for environments where the administrator wishes to protect against mistakes in recabling network devices or in VLAN configuration. One historical issue with Nortel Ethernet Routing Switches is that if an unregistered frame is received and the receiving data switch does not know what to do with it, it may divert the packet to VLAN 1 (the default VLAN ID on all data switches). This can result in accidental broadcasts of extraneous packets.
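To make the tagged/registered distinction concrete, here is a small Python model of the ingress decision. It is a simplified sketch, not how the switch firmware is implemented: the class, field and function names are invented for this example, and it assumes a frame counts as "registered" when its VLAN tag is one the ingress port is a member of.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class IngressPort:
    """Hypothetical model of a switch port's VLAN membership and filters."""
    member_vlans: Set[int]
    filter_untagged: bool = False
    filter_unregistered: bool = False


def classify(vlan_tag: Optional[int], port: IngressPort) -> str:
    """Classify a frame by its 802.1Q tag (None means no tag at all)."""
    if vlan_tag is None:
        return "untagged"
    if vlan_tag in port.member_vlans:
        return "tagged+registered"
    return "tagged+unregistered"


def is_dropped(vlan_tag: Optional[int], port: IngressPort) -> bool:
    """Apply the two filters; registered frames are always accepted."""
    kind = classify(vlan_tag, port)
    if kind == "untagged":
        return port.filter_untagged
    if kind == "tagged+unregistered":
        return port.filter_unregistered
    return False


if __name__ == "__main__":
    # The second switch's port is a member of VLAN 10 only.
    port = IngressPort(member_vlans={10}, filter_unregistered=True)
    for tag in (10, 20, None):
        print(tag, classify(tag, port), "dropped" if is_dropped(tag, port) else "accepted")
```

With unregistered filtering enabled on a port that only carries VLAN 10, the VLAN 20 frame is discarded instead of being left for the switch to place somewhere unexpected.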
A few examples of network changes that might result in problems:
Michael McNamara posted an article back in 2007 discussing an issue with Avaya IP Phones where filter-unregistered-frame enable caused problems with IP Phone registration. The relevant excerpt is as follows:
The option (vlan ports 1-46 filter-unregistered-frames disable) was added after an issue was discovered when trying to upgrade the firmware on the IP phones. The filter-unregistered-frames is enabled by default and should be disabled to avoid and issues with upgrading the firmware on the IP phones. We are attempting to investigate further with Nortel and our voice vendor…
What to do with this information:
In my 3rd company blog article, I talked about DHCP and auto configuration via DHCP. There are fifteen different feature groups (IP Deskphones Fundamentals, NN43001-368, Appendix B: Provisioning the IP Phones) and 100+ different settings configurable via DHCP (I didn’t count them, I’m estimating). Some of the gotchas I’ve learned over the years are as follows:
Another thing that I find extremely helpful is learning the provisioning cycle and the status messages which display on the IP Phone during boot up:
|Provisioning Step||Display message|
|LLDP||Waiting for Cfg Data…|
|Manual Provisioning||Prov. 192.168.0.254 (or whatever IP you have); (system.prv failed may display); Attempting TFTP…|
|Registration (pre-connect)||Connecting to S1; Connecting to S2|
|Registration (post-connect)||Connect Svc; Node & TN|
This way, if you get stuck at a particular phase, you know where and can use that to determine your next troubleshooting step. For example;
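One way to keep the table handy is a small lookup that turns whatever is on the phone's display into the provisioning step it belongs to. The Python sketch below is illustrative only: the mapping simply mirrors the table above as written, and the function name is made up for this example.

```python
# Display fragments taken from the table above, mapped to their provisioning step.
DISPLAY_TO_STEP = {
    "Waiting for Cfg Data": "LLDP",
    "Prov.": "Manual Provisioning",
    "system.prv failed": "Manual Provisioning",
    "Attempting TFTP": "Manual Provisioning",
    "Connecting to S1": "Registration (pre-connect)",
    "Connecting to S2": "Registration (pre-connect)",
    "Connect Svc": "Registration (post-connect)",
    "Node & TN": "Registration (post-connect)",
}


def provisioning_step(display_text: str) -> str:
    """Return the provisioning step for a display message, or "unknown"."""
    text = display_text.strip().lower()
    for fragment, step in DISPLAY_TO_STEP.items():
        # Substring match, so trailing dots or an IP address do not matter.
        if fragment.lower() in text:
            return step
    return "unknown"


if __name__ == "__main__":
    print(provisioning_step("Connecting to S2"))
```

A phone stuck on "Connecting to S2", for instance, resolves to the pre-connect registration step, which tells you it got through provisioning before it stalled.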
Last week I talked about an overview of the registration process and the different phases for the Avaya IP Phone registration process. This week my contribution to the company blog covers the essential information needed to deploy any IP phone. Understanding IP Phone deployment, essential information needed for all deployment scenarios. Part two of a six part series.
I start with the assumption that you already know a little bit about the CS1000 architecture and programming– I tried to keep this article to 600-800 words, and even discounting some of the captures I put in the article it still came out to almost double the size. To do a really in depth review of this topic you need a lot more than just a six part series. But, I have plenty of other topics that need attention and while IP Phone deployment is an interesting topic (to me, because of the number of features available compared to those that are implemented in most sites) as a topic, it’s just one of many. Writing white papers is best left to people who have that as a job– amiright?
Unlike completely creative writing, the technical writing comes very easily. The problem with creative writing is always figuring out the answers that you don’t even really understand the questions to– if X happens, how does Y feel about it, how do they react, etc. When a writer gets stuck, often the problem is that either they’re not able to answer the question, or they may not even realize what question they need to answer. But, with technical writing, all you have to be is knowledgeable about your topic and have time to flip through the documentation.
Trimming the fat
But, I removed a lot of in depth detail that I started to slide into the original blog article and instead referenced the documentation. I figure I trimmed at least another 800 words from the article that gave a lot more detail on the use of the Nortel-i2004-A string, as well as a few other B-string mnemonics that I really like when I’m in charge of IP Phone deployment (or tasked with consulting with a customer.)
Trimming takes at least two passes.
Speaking of documentation, it was a challenge getting some of this history straight. I was around during the UNIStim 1.x days (we’re at 5.4 now), but the documentation doesn’t really talk about when features were implemented, changed or removed.
For instance, it wasn’t until 2.2-2.3 that the new Nortel-i2004-B string was introduced, but the latest UNIStim 5.4 documentation doesn’t say when it was introduced. This is important because if you’re on firmware prior to 2.3, you might not be able to support the B string format (I know this is unlikely– but I know of customers that are still on Meridian-1 software that was released before I got into Telecom in 1999.)
In addition to filling in the facts that I could validate through documentation, I also had to make sure that I hadn’t introduced any errors through typos or omissions. Being thorough takes a couple of passes.
Selecting images to fill out a blog is tough for me– I struggle with the artistic creativity required for selecting images that work with the blog post. And, when I cannot find what I’m looking for (and when a google search fails me), I whip something up in Photoshop. You might think that’s creative, but it doesn’t feel that way.
Then, it’s on to getting the text formatting right. Another couple of passes there. Headers are properly coded with the correct H1,H2,H3 tag, words and phrases get the correct bold/underline/italics emphasis. Colored text, when used, can convey information subtly. (e.g., On my company blog articles, I use a dark green bolded text format to represent terms which are universal and not specific to the subject of my blog.)
Then I submit the blog post to the company approval process and if it gets approved, we schedule it for posting.
And thus, Part 2, IP Phone deployment, essential information goes live.
The full series will be:
The IP Phone Registration process articles are some of the most visited articles on my personal blog. So when my employer began a new initiative asking employees to contribute content to the business blog, I proposed and was greenlit to rewrite and publish my IP Phone Registration overview as a six part blog series.
There will be some rehash of information previously posted here, but there will also be parts that haven’t yet been covered.
The full series will be:
Next week, I anticipate getting part 2 approved and published.
Now, for the “Call to Action”– I get special recognition at my job if my articles happen to be the most viewed articles over the next 6 months (compared to any of my co-workers’ articles.) So, I know this topic is popular and I’m hoping you’ll be tempted by more information on this topic.
Go ahead– do us both a favor and click through to the first article Understanding IP Phone deployment, registration process overview.
If you’d like to influence future articles, register to stay informed and ask your questions either via the registration process or via the Comments section.
Ever have a massive block of D-Channel (DCH) traces that you need to parse into something meaningful? I do. In fact, for a couple of customers it’s become a regular occurrence the last few months. So, in an effort to make life easier for myself I wrote a nifty little parse utility then converted it to a PHP/HTML page.
My web hosting company has some hard limits– like you can’t use more than 30 seconds of CPU time without a script being killed… so super-long parse logs are out of the question. (They’ll simply get killed by the web host.)
Also, to ensure that it doesn’t get hammered, I wrote a lock into the code so that only one person can access it at a time. There isn’t yet an ability to IP ban, but that will get added, so that if anyone hammers this I can block them from using it.
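For anyone curious what a one-at-a-time lock looks like, here is a minimal sketch of the idea in Python using an advisory file lock. It is not the code behind the PHP page: the lock file path, the exception name and the function name are all assumptions for this example, and `fcntl` is only available on Unix-like hosts.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock file location, chosen only for this example.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "dch_parser.lock")


class ParserBusy(Exception):
    """Raised when someone else already holds the lock."""


@contextmanager
def single_user_lock(path: str = LOCK_PATH):
    """Hold an exclusive, non-blocking lock for the duration of the block."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting in line.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        raise ParserBusy("another parse is already running") from None
    try:
        yield
    finally:
        fcntl.flock(handle, fcntl.LOCK_UN)
        handle.close()


if __name__ == "__main__":
    with single_user_lock():
        print("parsing... nobody else can start until this block exits")
```

Failing fast rather than queueing suits a host with a hard CPU limit: a second visitor gets a "busy, try again" message instead of a request that sits waiting until the host kills it.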
So, a few etiquette requests: