As covered in a previous blog post, you are going to need to become familiar with troubleshooting in general, and specifically what we wish to cover today a little more in depth is using UCM traces to troubleshoot a problem.
Say you have an issue going on – such as maybe you login to a Proctorlabs.com Voice vRack session that you had booked, you hit the “Load Lab Configs” button on the WebUI and proceed to load up the latest lab you wish to work on, say IPexpert Volume 1 Workbook >>> Lab 5A >>> INITIAL, and after the routers/switch and finally UCM Pub and Sub all restore their configuration and reboot, and you go to login to the UCM WebUI and do a “Find” for all Phone Devices, and you find that some of the hardware IP Phones that are physically located at Proctor Labs data center attached to your physical switches are showing up as “Rejected” in trying to register – how do you fix that issue?
Well at first it might seem easy enough to simply dive down into the phone device and change the MAC address and do a reset on the phone, and maybe that fixes it 90% of the time and you can go on with your lab. But what about the 10% of the time that it doesn’t fix it? How do we know what is causing the device to be rejected in its attempt to register, as well, how do we fix it?
If we want to know what UCM is “thinking” at any given time, we of course turn to the beloved traces. So of course to enable the traces we need to navigate away from the CCMAdmin UI over to the Unified Serviceability UI, and go to Trace >>> Configuration >>> choose a Server (makes no difference which if we later check “Apply to All Nodes”) >>> choose CM Services >>> choose Cisco CallManager >>> click Apply to All Nodes >>> choose the Debug Trace Level of Detailed >>> and finally uncheck all but what you are interested in troubleshooting on this go-round with the traces – in this particular case – All IP Phone Devices.
Remember to SSH into the CUCM server that is being used as the primary CPE for the device we are looking for insight into since, of course, this will be the server where data where primarily be logged to in regards to traces. So in our case the CUCM-Sub is the server where our phones are trying to register and being rejected – so we will SSH into that server.
Marks-MacBook-Pro-17:~ mxs$ ssh -l admin 10.10.210.11 admin@10.10.210.11's password: Last login: Fri Mar 13 13:17:40 2009 from 10.10.0.7 admin:
We of course recall how to see which files are the newest and therefore being actively written to.
admin:file list activelog /cm/trace/ccm/sdi/ ccm00000001.txt ccm00000002.txt ccm00000003.txt ccm00000004.txt ccm~num.bin dir count = 0, file count = 4 admin:
Now we could very well use the ‘file view’ command as we discussed in a previous post and simply assume the phone has been trying to register to the UCM for a certain period of time, then moving to the ‘e’nd of the file and then simply looking at one or two ‘p’revious screens should be enough to find the misbehaving phone (or mis-configured UCM as the case typically is). Here we will opt to use the ‘file tail’ command to view the real-time feed of logging data into file 4, and then to trigger the registration to occur immediately, we will log into the BR1-RTR device and shut down the ESW switchport that BR1-Ph1 is attached to, and then bring it right back up again. Below we have excerpted the poignant information so as not to completely overwhelm you (or the blog post) with information. Apologize in advance for the overflow of characters to the right of the screen –obviously traces aren’t the most blog-friendly.
admin:file tail activelog /cm/trace/ccm/sdi/ccm00000004.txt 12/30/2008 00:12:55.856 CCM|//SIP/SIPUdp/wait_UdpDataInd: Incoming SIP UDP message size 597 from 10.10.201.254:[50381]: REGISTER sip:10.10.210.11 SIP/2.0 Via: SIP/2.0/UDP 10.10.201.254:5060;branch=z9hG4bK25dccd08 From: <sip:AUTO-REG@10.10.210.11>;tag=001bd4c6c26e0005488e0805-2c0b2acb To: <sip:AUTO-REG@10.10.210.11> Call-ID: 001bd4c6-c26e000c-3cddcc58-21324f66@10.10.201.254 Max-Forwards: 70 CSeq: 105 REGISTER User-Agent: Cisco-CP7960G/8.0 Contact: <sip:AUTO-REG@10.10.201.254:5060;transport=udp>;+sip.instance="<urn:uuid:00000000-0000-0000-0000-001bd4c6c26e>";+u.sip!model.ccm.cisco.com="7" Supported: replaces,join,norefersub,X-cisco-callinfo,X-cisco-service-control Content-Length: 0 Expires: 3600 |<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190787><IP::10.10.201.254><DEV::><LVL::State Transition><MASK::20000> 12/30/2008 00:12:55.856 CCM|EnvProcessUdpPort - EnvProcessUdpHandler::fireSignal() varId = 1|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::><LVL::Arbitrary><MASK::0800> 12/30/2008 00:12:55.856 CCM|EnvProcessUdpPort - EnvProcessUdpHandler::send(buff, 324, 10.10.201.254:5060)|<CLID::StandAloneCluster><NID::10.10.210.11><LVL::Arbitrary><MASK::0800> 12/30/2008 00:12:55.856 CCM|//SIP/SIPHandler/ccbId=109391/scbId=0/findDevicePID: Routed to SIPStationInit|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::><LVL::Detailed><MASK::20000> 12/30/2008 00:12:55.857 CCM|SIPStationInit: Cisco Phone [Model=7] from Contact|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::><LVL::Detailed><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationInit: connID=0, SEP001BD4C6C26E, 10.10.201.254:5060, Removed token queue entry. Token queue size is 0|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::><LVL::Significant><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationInit: connID=0, SEP001BD4C6C26E, 10.10.201.254:5060, TOKEN REGISTRATION ACCEPTED, regCount=0|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::><LVL::Detailed><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationInit: connID=0, SEP001BD4C6C26E, 10.10.201.254:5060, Token Registration Accepted|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Detailed><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationInit: connID=0, SEP001BD4C6C26E, 10.10.201.254:5060, DevStat-InitState: TOKEN_GRANTED --> REGISTERED|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Detailed><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationInit: RegQueueSize=0 MaxEntriesToProcess=40 tokeMax=10 tokeOut=0 Queue Stats|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Detailed><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), , UNKNOWN:0, primaryDN=UNKNOWN, Primary expires 120, secondary expires 3600|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,45,58933.1><IP::><DEV::><LVL::Significant><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), , 10.10.201.254:5060, primaryDN=AUTO-REG, userStart chars @ 0 .|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Arbitrary><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), , 10.10.201.254:5060, primaryDN=AUTO-REG, setUaTypeAndCepn: uaType is CISCO_PHONE (model is 05/12/40/60)|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Arbitrary><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), , 10.10.201.254:5060, primaryDN=AUTO-REG, wait_register_SIPRegisterInd: model=7, mUaType=2|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Significant><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), , 10.10.201.254:5060, primaryDN=AUTO-REG, Register instanceId (MAC addr) =001BD4C6C26E|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Significant><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), , 10.10.201.254:5060, primaryDN=AUTO-REG, Register deviceName (database key) =SEP001BD4C6C26E|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Significant><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), SEP001BD4C6C26E, 10.10.201.254:5060, primaryDN=AUTO-REG, DevStat-Start : transport UDP, model 7|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Special><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), SEP001BD4C6C26E, 10.10.201.254:5060, primaryDN=AUTO-REG, parseSupportedHeader: X-cisco-callinfo=T X-cisco-serviceuri=F X-cisco-escapecodes=F X-cisco-service-control=T X-cisco-srtp-fallback=F X-cisco-monrec=F X-cisco-xsi=F xsi-version=0.0.0 X-cisco-sis=F sis-version=0.0.0 extended-refer=F norefersub=T join=T co|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Detailed><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), SEP001BD4C6C26E, 10.10.201.254:5060, primaryDN=AUTO-REG, setOptionsIndicationDefaultOptions: INFO - user agent 2|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Detailed><MASK::0020> 12/30/2008 00:12:55.857 CCM|SIPStationD(2,100,45,58933), SEP001BD4C6C26E, 10.10.201.254:5060, primaryDN=AUTO-REG, setOptionsIndicationDefaultOptions: ERROR - Unexpected user agent 2|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Error><MASK::0020> 12/30/2008 00:12:55.858 CCM|CcmdbStationRegistrationProfileBuilder::getBasicRegistrationProfile::init() - failed rc(2)|<CLID::StandAloneCluster><NID::10.10.210.11><LVL::Error><MASK::ffffff> 12/30/2008 00:12:55.858 CCM|Auto Registration Stored proc - execute procedure DeviceAutoReg_cmsp('SEP001BD4C6C26E',7,11,'90ec5d97-e4ee-4538-84dd-5d8f423cf1b2')|<CLID::StandAloneCluster><NID::10.10.210.11><LVL::Detailed><MASK::ffffff> 12/30/2008 00:12:55.859 CCM|EnvProcessUdpPort - EnvProcessUdpHandler::fireSignal() varId = 1|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190787><IP::10.10.201.254><DEV::><LVL::Arbitrary><MASK::0800> 12/30/2008 00:12:55.859 CCM|EnvProcessUdpPort - EnvProcessUdpHandler::send(buff, 339, 10.10.201.254:5060)|<CLID::StandAloneCluster><NID::10.10.210.11><LVL::Arbitrary><MASK::0800> 12/30/2008 00:12:55.864 CCM|AddDevice returns Error: Autoregistration disabled|<CLID::StandAloneCluster><NID::10.10.210.11><LVL::Detailed><MASK::ffffff> 12/30/2008 00:12:55.868 CCM|ProcessDb - ERROR initializing_DbStationAutoRegisterReq - addDevice failed|<CLID::StandAloneCluster><NID::10.10.210.11><CT::2,100,186,1.190786><IP::10.10.201.254><DEV::SEP001BD4C6C26E><LVL::Error><MASK::0800> 12/30/2008 00:12:55.868 CCM|TspError - Error in TSP. Port IsoEthPort:0 Port DSL:0 Name of Device:SEP001BD4C6C26E App ID:Cisco CallManager Cluster ID:StandAloneCluster Node ID:UCMSub|<CLID::StandAloneCluster><NID::10.10.210.11><CT::AlarmSEP001BD4C6C26E><DEV::SEP001BD4C6C26E><LVL::All><MASK::ffff>
Here we see that the SIP phone at BR1 is sending us a REGISTER message. One thing to note about SIP if you don’t know this already is that per the SIP RFC, SIP phone devices must send the SIP Registrar Server they wish to register with, both their Line ID (DN) and a hashed password for that line. That is to say that they must authenticate every line on their device in order to use said line. This actually adds great flexibility comparing them to a SCCP phone device, since unlike a SCCP phone, a SIP phone can actually register different lines on the same physical device to different servers or clusters of servers. Anyhow continuing to look down through the trace we find that the SIP Phone is trying to register its primary line with the DN=AUTO-REG. It is in a sense saying that it doesn’t have a DN assigned to its first line, and is therefore trying to auto-register with the UCM. UCM in return (after letting the SIP phone give it a LOT of information about itself) is saying that “Autoregistration is disabled” and therefore the UCM action of “addDevice” to the UCM DB ultimately fails.
Now we could turn on auto-registration on the UCM servers and look at the traces again. I will save you from the output of such a task, and instead walk you through what happens. If we did do that, we would first need to remember to not only turn on auto-registration, but also remember to check the Enterprise Parameters to ensure that SIP auto-registration (which allows SCCP auto-reg as well) was the preferred type. Even if we did all of that, what we would see in the traces was that while auto-reg was supported and working, there would be a DB Error in trying to add the device to the Informix DB. The reason for this is simple – there is already a device in the DB with that MAC address. Then “why wouldn’t it register with the device already in the DB?” you might ask yourself. “Was there a phone type mismatch?” No – the phone type is correct, as well as the protocol, SIP, is correct. So why then? This gets into a nuance that differs greatly between SIP and SCCP phones – and it is right back to what we mentioned in the last paragraph. A SIP phone tries (and must authenticate) its primary DN against one that already exists in the UCM DB associated to the same device type and MAC address. In this case the SIP phone is telling the UCM its primary DN is AUTO-REG. The UCM however has that phone type and MAC in the DB with the primary line as 1001, and not a blank DN (auto-reg is looking for no configured DN) and therefore won’t allow the phone to register. We can do one of two things to easily fix this. One option is to delete the SIP device with that particular MAC address and allow it to re-register, and then put all of the settings back in that were already set on that previous device, however that assumes that we know (or remember) what all the settings were. The other (and better) option would be to simply dive into the configuration for the existing SIP device, change the MAC address to something else (all F’s would do just fine), allow the SIP phone to create a new device in the UCM Informix DB, and then with two web browser pages open (one on the old device with all Fs and the other on the new device newly created), copy the MAC address from the new device to the old device web page, (also updating the line on the old to whatever the value on the new is) and then quickly delete the new phone and update the old phone. Of course if you weren’t fast enough you could always perform the above with the “Cisco CallManager” service stopped on all servers in the cluster.
By the way, one more note when allowing SIP phones to auto-register. They will show up in the CCMAdmin WebUI quickly, but will show up initially as “Rejected” even though they are brand new auto-registering. Give them some time – they will work things out with the UCM. Then perform the above steps to get them into the proper state that you want them in.
Obviously from the above, knowing what to look for in the massive amount of trace information returned obviously has it rewards, and therefore going to the traces more often during your self-study when issues arise will pay dividends in the real lab for the need to quickly find something should just one issue arise.
Cheers,
Mark




