Adventures in upgrading to System Release 10.5

I recently took some time to help a customer upgrade their system from CSR (Cisco System Release) 9.1 to 10.5.  As with any good upgrade, it wasn’t without some significant drama.

[In particular, they were upgrading from CUCM 9.1(2) to CUCM 10.5(2); CUCM IM&P 9.1 to 10.5(2); CUC 9.1(2) to 10.5(2); UCCX 9.0(2) to 10.6(1); and Expressway X8.1 to X8.5.]

Expressway-C and Expressway-E were textbook upgrades using the .gz upgrade files.

Unity Connection (aka CUC) was the first core component that we chose to bite off because it doesn’t have any version dependencies with the other components.  It was a textbook upgrade without issues.

The one minor snag was CUC going unlicensed immediately, because it was pointing to ELM on CUCM 9.1 and CUCM didn’t have 10.x licenses installed.  So watch out for that.  It was rather odd that CUC didn’t give us a 60-day grace period.  I moved it to its own PLM with appropriate licensing installed there.

We next chose to upgrade UCCX as 10.6(1) is compatible with CUCM 9.1(2). [http://docwiki.cisco.com/wiki/Unified_CCX_Software_Compatibility_Matrix_for_10.6%281%29]

This is a refresh upgrade, so you must install a refresh upgrade COP file on UCCX before installing 10.6(1).  Because it is a refresh upgrade, the system upgrades the underlying VoS (RHEL-based Voice Operating System) and CCX is down while the system reboots and upgrades the OS.  I chose to have CCX stay on 9.0(2) after the upgrade, so that 10.6(1) is the inactive version and just needs a switch-version reboot to go live.

The switch-version reboot turned into a bit of a mess.  I issued it, but the server didn’t seem to actually do anything.  I issued the CLI command utils system reboot about 2 minutes later, and it screamed back that I shouldn’t do that because the system was in a version switch and the database could be corrupted.  I let it sit for about 30 minutes and tried again.  This time it rebooted without complaining and came up on 10.6(1).

I had to run the typical process of updating the CAD client (this customer will move to Finesse in the next phase of the upgrade), using the Client Configuration tools you download and install from CCX.

Next was the CUCM publisher.  This ended up being a multi-hour affair.  I’d already heard about the very common problem of the Common partition being full, so I’d taken the liberty of using RTMT to clear out old log files.  You basically go to Trace and Log Central in RTMT, select Collect Files, choose a time period (I chose the previous 5 months), leave the files unzipped (to save time) and, most importantly, check the box to delete the files from the server.

I had the Common partition down to 50% utilized before the upgrade.

It failed, citing the good old bug ID CSCuc63312:

There is not enough disk space in the common partition to perform the
upgrade. For steps to resolve this condition please refer to the Cisco
Unified Communications Manager 9.1(1) Release Notes or view defect CSCuc63312
in Bug Toolkit on cisco.com.

So, second-guessing myself, I decided to use the sledgehammer known as the ciscocm.free_common_space_v1.1.cop.sgn COP file to clean out the Common partition (this COP file’s script nukes the currently installed inactive version).  I gave that a run and rebooted the server.

The next attempt at install also failed with the same CSCuc63312 error!

Knowing that the Common partition wasn’t the issue (which you can see in the install logs that you can copy and paste from the GUI), I started doing some digging around.  It turns out that if the main partition doesn’t have enough room (which you can see in the log files by searching for the word “needed”), it will throw the CSCuc63312 error erroneously.
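The log search described above can be sketched in a few lines of Python, assuming the install log text has been copied out of the GUI and saved locally.  The function name and the sample lines are made up for illustration; real install log wording will differ.

```python
def find_space_lines(log_text, keyword="needed"):
    """Return (line_number, line) pairs whose text contains the keyword.

    The match is case-insensitive. The default keyword "needed" is the
    word the post suggests searching the install log for.
    """
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if keyword.lower() in line.lower():
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical sample text, NOT real CUCM install log output.
    sample = "\n".join([
        "checking partition sizes",
        "example: space needed exceeds space available",
        "upgrade aborted",
    ])
    for number, line in find_space_lines(sample):
        print(f"{number}: {line}")
```

Reading the lines around each hit should show which partition the installer is actually complaining about, whatever the on-screen error says.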

I found a couple of TAC cases where the next step to free up room on the main partition was to clean up the TFTP directory.  This customer’s TFTP directory was over 5GB in size from multiple versions of large firmware for endpoints like the DX650 and Telepresence codec firmware for endpoints like the C40, SX20, SX10, etc.

Even after cleaning up the TFTP folder I was still hit with the same stupid error message (not enough space in the Common partition), even though I knew Common had plenty of room and I was really fighting a main partition space issue.

Some enlightenment

I remembered that from CUCM 10.0 on, the OVA template had increased the disk size from 80GB to 110GB.  It turns out that if you increase the size of the VM’s disk in 10.0 or greater, CUCM will automatically see the space and take advantage of it.  CUCM 9.x doesn’t do this automatically.

The special sauce to get 9.1 to expand the disk is to install the ciscocm.vmware-disk-size-reallocation-1.0.cop.sgn COP file, shut the VM down, resize the VM HDD in ESXi to 110GB and then boot it back up.  CUCM will reboot a couple of times during the process and then come all the way up.

The detailed instructions are here: http://www.cisco.com/web/software/282204704/18582/ciscocm.vmware_disk_size_reallocation_v1.0.pdf

After doing this, CUCM 10.5(2) was able to install successfully.  Keep in mind that this is a refresh upgrade, so the system will be down for an hour or so while VoS is updated, and the server will go through a couple of reboots.

CUCM IM&P (CUP)

I initially tried to install the upgrade ISO and received an invalid checksum error.  Thinking it was the version of IM&P I’d downloaded from CCO (it’s been updated twice in the past week), I re-downloaded it and hit the same error.  Note that the current version of CUP is 10.5(2a) as of this writing.  If I’d paid attention to the documentation I’d have realized that you need to install the ciscocm.version3-keys.cop.sgn COP file, which has the new keys that the 10.5(x) software images are signed with.  After installing this, the upgrade would recognize the ISO and upgrade.

25 thoughts on “Adventures in upgrading to System Release 10.5”

  1. As always

    Great and informative posts…I just sent to my engineers so they can start labbing up to prepare.

    Can I ask if the DX was now able to do MRA after the Expressway upgrade?

    Arvind Gooding CTO (M)868-708-5271 (O) 868-612-4428 ext 1600 (F) 868-223-5222 http://www.undsl.com

  2. I did the upgrade from 9.1 to 10.5 a few months ago. I ran into similar space issues with the CUCM upgrade, which I fixed by temporarily lowering the watermark to below 50% in RTMT. Another thing that was required was changing the VM network adapter type from the standard type to VMXNET3, which is done by editing the vmx file. This is a very important step.
    Just make sure to keep backups of your old VMs just in case 🙂

  3. Great informative post, this kind of knowledge sharing is priceless, especially when you’re staring down the barrel of a closing maintenance window and flying by the seat of your pants.

  4. We went through a very similar process a few months back when we upgraded to 10.5 and got hit by pretty much the same problems as you!

    Slightly off-topic question (can’t find a way to contact you directly?) – UCCX 10.6 does it support SSO for either of the Agents?

    • Let me look into CCX SSO and get back to you. My system is synced to CUCM so the user and password are AD/LDAP synced, but I don’t see any mention of real SSO using an IdP.

  5. Many thanks for the enlightening post.

    Just an additional question, was it also necessary for you to upgrade IP phone firmware and voice gateway router IOS peered with CUCM 10.5?

    • Firmware is automatically updated by CUCM once the phone registers to 10.5. You could pre-upgrade the phones if you wanted by uploading the appropriate devpack or individual firmware to the pre-upgrade version of CUCM. In my case I already had the 8945s and DX650s on the latest code for bugfixes, so they didn’t need an upgrade after 10.5 was live. However, none of the 79xx models were on the latest, and they automatically upgraded after they registered to 10.5.

      My gateways were on 15.2.5?.M6 and worked fine before and after.

      I did however upgrade them to the current 15.3.3.M2(?) this week when fighting an MGCP registration issue. Turns out there is a stupid cosmetic bug in 9 and 10 where MGCP PRIs may show unregistered on CUCM, but actually be registered and fully functional. I wasted hours this week on that!

      • Another followup question Mike.

        Is it necessary to order new licenses for the CUCM 10.x or is there some process to make the existing 9.x licenses on ELM work on an upgraded CUCM 10.x?

      • You will need to get 10.x licenses. ELM will work (rebranded to Prime License Manager) and will see the 9.x licenses, but won’t assign them out to devices/users on 10.x systems.

  6. Mike,

    I am upgrading my lab prior to production. I am all the way to the IM&P servers that are running 9.1.1 SU2, and when I try to upgrade them, I get a version mismatch when installing the ISO. I did upgrade the CUCM servers first to version 10.5.2 and those are running fine. Did I mess up a step? Should I have left the CUCM servers running version 9.1.2 SU1 until I was able to get the upgraded cop file installed on the IMP servers? Doesn’t make sense….I cannot seem to get an install to work for me. I did install the keys cop file you mentioned.

  7. Corey, not sure if you solved your issue or not, but I ran into the same issue last night when performing our upgrade. The fix was to tick the box to “do a switch version after upgrade” at the installation screen. That got around the version mismatch error. Checking “do not switch version” (which is the default) for some reason errors out the upgrade. It probably has something to do with IMP and CUCM needing to be on the same base version.

  8. Mike, Great post. I’m doing a UCM 8.62 to 10.52 upgrade later this week. I cloned my Pub and migrated it to a lab box then tried to upgrade it. Hit the same issue you described. My question is when you install the “ciscocm.free_common_space_v1.1.cop.sgn” and resize your disks do you do that only on the pub or on the subscribers as well?

    • Hi Jeff,

      Yes, you’ll need to resize the partitions on the Pub and all Subs. The freespace COP file may or may not need to be run on all servers. You can try the upgrade of the pub first, and if it fails with not enough common partition space you’ll know you need to run it. It just depends on the common partition space situation on the pub/subs. I only had to run it on my pub since it was full of logs and the subs weren’t.

  9. This was very insightful. I am beginning the process of scoping this out for our upgrade of CM, CUC, UCCX and ER. Our Presence is hosted with our corporate entity for our Global Jabber deployment. I just trunk up to them and then enter our CM and CUC DNS names in the Jabber client. Therefore I think I am safe since IM and Presence are not integrated with my UC clusters when I upgrade our UC platform. Am I missing anything on that topic?

    I also found some very good decks from Cisco Live on best practices for migrating/upgrading previous versions to version 10.5. I was glad I read them and found this.

    I also do not have a lab. However, since I am already virtualized on UCS with VMware, I might be able to get the network team to slice out a VLAN for a lab. However, I would not easily be able to clone or back up and restore in a lab due to clustering and IP addressing, correct?

    Thanks,
    Tom

    • Changing IP addresses is typically catastrophic. 😦

      You know what I’d do… I’d have the network guys set up an isolated non-routed VLAN. Make sure that isolated VLAN is plumbed to your ESXi server. You could clone copies of the PUB, IMP, CUC, etc. and change the VM properties to put the virtual NIC in this isolated VLAN. You would just start them up and they’d run with the exact same IP addresses as your production system; they’d just be on their own VLAN that couldn’t talk to anything other than devices on this isolated VLAN. You’d then put your management workstation on this VLAN, and any phones you wanted to test as well. That way you could do test upgrades, etc. and not affect your production system. It should be pretty easy to do, actually.

      Likewise if they were paranoid about doing this, you could get any Ethernet switch (e.g. Catalyst 3560, 3750, 2960, etc.), an ESXi host, your workstation and phones and even physically isolate everything on a totally separate network.

      • Thanks Mike.
        Yes, from a high level we are going to clone off the current CUCM VMs to our lab. We will simply isolate the virtual interface and give it no gateway. Give it its own switch, a small VM for DHCP, a couple of phones and then upgrade and clone back, shut down the current 9.x, bring up 10.x and test.

        Then I will order the upgrade for CUC, CM, UCCX, ER. Our Presence servers are part of GE’s global initiatives. Therefore, I just trunk back to them for Presence for Jabber.

        Thanks,
        Tom

  10. I just did a 2 node BE6K this weekend with CCX from 9 to 10. Didn’t have disk space issues. (30 hours of work – not including pre-work) at least skeleton users on site, noticed no outages as we failed back and forward between nodes.
    V3 keys was an issue on CM, so I just put it on all servers. No disk space issues.
    CCX refresh upgrade COP requires a minimum of 9.0.2SU1 and we were on 9.0.2 – had to patch (su) to patch(cop) to upgrade!
    CM sub upgrade kept failing (MD5 from memory), TAC said to roll back pub to 9 and then upgrade worked (swear that historically you’d switch version and not an issue)
    IM/P file from e-delivery had the word bootable_ at front of filename, needed to delete that for it to recognise as valid.
    IM/P 9 have to upgrade pub first and then sub (10 should be concurrent), first upgrade failed as switch version on CM had been done but I hadn’t selected for IM/P. Either set upgrade of IM/P to switch version to 10 or roll back CM.
    8945 handsets locked up with “Upgrading” message, even settings button wouldn’t work. Required intermediate firmware with new keys, and then came good.

  11. doing cucm, elm and cuc from 9.1 to 10.5 this weekend. uccx will have a minor upgrade. in addition to what you wrote, i will add a few more steps like changing the network adapter type. and red hat version. device defaults will not be changed.

  12. Mike,
    Thanks for the article. I’m planning an upgrade from 9.1 to 10.5. I’ve already installed the updates ciscocm.version3-keys.cop.sgn, ciscocm.free_common_space_v1.3.k3.cop.sgn, ciscocm.vmware-disk-size-reallocation-1.0.cop.sgn, and increased the vm disk from 80 to 110 GB. From what I understand, I also need to get a new license in advance. How do I go about getting a new license? Sorry if that’s a dumb question, but the extent of my CUCM knowledge is how to add/update phones.
    Thanks

    • Hi Steve,

      You’ll definitely need new 10.x licenses to install on the Prime License Manager (aka PLM or ELM) that your CUCM, UCXN, and CUP (Jabber) servers point to.

      You get these licenses by having a valid SWSS or UCSS/ESW contract with Cisco. You have the contract associated with your CCO userid and use the Product Upgrade Tool (PUT) to request the 10.x media. Then preemptively open a case with TAC to request the new licenses.

  13. kindly advise the issue below.
    We’ve CUCM and CUC with version 9.1(2) installed two years ago for 100 standard UWL licenses on BE6K, we wanted to upgrade them to 10.5 and purchased the licenses below. pls confirm if we’ve the correct licenses for the upgrade.
    L-CUWL-MISC Unified Workplace Licensing – Top Level – Misc Addon Only Pcs 1
    L-REINST-UWL-STD Reinstate CUWL Standard – 1 user Pcs 100
    L-REIN-UWL-STD-RTU Reinstate CUWL Standard RTU Pcs 1
    L-LIC-UWL-STD1 Services Mapping SKU, Under 1K UWL STD users Pcs 100

  14. Thanks Mike! Had to resize vDisk on CUCM and ran across your post. Extremely helpful. Do you have a Cisco document with this information? I did this in the lab and everything worked. Just curious where you found this info.

    I remembered that from CUCM 10.0 on, the OVA template had increased the disk size from 80GB to 110GB. It turns out that if you increase the size of the VM’s disk in 10.0 or greater, CUCM will automatically see the space and take advantage of it. CUCM 9.x doesn’t do this automatically.
