Our Exchange Server 2010 VM has RDMs for each database. Overnight the volumes filled up after Commvault snapshots of the database volumes, which took the volumes offline and stopped the VM.
Reading online, it seems this problem can also occur when users try to vMotion a VM, but that has not happened in my case.
The VM was rebooted by the Exchange admin, but during start-up the VM gave the following error message:
Virtual disk ‘Hard disk x’ is a mapped direct-access LUN that is not accessible
In the vSphere Client, when you select the hard disk and choose Manage Paths, you get the error “there is no multipath configuration for this LUN”.
Identify whether the Raw Device Mapping (RDM) LUN signature vml.############ assigned to the VM no longer matches the physical LUN presented to the ESXi host.
- On the VM, record the physical LUN mapping (vml.############).
- On the ESXi host where the VM resides, run the command below to find the device name (naa.############) for the VM’s vml identifier. The “Other UIDs:” field displays the vml identifier.
- On each ESXi host in the cluster, verify that the device name has the same vml entry as the one associated with the virtual machine’s hard disk.
- ls -alh /vmfs/devices/disks
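The lookup above can be sketched as follows (a sketch for the ESXi 5.x shell; run it on each host and compare against the vml identifier recorded from the VM):

```shell
# List disk devices: the vml.* entries are symlinks to their naa.* device names
ls -alh /vmfs/devices/disks

# List all SCSI devices with their UIDs; the "Other UIDs:" field of each
# naa.* device shows the vml identifier to compare against the VM's RDM
esxcfg-scsidevs -l
```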
This was my first attempt, done without consulting online resources; I did not want to remove the mapped raw LUN from the VM, so to fix this I did the following:
- I disabled the problem LUN on the back-end storage (NetApp in our case) and then re-scanned the storage on each host so that the device no longer showed up.
- I enabled the problem LUN again and rescanned the hosts.
- Made sure the device was showing again.
- Verified that the datastore mapping file (vml.######) on the VM was identical to the one on each ESXi host in the cluster for the problem hard disk.
- Started up the VM, and the problem was resolved.
If this solution does not work for you, I would recommend solution 2, which comes pretty much straight from VMware KB 1016210.
Identify the problem hard drive and record the following settings on the virtual machine:
- Mapped raw LUN’s physical LUN and datastore mapping file location (vml.############)
- SCSI ID
- Associated LUN ID
- To get this information, run a few commands on the ESXi host where the VM currently resides:
- ls -alh /vmfs/devices/disks (gives the device identifier for the RDM’s vml address)
- esxcli storage core path list (shows the device identifier with LUN information)
- Remove the RDM LUN.
- Map the RDM LUN to the VM again. In our case this is done by the SnapDrive agent on the server.
- Power on the VM.
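The information-gathering step can be sketched like this (ESXi 5.x shell; the vml.xxxx and naa.xxxx values below are placeholders for the identifiers recorded from the VM):

```shell
# The symlink target reveals the naa.* device name behind the RDM pointer
ls -alh /vmfs/devices/disks/vml.xxxx

# Show the paths for that device; the "LUN:" field on each path is the LUN ID
esxcli storage core path list -d naa.xxxx
```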
5.1.2 Build 1068441 vmware-vcloud-director-5.1.2-1968441.bin
5.5.0 Build 1323688 vmware-vcloud-director-5.5.0-1323688.bin
I run vCloud Director on a RHEL virtual machine, so before starting the upgrade I fully patched the RHEL environment.
vCloud director 5.5 release notes:
Pre upgrade checklist for potential edge gateway problems:
Upgrading vCloud director documentation from VMware which I recommend reading:
- As always, first off start with BACKUPS
Copy the downloaded bin file to the vCloud Director server. I place the file in the /tmp folder.
Verify the MD5 checksum of the file against the value published on the download page.
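A small sketch of the verification step (the file name comes from this post; the expected hash is whatever the VMware download page lists, shown here as a placeholder):

```shell
# check_md5 FILE EXPECTED_HASH -- compare a file's md5sum to an expected value
check_md5() {
    actual=$(md5sum "$1" | awk '{print $1}')
    if [ "$actual" = "$2" ]; then
        echo "checksum OK"
    else
        echo "checksum MISMATCH: got $actual" >&2
        return 1
    fi
}

# usage (the hash below is a placeholder -- take the real one from the download page):
# check_md5 /tmp/vmware-vcloud-director-5.5.0-1323688.bin 0123456789abcdef0123456789abcdef
```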
Run chmod u+x on the file to make it executable.
Stop the vCloud Director services on the server.
Run the installation file by typing “./” followed by the file name.
Respond to the upgrade prompts.
Once completed, DO NOT start the vCloud services; first upgrade the vCloud database.
Respond to the upgrade prompts. I did receive an error here, which I explain at the bottom of this blog!
Once completed, the vCloud service can be started.
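The steps above can be sketched as a command sequence (a sketch assuming the default /opt/vmware/vcloud-director install path and the vmware-vcd service name; adjust for your environment):

```shell
# Stop the vCloud Director cell before upgrading
service vmware-vcd stop

# Make the installer executable and run it (it prompts interactively)
chmod u+x /tmp/vmware-vcloud-director-5.5.0-1323688.bin
/tmp/vmware-vcloud-director-5.5.0-1323688.bin

# Upgrade the vCloud database BEFORE starting the services again
/opt/vmware/vcloud-director/bin/upgrade

# Only now start the services
service vmware-vcd start
```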
From the vShield Manager Inventory panel, click Settings & Reports.
Click the Updates tab.
Click Upload Upgrade Bundle.
Click Browse and select the VMware-vShield-Manager-upgrade-bundle-maintenance-5.1.2-997359.tar.gz file.
Click Upload File.
Click Install to begin the upgrade process.
Click Confirm Install. The upgrade process reboots vShield Manager, so you might lose connectivity to the vShield Manager. None of the other vShield components are rebooted.
Verify that the maintenance update has been applied.
- Create snapshot of vCloud Director virtual machine.
- Stop the vCloud services.
- Back up the vCloud database. We have a SQL Server where the database resides.
- Create another third-party backup. In my case I used Commvault to take a snapshot backup of the VM as well.
After you have upgraded vShield manager, you must upgrade all vCenter servers and hosts before you upgrade the vShield Edge appliances that the upgraded vShield Manager manages.
To upgrade an ESXi host to 5.5 and upgrade the vCloud agent, perform the following steps in conjunction with vCloud Director:
- From vCloud Director, right-click the host and select “Disable Host”
- Right-click the same host and select “Redeploy All VMs”
- On vCenter Server, put the ESXi host into maintenance mode
- Attach the host upgrade and patch baselines to the ESXi server
- Remediate the host
- Once complete, from vCloud Director right-click the host and select “Upgrade Host Agent”
- Take the host out of maintenance mode and wait for the vSphere HA agent install to complete
- Within vCloud Director you can enable the host again
To upgrade the vShield Edges:
- Log in to vShield Manager
- Select the datacenter you want to upgrade
- Select the Network Virtualization tab
- Select Edges
- Select the edge and click on actions -> Upgrade.
- I did find that after the Edge upgrades, users were not able to get a connection through the vCloud Edge gateway. To resolve this I redeployed the Edge gateway with the following steps:
- Login to vCloud
- Select organization
- Select VDC
- Select Edge Gateway tab
- Right click edge gateway and select Re-Deploy.
- This recreates the Edge gateway but does not lose any settings configured on it.
Upgrade problems experienced:
During step 9 of database upgrade I received the following error message:
Error: Unable to update database statistics. Database user has insufficient privileges to update database statistics. To complete this step manually, run ‘EXEC sp_updatestats’ as the DBO or a member of the sysadmin fixed server role.
Fix: On the database server (SQL Server), give the vCloud user account the sysadmin server role, then run the command provided in the error against the database.
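If you prefer the command line, the fix can be sketched with sqlcmd (the server, login, and database names below are placeholders for your environment):

```shell
# Grant the vCloud database login the sysadmin server role
sqlcmd -S SQLSERVER01 -Q "EXEC sp_addsrvrolemember 'vcloud_user', 'sysadmin'"

# Re-run the statistics update that the upgrade could not complete
sqlcmd -S SQLSERVER01 -d vcloud_db -Q "EXEC sp_updatestats"
```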
Error: HTTP error 500 after the upgrade when opening the vCloud Director login page.
Fix: Add the text “login.jsp” to the end of the vCloud page URL so you can use a local login. Then disable SSO under Federation services within vCloud Director if you are not using it. In my case we make use of Windows authentication and not SSO.
VUM error during remediation of ESXi host upgrade from 5.1 to 5.5:
“vmware update manager 5.5 Cannot run upgrade script on host“
Debugging the problem:
Troubleshooting this problem turned up a few users online experiencing the same error message but with different log entries. This can be read in the KB articles below:
I tried the VMware rollup ESXi ISO as well as the Dell-provided ISO and had exactly the same problem.
After the upgrade fails, SSH to your ESXi host and look for the following entries in /var/log/vua.log:
2013-12-29T19:10:51.102Z [FFE898C0 info ‘VUA’] VUA exiting
2013-12-29T19:10:51.104Z [FFE898C0 error ‘Default’] Alert:WARNING: This application is not using QuickExit(). The exit code will be set to 0.@ bora/vim/lib/vmacore/main/service.cpp:147
--> backtrace rip 1a8272a3 Vmacore::System::Stacktrace::CaptureFullWork(unsigned int)
--> backtrace rip 1a64e6e9 Vmacore::System::SystemFactoryImpl::CreateBacktrace(Vmacore::Ref&)
--> backtrace rip 1a5d082f Vmacore::Service::Alert(char const*, char const*, int)
--> backtrace rip 1a60f0e8 Vmacore::Service::AppImpl::Init(Vmacore::Service::Config*)::do_quick_exit::invoke()
--> backtrace rip 1ae68ed9 /lib/libc.so.6(exit+0xe9) [0x1ae68ed9]
--> backtrace rip 1ae52f04 /lib/libc.so.6(__libc_start_main+0xe4) [0x1ae52f04]
--> backtrace rip 0804e5e1 /usr/share/vua/vua [0x804e5e1]
I opened a case with VMware to get this resolved, so if you experience the same problem I recommend contacting VMware support for further assistance. If you intend to use this fix, please do so at your own risk.
To fix this issue, manually remove the FDM agent on the host, reboot and retry the upgrade.
Note: Removing the host from a vSphere HA cluster also removes the agent. If the agent is not removed, you may have to manually remove the agent.
To manually remove the FDM agent from the host, run these commands:
cp /opt/vmware/uninstallers/VMware-fdm-uninstall.sh /tmp
chmod +x /tmp/VMware-fdm-uninstall.sh
/tmp/VMware-fdm-uninstall.sh
Reboot the host.
After the FDM agent is removed and the host has been rebooted, you can run the remediation for the host upgrade again.
vCenter server: 5.1.0 Build 1063329
ESXi hosts: 5.1.0 Build 1117900
ESXi software file: VMware-VMvisor-Installer-5.5.0-1331820.x86_64-Dell_Customized_A01 (we have Dell servers so using the latest Dell provided 5.5 installer)
VCenter server: VMware-VIMSetup-all-5.5.0-1476387-20131201 (5.5.0b)
There are a lot of blog posts and documentation on upgrading, so I am not going to bore you with screenshots and detailed explanations; instead, here is a short point summary of my upgrade process:
Firstly I would recommend reading the VMware best practices kb as well as the installation/upgrade guide for VMware:
- As always, first off start with BACKUPS:
Check COMPATIBILITY for vCenter plug-ins on new versions:
- Create snapshot/backup of vCenter Server virtual machine.
- If available, I recommend a third-party backup/snapshot of vCenter; for instance, I used Commvault to take and store a snapshot. If the vCenter upgrade fails it is not easy to revert/recover using vSphere snapshots, since you cannot start vCenter Server.
- Create SSO backup by selecting the following on server where SSO is installed “Programs -> VMware -> Generate vCenter Single Sign-On backup bundle”
- Create backups of databases for VCenter Server (VCDB), SSO (VCSSO) and Update Manager (VCUPDATE).
SSO – there seem to be a lot of users having issues with the SSO upgrade, with a few pitfalls.
- Verify that all third party plugins are compatible with the new version of VMware.
- Warning 25000: Verify that your SSO certificate is not expired; if it is, renew it before the upgrade. http://www.boche.net/blog/index.php/2013/11/12/single-sign-on-warning-25000/ (a very well-written article on this warning and how to work through it). I am still using my originally created self-signed certificates and everything went fine.
- I had to change a registry key from an IP address to the FQDN before installing; otherwise you get a notification during the installation. Look at my other blog post with images of the upgrade to see this error message. KB 2060511 – change the SSO registry value “HKEY_LOCAL_MACHINE\Software\VMware, Inc.\VMware Infrastructure\SSOServer\FQDNIP” to the FQDN instead of an IP address. http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=206051
Copy the vCenter Server install ISO to the server and attach it with a virtual ISO application such as Daemon Tools. I do this because if the server is rebooted while the ISO is attached through the vSphere Client or Web Client, it will lose the connection.
Simple or custom component install?
My two cents: I am not a big fan of the simple install because I don’t know exactly what is happening during the upgrade process, and if a problem appears it is difficult to figure out which component failed, whereas with the custom install you know which component you are installing. Also, if you have components installed on multiple servers, you have to use the custom install.
We are going to perform below steps for vCenter Server upgrade – (the visual and text guidance in installer has been much improved by VMware)
1. Upgrade vCenter Single Sign-On
2. Upgrade vSphere Web Client
3. Upgrade vCenter Inventory Service
4. Upgrade vCenter Server
5. Upgrade vCenter Update Manager
6. Upgrade vSphere Client
7. Upgrade vCenter-compatible plug-ins
8. Upgrade Distributed Switches
Upgrade problems experienced:
All of the upgrades went very smoothly, without much interaction or problems, except for the following:
Error: Update Manager vCenter health status failed.
Fix: Change the service account for the “VMware vSphere Update Manager Service” to the same account as used for the vCenter Server service.
Error: Storage Monitoring Service – initialization failed error on health status.
Fix: Change the service account for the “VMware vSphere Profile-Driven Storage Service” to the same account as used for the vCenter Server service, then restart the service.
UPGRADE ESXi: (error “Cannot run upgrade script on host”)
There are multiple methods to upgrade each ESXi host to new release which can be read here –
We make use of vSphere Update Manager to upgrade ESXi hosts, so we could only start on this once vCenter and Update Manager had been upgraded to 5.5:
First, import the new ISO into ESXi images (this does not work while still on 5.1).
Give the ISO a baseline name.
Create new Baseline group called for instance “ESX host 5.5 upgrade”
Select “host upgrade” for host baseline.
I did run into errors upgrading the ESXi hosts from 5.1 to 5.5, which are addressed and fixed in the follow-up blog post.
Just got an interesting request from a user regarding a datastore that ran out of space on a standalone ESXi host. This is a Dell R720 server with locally attached storage and all drives filled.
Debugging the problem:
Upon further investigation I found they had created a disk on the VM with thick provisioning, while not even using half of the disk space within the VMDK.
However, since the ESXi host is not attached to a vCenter Server, we cannot perform a clone or migration from the console. What to do?
Using The Command Line (SSH)
If you don’t have any space available on the current datastore, as in my situation, you will have to add a temporary datastore; NFS is the easiest to configure quickly.
SSH to host and run the following command:
vmkfstools -i "ThickDiskName.vmdk" -d thin "NewDiskName.vmdk"
After this has completed, remove the thick disk from the VM and add the newly created thin disk.
For removal options I recommend selecting “remove from virtual machine” only and NOT permanently deleting the disk just yet. Wait until the new disk is configured within the VM operating system and verified to work as expected.
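The workaround can be sketched end to end like this (ESXi shell; the NFS server, share, datastore, and VM paths are placeholders for illustration):

```shell
# Add a temporary NFS datastore to hold the thin copy
esxcli storage nfs add -H nfs-server.example.com -s /export/temp -v TempNFS

# Clone the thick disk to a thin-provisioned copy on the temporary datastore
mkdir -p /vmfs/volumes/TempNFS/MyVM
vmkfstools -i /vmfs/volumes/datastore1/MyVM/MyVM.vmdk \
           -d thin /vmfs/volumes/TempNFS/MyVM/MyVM-thin.vmdk
```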