VSAN – cache disk unavailable when creating disk group on Dell

I ran into an issue at a customer where the SSD which was to be used as the cache disk in the VSAN disk group was showing up as a regular HDD.  However, when I reviewed the storage devices the disk was visible and marked as flash…weird.  So what is going on here?

As I found out, this is due to a flash device being used with a controller that does not support JBOD.

To fix this I had to create a RAID 0 virtual disk for the SSD.  If you have a Dell controller this means you have to set the controller mode to RAID, but make sure that all the regular HDDs to be used in the disk group are set to non-RAID!  Once the host is back online you have to go and mark the SSD drive as flash.  This is the little “F” icon in the disk devices view.

This environment was configured with all the necessary VSAN prerequisites for Dell in place; you can review these in the following blog post:
http://virtualrealization.blogspot.com/2016/07/vsan-and-dell-poweredge-servers.html

Steps to set up RAID 0 on the SSD through the Lifecycle Controller:

  1. Lifecycle Controller
  2. System Setup
  3. Advanced hardware configuration
  4. Device settings
  5. Select controller (PERC)
  6. Physical disk management
  7. Select SSD
  8. From the drop down select “Convert to RAID capable”
  9. Go back to home screen
  10. Select hardware configuration
  11. Configuration wizard
  12. Select RAID configuration
  13. Select controller
  14. Select Disk to convert from HBA to RAID (if required)
  15. Select RAID-0
  16. Select Physical disks (SSD in this case)
  17. Select Disk attribute and name Virtual Disk.
  18. Finish
  19. Reboot
After the ESXi host is online again you have to change the disk to flash. This is because RAID abstracts away most of the physical device characteristics, including the media type.

  • Select ESXi host 
  • Manage -> Storage -> Storage adapters
  • Select vmhba0 from PERC controller
  • Select the SSD disk
  • Click on the “F” icon above.
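
If you prefer the command line over the “F” icon, the same result can be achieved with a SATP claim rule. This is a minimal sketch, assuming the device is claimed by VMW_SATP_LOCAL; naa.xxxxxxxx is a placeholder for your SSD’s device identifier:

  esxcli storage nmp satp rule add --satp=VMW_SATP_LOCAL --device=naa.xxxxxxxx --option="enable_ssd"   # tag the device as flash
  esxcli storage core claiming reclaim --device=naa.xxxxxxxx                                           # reclaim so the new rule takes effect
  esxcli storage core device list --device=naa.xxxxxxxx | grep -i "Is SSD"                             # should now report "Is SSD: true"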

VSAN – Changing Dell Controller from RAID to HBA mode

So I recently had to make some changes for a customer to set the PERC controller to HBA (non-RAID) mode, since it was previously configured in RAID mode and all disks were in RAID 0 virtual disks.  Each disk group consists of 5 disks: 1 x SSD and 4 x HDD.

I cannot overstate this: make sure you have all the firmware and drivers up to date with the versions listed on the HCL.

Here are some prerequisites for moving from RAID to HBA mode.  I am not going to get into the details of performing these tasks.

  • All virtual disks must be removed or deleted.
  • Hot spare disks must be removed or re-purposed.
  • All foreign configurations must be cleared or removed.
  • All physical disks in a failed state must be removed.
  • Any local security key associated with SEDs must be deleted.

I followed these steps:

  1. Put the host into maintenance mode with full data migration. You have to select full data migration since we will be deleting the disk group (see the sketch after this list).
    1. This process can be monitored in RVC using the command vsan.resync_dashboard ~cluster
  2. Delete the VSAN disk group on the host in maintenance mode.
  3. Use the virtual console on the iDRAC and set the host to boot into the Lifecycle Controller on the next boot
  4. Reboot the host
  5. From LifeCycle Controller main menu
  6. System Setup
  7. Advanced hardware configuration
  8. Device Settings
  9. Select controller card
  10. Select Controller management
  11. Scroll down and select Advanced controller management
  12. Set Disk Cache for Non-RAID to Disable
  13. Set Non RAID Disk Mode to Enabled
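
For steps 1 and 2 of the list above, here is roughly what the CLI equivalent looks like if you prefer the shell over the web client. This is only a sketch; naa.xxxxxxxx is a placeholder for the cache SSD of the disk group, and you should confirm the flags against your ESXi build:

  esxcli system maintenanceMode set -e true -m evacuateAllData   # enter maintenance mode with full data migration
  esxcli vsan storage list                                       # note the device ID of the cache SSD
  esxcli vsan storage remove -s naa.xxxxxxxx                     # removing the cache SSD removes the entire disk group

Removing the cache device takes the whole disk group with it, which is exactly what we want before flipping the controller to HBA mode.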

VSAN upgrade – Dell Poweredge servers

I have been meaning to write up a VSAN upgrade on Dell R730xd’s with the PERC H730, which I recently completed at a customer.  This is not going to be a lengthy discussion on the topic; I primarily want to provide some information on the tasks I had to perform for the upgrade to VSAN 6.2.

  1. The VSAN on-disk metadata upgrade is equivalent to doing a SAN array firmware upgrade and therefore requires a good backup and recovery strategy to be in place before you proceed.
  2. Migrate VMs off of the host.
  3. Place the host into maintenance mode.
    1. You want to use whatever the quickest method is to update the firmware, for VSAN’s sake. Normally the Dell FTP update, if the network is available to configure it.
    2. When you put a host into maintenance mode and choose the option to “ensure accessibility”, it doesn’t migrate all the components off, just enough so that the policies will be in violation.  A timer starts when you power it off, and if the host isn’t back in the VSAN cluster after 60 minutes, it begins to rebuild that host’s data elsewhere in the cluster.  If you know it will take longer than 60 minutes, or where possible, select full data migration.
    3. You can view the resync using the RVC command “vsan.resync_dashboard”
  4. Change the advanced settings required for the PERC H730 (see the verification commands after this list).
    1. https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2144936
    2. esxcfg-advcfg -s 100000 /LSOM/diskIoTimeout
    3. esxcfg-advcfg -s 4 /LSOM/diskIoRetryFactor
  5. Upgrade the lsi_mr3 driver. VUM is easy!
  6. Login to the iDRAC and perform the firmware upgrades:
  7. Upgrade the backplane expander (BP13G+EXP 0:1)
    1. Firmware version 1.09 ->  3.03
  8. Upgrade the PERC H730 firmware
    1. 25.3.0.0016 ->  25.4.0.0017
  9. Login to the Lifecycle Controller and set/verify the BIOS configuration settings for the controller
    1. https://elgwhoppo.com/2015/08/27/how-to-configure-perc-h730-raid-cards-for-vmware-vsan/
    2. Disk cache for non-RAID = disabled
    3. BIOS mode = pause on errors
    4. Controller mode = HBA (non-RAID)
  10. After all hosts are upgraded, verify VSAN cluster functionality and other prerequisites:
    1. Verify there are no stranded objects on the VSAN datastores by running the python script on each host.
    2. Verify persistent log storage for VSAN trace files.
    3. Verify the advanced settings from task 4 are still set!
  11. Place each host into maintenance mode again.
  12. Upgrade the ESXi host to 6.0 U2.
  13. Upgrade the on-disk format to V3.
    1. This task runs for a very long time and has a lot of sub-steps which take place in the background.  It also migrates the data off of each disk group to recreate it as V3.  This has no impact on the VMs.
    2. This process is repeated for all disk groups.
  14. Verify all disk groups are upgraded to V3 (see the verification commands after this list).
  15. Completed
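
To double-check tasks 4 and 14 above, the advanced settings and the on-disk format version can be verified per host from the shell. A quick sketch (the exact output field names can vary slightly between builds):

  esxcfg-advcfg -g /LSOM/diskIoTimeout                  # should return 100000
  esxcfg-advcfg -g /LSOM/diskIoRetryFactor              # should return 4
  esxcli vsan storage list | grep -i "format version"   # every disk should report version 3 after the upgrade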

Ran into some serious trouble and had a resync task that ran for over a week due to a VSAN 6.0 issue (KB 2141386) which appears under heavy storage utilization.  The only way to fix this was to put the host into maintenance mode with full data migration, then destroy and recreate the disk group.

Also ALWAYS check the VMware HCL to make sure your firmware is compatible. I can never say this enough since it is super important.

This particular VSAN 6.0 cluster was running with outdated firmware for both the backplane and the PERC H730. I also found that the controller was set to RAID for the disks instead of non-RAID (passthrough or HBA mode).

Links:

VMware has a kick@ass KB on best practices for the Dell PERC H730 in VSAN implementations. Links provided below.

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2109665

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2144614

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2144936


https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2141386

Dell rack servers – Upgrade firmware using Dell repository manager

I recently had to perform some firmware upgrades for a customer on their Dell R710 and R730xd servers.  As you all know there are multiple ways to successfully upgrade the firmware, and I want to touch on the upgrade through a bootable virtual CD, optical CD or USB since this was the only method available to me at the time.

Firmware upgrade methods available:
  • Upgrade using a bootable Linux ISO
  • Upgrade using a Server Update Utility (SUU) ISO/folder with the Dell Lifecycle Controller
  • Upgrade using the Dell FTP site with the Lifecycle Controller
All of these methods have some great information out there on the Dell website as well as blogs, but I wanted to go through my steps using the bootable Linux ISO and primarily how to create that ISO.
My preferred method is using the Dell FTP site with the Lifecycle Controller, but this is not always possible, especially if you have trunked ports and have to specify a VLAN (in later iDRAC firmware it is now possible to specify a VLAN!).
The reason why the FTP site method is better in my opinion is that the firmware comparison is done upfront and only the necessary firmware for outdated components is downloaded. This decreases the firmware upgrade time considerably compared to the bootable ISO, which compares every single component. (This only applies when you use the bundle, which I do in most instances, since who wants to manually go through every single component and check which is required for your server? :)
Steps:
First we need to create an ISO, and this is done using Dell Repository Manager.
Open Dell Repository Manager (Data Center version; the Business Client version is for desktops)

View the job queue for the plugin installs, select each one to perform the needed confirmation and click Accept! (only required after the first install)
Create new repository

Select name

Select Dell online catalog

Select brand – poweredge rack
Select Linux

Select your Poweredge server

Click Next
Click Finish
Click Close

Check the box for the bundle and select Create deployment tools (the other option is to select the Components tab and select each individual component manually, but this requires that you know exactly which components are installed on all your Dell servers)

Option 1:
Select Create server update utility (SUU) -> SUU to ISO, but remember that to use this ISO you have to mount it through the iDRAC virtual console as a virtual CD, boot into the Lifecycle Controller and select firmware upgrade, specifying the CD
Option 2:
Select Create bootable ISO

Make your necessary selection and click Next

Select folders

Click Next
Click Ok
Review the job queue for progress on the file being created

"Boot from SAN" step by step with Windows 2012 R2 and Cisco UCS using Brocade and EMC VNX.

UCS:
  • Create service profile for windows server.
  • Create “Boot from san” boot policy
    • Setup SAN primary and secondary target.
    • The WWNs required are those of your VNX array ports.

Brocade:
  • Login to create an initial zone for one of the ports.
  • Create new Alias
    • Type in the Alias name and select the WWN from blade
  • Create zone
    • Select the blade Alias and VNX Storage processor
  • Add to Zone configurations
  • Activate

VNX:
  • Start EMC Unisphere
  • Create Initiator
    • The WWN/IQN can be obtained from UCS Manager
      • Open properties window for service profile of server
      • Select storage tab
      • At top copy the World Wide Node Name  (this is the first part of WWN/IQN)
      • Under vHBAs copy the WWPN
    • Now combine the WWNN and WWPN with “:” as the separator and paste into the WWN/IQN field (see the example after this list)
  • Select “New Host” radio button
    • Type in the server name and IP address
  • Create LUNs
  • Create Storage Group per server
    • Associate the hosts
    • Associate the LUNs
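
To illustrate the WWN/IQN step above with made-up values: if the service profile shows a World Wide Node Name of 20:00:00:25:B5:00:00:01 and the vHBA’s WWPN is 20:00:00:25:B5:0A:00:01, the string you paste into the Unisphere WWN/IQN field is:

  20:00:00:25:B5:00:00:01:20:00:00:25:B5:0A:00:01

Repeat this for each vHBA so the host is registered on both fabrics.
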
Server:
  • Start the server and boot from Windows disk
  • Load the UCS disk drivers when asked for during installation and selection of the installation disk.
  • Verify disks show up and select where it will be installed.
  • After installation is completed and Windows is up and running, go ahead and install EMC Powerpath!

Cisco UCS – step by step configuration

As mentioned, I don’t go into too much detail in my posts since I think there are a lot of other great blogs and vendor documentation out there.  Here is my short bullet point task list.   If I am missing anything please let me know.
Set equipment policies:
  • Equipment tab -> equipment – > policies tab
    • Chassis/fex discovery policy
      • Action = 4 ports
      • Link grouping preference = port channel
    • Power policy = grid
Configure server/uplink port:
  • Equipment tab -> select FI-A/B -> expand -> fixed modules
    • Configure the appropriate unconfigured ports as “Server” (connections between IOM and Fabric Interconnect) and “Uplink” (connection to network)
Configure FC storage ports
  • Equipment tab
  • At the very bottom, select FI A
    • On the right hand side select Configure unified ports
    • Move the slider to cover the ports you need as FC storage ports
    • This will reboot FI-A; after the reboot, log back in.
  • Select FI B
    • Perform same steps
Create Port Channels:
  • Setup ports as uplink ports
  • LAN TAB
    • Fabric – Port Channels
    • Set up the port channel and use the same port ID on both FIs
  • SAN TAB (will not be creating a port channel due to the connection to Brocade)
    • San Cloud -> Fabric A -> Under general tab select “create Port Channel”
Create VSANs: (brocade):
  • SAN > SAN Cloud > Fabric A > VSANs (both Fabric A & B)
    • Create VSAN
    • Select the specific Fabric A or B (not common)!
  • Assign VSAN to FC uplinks
    • Equipment tab -> Fabric interconnect A & B -> Fixed modules -> FC ports
      • Select FC port
      • Under general tab click drop down for VSAN.
        • Select VSAN which is associated to FI.
Upgrade firmware
  • The firmware bundle consists of an “*.A.bin” file and a “*.B.bin” file. The “*.B.bin” file contains all of the firmware for the B-Series blades. The “*.A.bin” file contains all the firmware for the Fabric Interconnects, I/O Modules and UCS Manager.
  • Equipment tab -> Equipment -> Firmware management
  • Download firmware
  • Update firmware (view progress under Firmware auto install -> general tab, or press Apply to view status in the same window)
    • Adapters
    • CIMC
    • IOMs
  • Activate firmware in the following order:  Choose “Ignore Compatibility Check” anywhere applicable.
    • Adapters
    • UCS manager
    • I/O Modules
    •  Choose “skip validation” anywhere applicable. Make sure to uncheck “Set startup version only”, since this is an initial setup and we aren’t concerned with rebooting running hosts
  • Activate subordinate FI and then primary FI
Create sub-organization
This is optional, to create organization-specific servers/pools/policies, for instance ESXi, SQL, Windows etc.
  • Right click the root directory and select Create Organization
  • Specify name
Create KVM IP pool:
  • LAN tab -> Pools -> root -> IP Pools -> IP Pool ext-mgmt
  • Create block of IPv4 Addresses
    • Specify IP range
Create Server pool
  • Servers tab -> Pools -> Sub-Organization -> Server pools
  • Create server pool
Create UUID suffix pool
  • Servers tab -> Pools -> Sub-Organization -> UUID Suffix Pool
  • Create UUID suffix pool
  • Create Suffixes
Create MAC pool
  • For each sub-organization create 2 MAC pools: one for FI-A and one for FI-B
  • LAN TAB: -> Pools -> Root -> MAC Pools
    • Create new pool for A
    • Create block
    • Create new pool for B
Create HBA pools:
  • SAN TAB:
    • Pools -> root -> sub-organization -> WWNN Pools
      • Create WWNN pool
        • Add double the amount since each server will have two HBAs
    • For WWPN we will again create separate pools for FI-A and FI-B:
      • Pools -> root -> sub-organization -> WWPN Pools
        • Create WWPN pool for FI-A
        • Create WWPN pool for FI-B

Create VLANS:
  • LAN TAB -> Lan -> Lan Cloud -> VLANs
    • Create new VLANs
    • Provide name and ID
Create vNIC templates:
  • LAN TAB -> LAN -> Policies -> root -> Sub-organization -> vNIC templates
    • Create vNIC template (this is again done for each of FI-A and FI-B)
Create vHBA templates:

  • SAN TAB -> Policies -> root -> sub-organizations -> vHBA templates
    • Create vHBA Templates for both FI-A & FI-B

Create a Service Profile Template:
Servers tab -> Servers -> Service Profiles -> root -> Sub-organizations
  • Create service profile template
Under networking select expert.
Click Add
Select Use vNIC template
Under Storage, select the local storage SD card policy
Select WWNN assignment policy
Select Expert connectivity
Create vHBA
Next zoning, leave defaults since we are using Brocades
Set PCI ORDER
Select vMedia to use, default
Server boot order, select the boot policy created for the SD card

Select the Maintenance policy created earlier
Select server assignment
Operational Policies
Set Bios policy
Deploy service profile from template
Servers tab -> Service profile template -> root -> sub-organizations
Right click the service profile template and select “create service profiles from template”
Select naming prefix
Configure call home:
Admin tab -> Communication Management -> call home
Turn on and fill in the requirements
In profiles tab add “callhome@cisco.com” to Profile CiscoTAC-1
Under call home policies add the following to provide a good baseline
Configure NTP:
Admin tab -> Time zone management
Add NTP servers
Backup configuration:
Admin tab -> ALL -> Backup configuration on right hand side pane
Select “create backup operations”
Admin state = enabled

Select location = local file system

For setting policies I created another blog:

Cisco UCS – configure policies

Set Policies:
Network control policies (enable CDP)
  • LAN tab -> Policies -> root -> sub-organizations -> network control policies
    • Create network control policy
    • Enable CDP
Bios Policy:
  • Servers tab -> Policies -> root -> sub-organizations -> Bios Policies
  • Create bios policies
    • Mostly CPU settings
Host Firmware:
  • Servers tab -> Policies -> root -> sub-organizations -> Host Firmware Packages
  • Create host firmware package
    • Set to simple and set only the blade package version.
Local disk configuration:
  • Servers tab -> Policies -> root -> sub-organizations -> Local disk config policies
    • Create local disk configuration policy
      • This is to setup SD card
        • Disable protect configuration
        • Enable flexflash state
        • Enable flexflash RAID reporting state
      • For SAN boot
        • Set mode to No local storage
Maintenance policy:
  • Servers tab -> Policies -> root -> sub-organizations -> maintenance policies
    • Create Maintenance Policy
Boot policy:

  • Servers tab -> Policies -> root -> sub-organizations -> boot policies
    • Create boot policy
      • Expand local devices and add to boot order
        • Start with Local CD, then remote virtual drive then SD card

vCloud director – running MAC OS X and Windows VM in same vApp.

Recently we installed Mac Pro 6,1 hardware and provided MAC OS X virtual machines in our vCloud director environment.

This was configured as a separate cluster in vCenter server, which provides all the regular capabilities like HA, DRS and vMotion, which is great. We also created a separate storage cluster and assigned a new MAC storage profile to this cluster, which was made available within vCD.

MAC OS X templates were created and added to the catalog, with storage set to pre-provision on the default storage profile created for the MAC cluster.

Problem: 

In vCloud director a new Provider VDC was created and linked to the new vCenter server cluster.
Within the existing Organizations we created an additional virtual datacenter with the MAC provider VDC selected.  This created a new resource pool in the cluster.

The users were now able to deploy MAC OS X templates to this VDC; however, a request came back quickly that users need to deploy both MAC OS X and Windows VMs within the same vApp.

Troubleshooting: 

The configuration as explained above obviously does not allow for this, since during deployment of a vApp you can only select a single VDC to deploy to, and when adding a VM you can only specify a single storage policy, which will also be the one assigned to the vApp’s VDC.  So all the VMs would need to run on the Apple hardware cluster, which is not ideal.

Solution:

The solution was pretty simple and can be accomplished by merging your Provider VDCs, a capability introduced in vCloud director 5.1.1.

  1. Login to vCloud director as system admin.
  2. Select Manage & Monitor
  3. Under Cloud resources select Provider VDCs
  4. Right click the MAC provider VDC and select Merge
  5. Select the Provider VDC that you want to merge this one with.
  6. After completion you will see that your Provider VDC has additional resource pools, datastores and Org VDCs
  7. I then went ahead and deleted the VDC I initially created for the MAC deployments since we only need the original VDC.
Now when you deploy a vApp, select the existing VDC, which now contains the MAC Provider VDC resources and storage profile.  The same can be accomplished when deploying a VM within a vApp.

Mac Pro 6,1 rack environment running VMware ESXi 5.5 (with Fiber connectivity to VNX)

With the recent addition of the Mac Pro 6,1 to VMware’s hardware compatibility list, I was eager to replace the existing old Power Mac G5 towers in our environment.
Prerequisites:

  1. Mac Pro boot ROM version MP61.88Z.0116.B05.1402141115
If your Mac Pro has an older boot ROM then just upgrade the Mac Pro to Yosemite (OS X 10.10), which contains the update to be applied to the Mac Pro.
  2. vSphere 5.5 P03 is required
Currently the latest version of ESXi available on VMware download is only 5.5 update 2 so you have to include the required patch version onto the update 2 ISO.  To do this perform the following steps:


  • Download the latest ESXi 5.5 Update 2 driver rollup
  • Download the offline bundle for ESXi 5.5 Update 2 patch 3
  • Next you need to convert the offline bundle zip file into an ISO file to be placed on a bootable USB stick.  To do this I used VMware Image Builder, which is available as part of PowerCLI (the full command sequence is gathered after this list).
    • After you have installed PowerCLI, open the application
    • Change to the folder location where the zip file resides
    • Run the command to add the offline bundle:
      • Add-EsxSoftwareDepot .\ESXi550-201412001.zip
    • Run the command to see the image profiles:
      • Get-EsxImageProfile
    • Select ESXi-5.5.0-20141204001-standard, which includes VMware Tools and the security patches.
      • Run the command:
        • New-EsxImageProfile -CloneProfile "ESXi-5.5.0-20141204001-standard" -Name "ESXi55u2-P03-MACPRO" -Vendor MACPRO66
    • Now you can create the ISO file by running the command:
      • Export-EsxImageProfile -ImageProfile "ESXi55u2-P03-MACPRO" -ExportToIso -FilePath H:\VMware-ESXi-5.5u2-P03-MACPRO.iso
    • This file can now be placed on a bootable USB.
      • I use Universal-USB-Installer or UNetbootin to place the ISO on the USB.
  • Boot the ISO on the Mac
    • Press and hold the “ALT” key on the keyboard to boot from the USB.
  • The rest is the same basic installation as with any regular Intel based server
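
For reference, here is the full PowerCLI sequence from the bullets above gathered in one place (same file and profile names as this particular build; adjust them to your own patch bundle):

  Add-EsxSoftwareDepot .\ESXi550-201412001.zip
  Get-EsxImageProfile
  New-EsxImageProfile -CloneProfile "ESXi-5.5.0-20141204001-standard" -Name "ESXi55u2-P03-MACPRO" -Vendor MACPRO66
  Export-EsxImageProfile -ImageProfile "ESXi55u2-P03-MACPRO" -ExportToIso -FilePath H:\VMware-ESXi-5.5u2-P03-MACPRO.iso
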
Here is a list of all the hardware items used in our environment:
  • Sonnet xMac Pro rackmount enclosure.
    • This is the most valuable piece of equipment and I highly recommend it if you are planning on placing your Mac Pros in a server rack.
    • Comes with 3 x PCIe slots available through Thunderbolt, which provides 20Gbps of throughput and unmatched flexibility; you can now add extra network and even Fibre connections for storage.
    • Do yourself the favor and check them out.
  • APPLE Mac Pro 6,1  
    • 12GB memory
    • Intel Xeon CPU E5-2697 v2 @ 2.70GHz
    • We only purchased the small memory size, to be replaced with the Transcend kit
  • Transcend 128GB Kit for Mac Pro
  • Intel I350-T4
    • 4-port network card.  We actually have two cards installed in the Sonnet.
    • This card is VMware compatible but not Sonnet certified; however it works great without issues
  • 1 x Atto Celerity FC-81EN Fiber Channel adapter
  • APC AP7731
      • Since there are no dual power supplies on the Mac Pro, we purchased this APC switched rack PDU, which takes two single-phase 220V drops and can switch power if you have a failure on one of them.  This provides redundancy even though you only have one cable.  However, if the power adapter fails on the hardware you are out of luck.
Some gotchas experienced:
– We tried to run the updates for ESXi through VMware update manager and this caused the onboard NICs on Mac Pro to not be recognized anymore.  Re-installed the old version to resolve this.  Current build is 2302651
– To add storage on VNX a rescan does not seem to work so we had to restart the Mac Pro in order to pick up the LUNs.
– We initially installed all the PCI cards and then installed ESXi.  This caused the network card numbering to go out of whack.  What we had to do was remove all the cards, power on ESXi and let it complete the startup.  Then shut down, add a single PCI card and power on again. Do this one card at a time, in the order you want, starting from the bottom.  That should fix the network port order (see the check below).
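
If you run into the same NIC ordering issue, a quick way to see how ESXi has enumerated the cards is to list the NICs with their PCI addresses and MAC addresses (standard command, shown here just as a sanity check):

  esxcli network nic list

Compare the PCI device column against the physical slot order to confirm the vmnic numbering matches what you expect.
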
UPDATE:  created a new blog post to show how to run MAC OS X and Windows VMs within the same vApp in vCloud Director:

http://virtualrealization.blogspot.com/2015/04/vcloud-director-running-mac-os-x-and.html

Here are some photos of our build:

Started installation

ESXi installed and ready for use

Internal and external casing

Internal casing housing the Mac Pro

work bench

rear of Sonnet chassis. very nice

Mac Pro housing

More Mac Pro housing

Mac Pro ready to be installed

Now for some pictures in the rack:
APC PDU

Rear of Rack

Front of Rack..so nice and clean!

Cisco UCS error: will_boot_fault sensor failure asserted

After replacing a faulty UCS blade the following error presented itself after inserting the new blade into chassis:  will_boot_fault sensor failure asserted
Troubleshooting:
Ran the following –
Tried to run the board controller activation
# scope server
# scope boardcontroller
# show version
Showed version 13 (2.2.3d)
# show image
Did not display version 13; the latest image listed was version 8 (2.2.1d)
# activate firmware .0 force
# commit-buffer
Received an error message that the commit could not proceed.
Solution:
The problem turned out to be that the new blade had a newer version of the board controller firmware installed than what I had loaded in UCS Manager.
To fix this, upload the latest firmware version (in our case 2.2.3d) to UCS Manager.
Verify the new version is available by running show image under scope for chassis/server/boardcontroller.

  • Run the same process as listed above (see the sketch below)
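
Roughly, the re-run looks like the sketch below once the 2.2.3d bundle has been uploaded. The chassis/slot and version string are placeholders; use the exact version reported by show image:

  # scope server <chassis>/<slot>
  # scope boardcontroller
  # show image
  # activate firmware <version> force
  # commit-buffer

This time the commit-buffer should succeed, since a board controller image matching the blade is now available in UCS Manager.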