VMware Flash Read Cache (VFRC) Performance. Does it really make a difference?

I've been generally disappointed with the performance of my MS S2D storage array due to the way MS lays writes down onto parity pools. All the SSD caching and NVMe write cache just wasn't doing enough, and it was time to go a different route. I've written about my Datrium install a few times now; for those who are unfamiliar with the solution, the short version is that Datrium takes host-populated flash and builds a dedicated read/write pool on the host. It does an extremely good job of this, and if you would like to know more, check out their website here: https://www.datrium.com/open-convergence-architecture. I like that I am able to populate my host with "cheap" SSD for performance, and then dump the data to cheaper spinning disk for cold storage.

So how can I do this at home without spending a bunch of money? My first thought was Dell's CacheCade, which I have on the PERC H700 I used for this test. The problem with CacheCade, however, is that it has a hard limit of 512GB per host, reads above 64KB are not cached, and there is no support for write caching. I then thought about VSAN and its flash caching, however given my lab and limited resources, VSAN would be a no-go as well.

Finally I came to VFRC. Like CacheCade it is only a read-side cache; unlike CacheCade, however, it will cache all reads, and it can build up to a 32TB flash pool with a maximum of 400GB of cache per VMDK. The downsides of VFRC are that you have to enable it per VMDK, and you have to set the cache block size to match the average IO size of your workload, otherwise you will degrade the performance of the cache. That said, it is extremely easy to get set up and running, and it is included in your ESXi Enterprise Plus licensing, so you don't get hit with the additional cost of VSAN licensing (though VSAN cache tiering would provide a significantly better performance experience).
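
As far as I know there is no dedicated PowerCLI cmdlet for VFRC, but the cache settings live on the virtual disk object in the vSphere API, so it can be scripted. Below is a rough PowerCLI sketch of how you might put a 50GB, 8K-block cache reservation on a single VMDK; the VM and disk names are placeholders for my lab, and the host needs a virtual flash resource carved out of its local SSD first.

# Rough sketch: reserve 50GB of vFlash Read Cache at an 8K block size on one VMDK.
# "TestVM" and "Hard disk 1" are placeholders.
$vm   = Get-VM -Name "TestVM"
$disk = Get-HardDisk -VM $vm -Name "Hard disk 1"

# Attach a vFlash cache configuration to the existing virtual disk device.
$cache = New-Object VMware.Vim.VirtualDiskVFlashCacheConfigInfo
$cache.ReservationInMB = 51200      # 50GB cache reservation for this VMDK
$cache.BlockSizeInKB   = 8          # match the workload's average IO size

$devChange = New-Object VMware.Vim.VirtualDeviceConfigSpec
$devChange.Operation = "edit"
$devChange.Device = $disk.ExtensionData
$devChange.Device.VFlashCacheConfigInfo = $cache

$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.DeviceChange = @($devChange)
$vm.ExtensionData.ReconfigVM($spec)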

Let's look at the test VM. The server is a Dell R510 with 2x Xeon E5640 CPUs, 64GB of RAM, and a PERC H700 1GB BBWC card. For storage I have 3x WD RE4 2TB drives, a Samsung 850 Pro 256GB SSD, and the OCZ RD400 NVMe add-in card. The server is running ESXi 6.5 U1, and the test VM has 4 vCPUs, 6GB of vRAM, and a PVSCSI controller.

First up, let's see what this machine does with a VM running directly on the OCZ RD400 NVMe add-in card. This is a thin-provisioned VMDK living on a VMFS 6 volume.

[Screenshot: benchmark results for the NVMe-backed VM]

This test shows that an 8K 70/30 random read/write test against a 5GB file, with 30 minutes of sustained IO, generated 165MBps of throughput, 21,193 IOPS, and an average read latency of 0.3ms. This is fantastic and exactly what I would expect of an NVMe disk.
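
For anyone who wants to approximate that workload themselves, a Microsoft DiskSpd command along these lines should get close. The thread count, queue depth, and target path here are my own assumptions, not the exact parameters of the run above.

# Approximate the 8K 70/30 random test: 30% writes, 5GB file, 30 minutes of sustained IO,
# host caching disabled, latency stats collected. D:\io.dat is a placeholder path.
.\diskspd.exe -b8K -d1800 -h -L -o2 -t4 -r -w30 -c5G D:\io.dat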

So now let's run exactly the same test against my VFRC volume. This is a RAID 5 of 3x 2TB SATA 6Gb/s Western Digital RE4 drives plus 50GB of VFRC at an 8K block size.

[Screenshot: benchmark results for the VFRC-enabled RAID 5 volume]

For a SATA RAID5 that’s not bad. 3.3MBps, 429 IOPS, and an average read latency of 18ms. This isn’t amazing performance by any means, but again for a 3 disk SATA array this is pretty fantastic.

Let’s compare that to a disk with no vFlash Cache.

[Screenshot: benchmark results for the same RAID 5 volume without VFRC]

1.87MBps, 239 IOPS, and 33ms average latency. The VFRC-enabled disk delivers nearly twice the performance of the uncached disk running on the same array.

So as you can see VFRC is a great option for those who have traditional storage, either local or SAN, where a flash tier would normally not be available.

Finally, let’s look at the same workload on my Datrium cluster.

[Screenshot: benchmark results for the Datrium cluster]

As you can see, this is not going to be a replacement for having dedicated flash resources, and Datrium caches much more efficiently than VMware does. However, if you need a quick, cheap fix, VFRC is a great answer.


Microsoft Exchange: PowerShell Script to add an SMTP Address to all users.

Recently, during an Exchange to Office 365 migration, a bunch of the mailboxes were not part of the default SMTP address policy and therefore didn't get the "@domain.mail.onmicrosoft.com" address stamped on them.

Instead of trying to update them all manually, I wrote a quick script to add the address using each mailbox's default alias so we could get them moved to Office 365.

 

#################################################
#   Add new SMTP address to all mailboxes.      #
#   Created by - Cameron Joyce                  #
#   Last Modified - Jun 04 2017                 #
#################################################
# This script will add a new SMTP address for each mailbox on a specified Exchange server.
# This script must be run from the Exchange Management Shell for Exchange 2010 - 2016

# Default result size is 1000, so use -ResultSize Unlimited to catch every mailbox on the server.
$mailboxes = Get-Mailbox -Server servername -ResultSize Unlimited

Foreach ($mailbox in $mailboxes){
    $name = $mailbox.alias
    Set-Mailbox "$name" -EmailAddresses @{add="$name@domain.com"}
}
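
Once the script has run, a quick spot check against a single mailbox confirms the new address actually landed (the alias below is just an example):

# Spot-check one mailbox's address list after the script runs.
Get-Mailbox "jsmith" | Select-Object -ExpandProperty EmailAddresses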

VMware vSphere: A general system error occurred: Connection refused When starting Virtual Machines.

I had an issue the other day with starting a VM. It would DRS successfully, but then fail to power on with "A general system error occurred: Connection refused."

[Screenshot: the "Connection refused" error]

Googling tells me that the culprit is the vmware-vpx-workflow service being stopped. I SSHed into my VCSA and sure enough found that the service was indeed stopped.

[Screenshot: the vmware-vpx-workflow service shown as stopped]

So I attempted to start the service, and that failed.

[Screenshot: the service failing to start]

What the hell? Tailing all the logs in the /var/log/vmware/workflow folder didn't turn up anything. However, after re-reading the errors during startup I realized... maybe it's a disk space issue.

[Screenshot: startup errors pointing to a full log partition]

Sure enough, our log disk was full. I grew the log disk, ran the autogrow command on the VCSA to resize the partitions, restarted the services, and voilà!

[Screenshot: the services starting successfully after the disk was grown]

After updating the disk size, we were all set and I was able to start VMs without issues.

VMware Paravirtual SCSI adapter: Is it really that much faster?

I asked myself the same question after reading a best-practices guide from Datrium that suggested using the VMware PVSCSI controller instead of the LSI SAS controller that VMware recommends by default when you create a Windows VM.

Out of curiosity I spun up a new Server 2016 VM with 4 cores, 8GB of RAM, and a 100GB drive, hosted on my Datrium storage, to find out how much of a difference there was.
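
If you would rather flip an existing test VM over than build a new one, something like the following PowerCLI sketch should do it. The VM name is a placeholder, the VM has to be powered off for the controller swap, and Windows needs the PVSCSI driver (delivered with VMware Tools) before its boot disk will come up on the new controller.

# Swap an existing VM's SCSI controller to PVSCSI ("Test2016VM" is a placeholder).
$vm = Get-VM -Name "Test2016VM"
Stop-VMGuest -VM $vm -Confirm:$false              # graceful shutdown; wait for PoweredOff
Get-ScsiController -VM $vm | Set-ScsiController -Type ParaVirtual -Confirm:$false
Start-VM -VM $vm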

I ran this test during a normal production workload and used Microsoft DiskSpd with a 16K IO size (my current average for my app servers) to see what results we would get. The specific command I used was:

diskspd.exe -b16K -d1800 -h -L -o2 -t4 -r -w50 -c10G C:\io.dat

The first run on the VMware LSI SAS controller resulted in this.

Command Line: C:\Users\cjoyce_admin\Downloads\Diskspd-v2.0.17\amd64fre\diskspd.exe -b16K -d1800 -h -L -o2 -t4 -r -w50 -c10G c:\io.dat

Input parameters:

timespan: 1
-------------
duration: 1800s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'c:\io.dat'
think time: 0ms
burst size: 0
software cache disabled
hardware write cache disabled, writethrough on
performing mix test (read/write ratio: 50/50)
block size: 16384
using random I/O (alignment: 16384)
number of outstanding I/O operations: 2
thread stride size: 0
threads per file: 4
using I/O Completion Ports
IO priority: normal

Results for timespan 1:
*******************************************************************************

actual test time: 1800.00s
thread count: 4
proc count: 4

CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 8.35%| 1.84%| 6.50%| 91.65%
1| 8.38%| 1.89%| 6.48%| 91.62%
2| 7.78%| 1.79%| 5.99%| 92.22%
3| 7.39%| 1.60%| 5.79%| 92.61%
-------------------------------------------
avg.| 7.97%| 1.78%| 6.19%| 92.03%

Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 15150776320 | 924730 | 8.03 | 513.74 | 3.888 | 3.175 | c:\io.dat (10240MB)
1 | 15089106944 | 920966 | 7.99 | 511.65 | 3.904 | 3.289 | c:\io.dat (10240MB)
2 | 15108947968 | 922177 | 8.00 | 512.32 | 3.899 | 3.140 | c:\io.dat (10240MB)
3 | 15109013504 | 922181 | 8.01 | 512.32 | 3.898 | 3.086 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 60457844736 | 3690054 | 32.03 | 2050.03 | 3.897 | 3.173

Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 7574110208 | 462287 | 4.01 | 256.83 | 3.274 | 2.741 | c:\io.dat (10240MB)
1 | 7539032064 | 460146 | 3.99 | 255.64 | 3.297 | 2.966 | c:\io.dat (10240MB)
2 | 7562526720 | 461580 | 4.01 | 256.43 | 3.297 | 2.861 | c:\io.dat (10240MB)
3 | 7543046144 | 460391 | 4.00 | 255.77 | 3.293 | 2.613 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 30218715136 | 1844404 | 16.01 | 1024.67 | 3.290 | 2.798

Write IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 7576666112 | 462443 | 4.01 | 256.91 | 4.501 | 3.448 | c:\io.dat (10240MB)
1 | 7550074880 | 460820 | 4.00 | 256.01 | 4.510 | 3.479 | c:\io.dat (10240MB)
2 | 7546421248 | 460597 | 4.00 | 255.89 | 4.501 | 3.289 | c:\io.dat (10240MB)
3 | 7565967360 | 461790 | 4.01 | 256.55 | 4.503 | 3.389 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 30239129600 | 1845650 | 16.02 | 1025.36 | 4.504 | 3.402
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.000 | 0.000 | 0.000
25th | 1.360 | 2.258 | 1.709
50th | 2.818 | 3.885 | 3.269
75th | 4.481 | 6.093 | 5.443
90th | 6.259 | 8.370 | 7.195
95th | 7.163 | 9.928 | 8.987
99th | 10.090 | 13.425 | 12.593
3-nines | 23.523 | 30.284 | 27.785
4-nines | 47.191 | 52.535 | 49.878
5-nines | 190.339 | 161.402 | 190.339
6-nines | 534.581 | 534.289 | 534.289
7-nines | 545.593 | 535.040 | 545.593
8-nines | 545.593 | 535.040 | 545.593
9-nines | 545.593 | 535.040 | 545.593
max | 545.593 | 535.040 | 545.593

Overall, not terrible. Now let's look at what we get when we replace the LSI SAS controller with PVSCSI.

Input parameters:

timespan: 1
-------------
duration: 1800s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'c:\io.dat'
think time: 0ms
burst size: 0
software cache disabled
hardware write cache disabled, writethrough on
performing mix test (read/write ratio: 50/50)
block size: 16384
using random I/O (alignment: 16384)
number of outstanding I/O operations: 2
thread stride size: 0
threads per file: 4
using I/O Completion Ports
IO priority: normal

Results for timespan 1:
*******************************************************************************

actual test time: 1800.00s
thread count: 4
proc count: 4

CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 7.37%| 1.53%| 5.84%| 92.63%
1| 7.02%| 1.40%| 5.62%| 92.98%
2| 6.35%| 1.25%| 5.10%| 93.65%
3| 6.04%| 1.22%| 4.82%| 93.96%
-------------------------------------------
avg.| 6.70%| 1.35%| 5.35%| 93.30%

Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 15667019776 | 956239 | 8.30 | 531.24 | 3.760 | 2.938 | c:\io.dat (10240MB)
1 | 15743369216 | 960899 | 8.34 | 533.83 | 3.741 | 3.011 | c:\io.dat (10240MB)
2 | 15789637632 | 963723 | 8.37 | 535.40 | 3.730 | 2.841 | c:\io.dat (10240MB)
3 | 15788425216 | 963649 | 8.36 | 535.36 | 3.731 | 2.914 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 62988451840 | 3844510 | 33.37 | 2135.84 | 3.740 | 2.926

Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 7831814144 | 478016 | 4.15 | 265.56 | 2.660 | 2.405 | c:\io.dat (10240MB)
1 | 7862943744 | 479916 | 4.17 | 266.62 | 2.640 | 2.538 | c:\io.dat (10240MB)
2 | 7904346112 | 482443 | 4.19 | 268.02 | 2.632 | 2.247 | c:\io.dat (10240MB)
3 | 7881277440 | 481035 | 4.18 | 267.24 | 2.631 | 2.557 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 31480381440 | 1921410 | 16.68 | 1067.45 | 2.641 | 2.440

Write IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 7835205632 | 478223 | 4.15 | 265.68 | 4.859 | 3.010 | c:\io.dat (10240MB)
1 | 7880425472 | 480983 | 4.18 | 267.21 | 4.840 | 3.045 | c:\io.dat (10240MB)
2 | 7885291520 | 481280 | 4.18 | 267.38 | 4.831 | 2.946 | c:\io.dat (10240MB)
3 | 7907147776 | 482614 | 4.19 | 268.12 | 4.827 | 2.833 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 31508070400 | 1923100 | 16.69 | 1068.39 | 4.839 | 2.959
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.000 | 0.000 | 0.000
25th | 1.189 | 2.947 | 1.810
50th | 1.868 | 4.126 | 3.120
75th | 3.536 | 6.037 | 4.971
90th | 5.392 | 8.026 | 6.924
95th | 6.269 | 9.628 | 8.417
99th | 9.446 | 13.234 | 12.021
3-nines | 22.655 | 32.422 | 28.825
4-nines | 45.679 | 50.249 | 48.554
5-nines | 158.326 | 159.371 | 159.371
6-nines | 475.470 | 427.329 | 427.329
7-nines | 475.711 | 427.338 | 475.711
8-nines | 475.711 | 427.338 | 475.711
9-nines | 475.711 | 427.338 | 475.711
max | 475.711 | 427.338 | 475.711

So overall we see roughly a 4% performance increase across the board (about 2,136 IOPS vs. 2,050), along with slightly lower CPU usage and read latency. Not groundbreaking numbers, however if you're trying to squeeze every last drop of performance out of your VMs, this could be a step in the right direction.

Speaking of squeezing every last drop, let's see what happens when we test against a ReFS-formatted disk.
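
For completeness, formatting the test volume as ReFS is a one-liner; the drive letter is an assumption based on the E:\io.dat target below, and this of course wipes whatever is on the volume.

# Format the test volume as ReFS (drive letter assumed; destructive).
Format-Volume -DriveLetter E -FileSystem ReFS -NewFileSystemLabel "ReFS-Test" -Confirm:$false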

Command Line: C:\Users\cjoyce_admin\Downloads\Diskspd-v2.0.17\amd64fre\diskspd.exe -b16K -d1800 -h -L -o2 -t4 -r -w50 -c10G E:\io.dat

Input parameters:

timespan: 1
-------------
duration: 1800s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'E:\io.dat'
think time: 0ms
burst size: 0
software cache disabled
hardware write cache disabled, writethrough on
performing mix test (read/write ratio: 50/50)
block size: 16384
using random I/O (alignment: 16384)
number of outstanding I/O operations: 2
thread stride size: 0
threads per file: 4
using I/O Completion Ports
IO priority: normal

Results for timespan 1:
*******************************************************************************

actual test time: 1800.02s
thread count: 4
proc count: 4

CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 8.65%| 1.62%| 7.03%| 91.35%
1| 8.69%| 1.49%| 7.20%| 91.31%
2| 7.83%| 1.35%| 6.47%| 92.17%
3| 7.43%| 1.36%| 6.07%| 92.57%
-------------------------------------------
avg.| 8.15%| 1.46%| 6.69%| 91.85%

Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 18047041536 | 1101504 | 9.56 | 611.94 | 3.263 | 2.708 | E:\io.dat (10240MB)
1 | 18078842880 | 1103445 | 9.58 | 613.02 | 3.258 | 3.004 | E:\io.dat (10240MB)
2 | 18066751488 | 1102707 | 9.57 | 612.61 | 3.260 | 2.712 | E:\io.dat (10240MB)
3 | 18132910080 | 1106745 | 9.61 | 614.85 | 3.248 | 2.727 | E:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 72325545984 | 4414401 | 38.32 | 2452.42 | 3.257 | 2.791

Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 9020080128 | 550542 | 4.78 | 305.85 | 2.762 | 2.399 | E:\io.dat (10240MB)
1 | 9030025216 | 551149 | 4.78 | 306.19 | 2.760 | 2.927 | E:\io.dat (10240MB)
2 | 9041592320 | 551855 | 4.79 | 306.58 | 2.759 | 2.342 | E:\io.dat (10240MB)
3 | 9050865664 | 552421 | 4.80 | 306.90 | 2.752 | 2.479 | E:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 36142563328 | 2205967 | 19.15 | 1225.53 | 2.758 | 2.547

Write IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 9026961408 | 550962 | 4.78 | 306.09 | 3.764 | 2.899 | E:\io.dat (10240MB)
1 | 9048817664 | 552296 | 4.79 | 306.83 | 3.754 | 2.998 | E:\io.dat (10240MB)
2 | 9025159168 | 550852 | 4.78 | 306.03 | 3.762 | 2.954 | E:\io.dat (10240MB)
3 | 9082044416 | 554324 | 4.81 | 307.96 | 3.742 | 2.870 | E:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 36182982656 | 2208434 | 19.17 | 1226.90 | 3.756 | 2.931
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.267 | 0.297 | 0.267
25th | 1.252 | 1.773 | 1.403
50th | 2.019 | 3.097 | 2.618
75th | 3.724 | 5.038 | 4.275
90th | 5.581 | 6.998 | 6.240
95th | 6.395 | 8.584 | 7.525
99th | 9.641 | 12.213 | 11.021
3-nines | 20.505 | 26.232 | 23.305
4-nines | 42.971 | 45.559 | 44.280
5-nines | 238.498 | 175.573 | 204.921
6-nines | 502.382 | 359.149 | 435.862
7-nines | 547.128 | 547.124 | 547.128
8-nines | 547.128 | 547.124 | 547.128
9-nines | 547.128 | 547.124 | 547.128
max | 547.128 | 547.124 | 547.128

With a ReFS-formatted disk on top of PVSCSI we see roughly another 15% increase over the NTFS PVSCSI run (about 2,452 IOPS vs. 2,136), or close to 20% over the original LSI SAS baseline!

So if your applications support it, and you truly want to squeeze every last drop out of your storage, ReFS and PVSCSI is the combination to go with!

Windows Server Backup using PowerShell.

I needed a script to be able to do an on-demand backup of a Windows server without installing third-party software on it. The idea is that, physical or virtual, if I needed a quick backup of a box, including a log for auditing, I could have a click-to-run solution.

Here is that solution. It works for sure on Server 2012, 2012 R2, and 2016 boxes; 2008 / 2008 R2 may need some additional tweaking.

#################################################################
#   Windows Server Backup                                       #
#   Created by - Cameron Joyce                                  #
#   Last Modified - May 2nd 2017                                #
#################################################################
# This script is used to do an on demand backup of a windows server (server 2008 or newer). 

# Variables
$date = (Get-Date).ToString('MM-dd-yyyy')
$time = (Get-Date).ToString('MM-dd-yyyy HH:mm:ss')
$hostname = $env:COMPUTERNAME
$backupserver = "your.server.fqdn"
$osversion = (Get-CimInstance Win32_OperatingSystem).version
$neterr = $false

# Setup folder and logfile.
If(Test-Connection -ComputerName $backupserver -count 1 -Quiet){
    # Try / Catch block for WMI errors. A client that passes Test-Connection may not have PSRemoting enabled and will error. This will handle that.
    Try{
        $ErrorActionPreference = "Stop"
        If(!(Test-Path "\\$backupserver\wsbackups\$hostname")){
            New-Item "\\$backupserver\wsbackups\$hostname" -Type Directory
        }
    }
    Catch [System.Management.Automation.Remoting.PSRemotingTransportException]{
        Write-Warning "Failed connection to backup server."
        $neterr = $true
        If($neterr -eq $true){
            Send-MailMessage -From "smtp@address.com" -To "rcpt@address.com" -Subject "$hostname failed scripted backup. Unable to connect to network storage." -Body "$hostname failed backup because it was unable to connect to $backupserver. Please check the connections and try again." -SmtpServer "srvr.server.com"
        }
        Break
    }
    Finally{
        $ErrorActionPreference = "Continue"
    }
}

# Create Directories and logs.
If(!(Test-Path "\\$backupserver\wsbackups\$hostname\$Date")){
    New-Item "\\$backupserver\wsbackups\$hostname\$Date" -Type Directory
}
$logfile = "\\$backupserver\wsbackups\$hostname\$date\$hostname.$date.txt"

# Verify WSB is installed and load modules. If it is not installed, install it.
Import-Module ServerManager
$bkup = Get-WindowsFeature *backup*
# This if loop contains the commands for install on both 2008 - 2012 as well as server 2016.
If($bkup.InstallState -like "Available"){
    Write-Host "Installing windows server backup role."
    Write-Output "Installing windows server backup role." | Out-File $logfile -append
    If($osversion -like "6.3*" -or $osversion -like "10*"){
        Add-WindowsFeature -Name Windows-Server-Backup -Restart:$false
    }
    Else{
        Add-WindowsFeature -Name Backup-Features -IncludeAllSubFeature:$true -Restart:$false
    }
}
Else{
    Write-Host "Server backup is already installed."
    Write-Output "Server backup is already installed." | Out-File $logfile -append
}

# Execute Backup.
Write-Output "Starting Backup at $time" | Out-File $logfile -append
& cmd.exe /c "wbadmin start backup -backupTarget:\\$backupserver\wsbackups\$hostname\$date -allCritical -systemState -vssFull -quiet" | Out-File $logfile -Append
# Re-read the clock so the log shows the actual completion time rather than the start time.
$endtime = (Get-Date).ToString('MM-dd-yyyy HH:mm:ss')
Write-Output "Backup completed at $endtime" | Out-File $logfile -append

# Look for backup errors.
$eventid4 = $false
# Pull the Backup event log messages from the last few minutes and look for the success message.
$eventlist = Get-WinEvent -LogName Microsoft-Windows-Backup | Where-Object {$_.TimeCreated -gt (Get-Date).AddMinutes(-5)} | Select-Object -ExpandProperty Message
Foreach($line in $eventlist){
    If($line -like "*The backup operation has finished successfully*"){
        $eventid4 = $true
    }
}

# Send success / failure email.
If($eventid4 -eq $true){
    Write-Host "Backup Success!"
    Send-MailMessage -From "$hostname@domain.com" -To "rcpt@domain.com" -Subject "$hostname has successfully backed up." -Body "Review attachment for backup log." -Attachments "$logfile" -SmtpServer "smtp.server.com"
}
ElseIf($eventid4 -eq $false){
    Write-Host "Backup Failed"
    Send-MailMessage -From "$hostname@domain.com" -To "rcpt@domain.com" -Subject "$hostname has failed its backup." -Body "Review attachment for backup log." -Attachments "$logfile" -SmtpServer "smtp.server.com"
}

PowerShell script to fix VSS errors.

We've all had VSS writer issues during backups, and many of us have used the Microsoft TechNet article to re-register those VSS writers.

Well I had to do that today, and figured I would build a PS script to take care of that so I don’t have to go googling for that article in the future.

#################################################
#   Volume Snapshot Service Repair              #
#   Created by - Cameron Joyce                  #
#   Last Modified - Apr 27 2017                 #
#################################################
# This script is used to repair Microsoft VSS on servers that are failing backups.

# Set Location
Set-Location "C:\Windows\System32"

# Stop Services
If((Get-Service -name vss).Status -eq "Running"){
    Stop-Service -Name vss -force
    If(!((Get-Service -name vss).Status -eq "Stopped")){
        Write-Host "VSS Service failed to stop. Stop it manually and re-run the script."
        Break
    }
}
If((Get-Service -name swprv).Status -eq "Running"){
    Stop-Service -Name swprv -force
    If(!((Get-Service -name swprv).Status -eq "Stopped")){
        Write-Host "Shadow Copy Provider Service failed to stop. Stop it manually and re-run the script."
        Break
    }
}

# Re-Register DLLs for VSS
regsvr32 /s ole32.dll
regsvr32 /s oleaut32.dll
regsvr32 /s vss_ps.dll
regsvr32 /s /i swprv.dll
regsvr32 /s /i eventcls.dll
regsvr32 /s es.dll
regsvr32 /s stdprov.dll
regsvr32 /s vssui.dll
regsvr32 /s msxml.dll
regsvr32 /s msxml3.dll
regsvr32 /s msxml4.dll
vssvc /register 

# Start Services
Start-Service vss
Start-Service swprv
If(!((Get-Service -name vss).Status -eq "Running")){
    Write-Host "VSS Service failed to start. Start the service manually."
}
If(!((Get-Service -name swprv).Status -eq "Running")){
    Write-Host "Shadow Copy Provider Service failed to start. Start the service manually."
}
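
Once the services are back up, it's worth confirming the writers are actually healthy again before kicking off another backup:

# Each writer should report "State: Stable" with "Last error: No error".
vssadmin list writers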

How to fix “A general system error occurred: vim.fault.GenericVmConfigFault” When creating or Removing Snapshots in VMware.

Recently I had a bunch of virtual machines that started generating this error during Veeam backups. I hadn't really been checking my snapshots because my daily job was supposed to be taking care of that for me. Unfortunately, this bit me, and here we are.

As many of you have probably experienced, Veeam doesn’t always clean up after itself when it is finished backing up a VM. Sometimes a file lock or other operation prevents the snap cleanup, and you end up with a huge chain of snapshots.
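
If you want to catch this before it bites you, a quick PowerCLI report of lingering snapshots and VMs flagged for consolidation makes a decent daily check. This is just a sketch of the idea:

# Report lingering snapshots, oldest first.
Get-VM | Get-Snapshot | Select-Object VM, Name, Created, SizeGB | Sort-Object Created

# Report VMs that vSphere has flagged as needing disk consolidation.
Get-VM | Where-Object { $_.ExtensionData.Runtime.ConsolidationNeeded } | Select-Object Name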

Normally you would just be able to right-click the VM, select Snapshots, then "Delete All" and be done; however, I was getting:

A general system error occurred: vim.fault.GenericVmConfigFault

[Screenshot: the GenericVmConfigFault error]

So how do we fix this? Well it turns out there are two ways.

Resolution 1:

This resolution requires downtime; however, it is significantly faster. It does have some caveats, which I will get to later.

  1. Shut down the Virtual Machine.
  2. SSH to the host where the VM was running.
  3. Change Directory to the volume where the guests disks are stored.
    cd /vmfs/volumes/volumguid/machinefolder
    
  4. Find all delta.vmdk for the VM
    ls -ltrh | grep delta.vmdk
    
  5. If it looks something like this, you're good to go. Notice how, of the 5 snapshot deltas, only the 0x5.delta.vmdk actually has data and the others are empty?
    [Screenshot: delta VMDK listing where only one delta file contains data]
  6. If it looks like this, you’re going to have to use Resolution 2. Notice how multiple of the snapshot deltas hold data?
    [Screenshot: delta VMDK listing where multiple delta files contain data]
  7. Create a new folder in the VMs folder.
    mkdir ./tmp
    
  8. Move all of the EMPTY snapshot data to the ./tmp folder. DO NOT DELETE IT. If the remaining steps do not work, we have to restore these files.
    mv VM_1-000001* ./tmp
    
  9. Once all of the empty snapshots are moved and you are left with your base disk and your last snapshot, open the VMware vSphere Client and create a new VM. Make it identical to the old VM; however, DO NOT add a disk. Remove the disk VMware wants to create, and select "Add existing disk." Select the snapshot disk, NOT the base disk, and attach it to the VM. Do not select "Power On VM after Creation."
  10. Once creation is complete you will notice that VMware will show "Virtual machine disks need consolidation." Right-click the VM and choose to consolidate the disks (or use the PowerCLI sketch after this list). Once consolidation is complete, boot the VM and verify functionality. You should be all set!
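
If you would rather kick off the consolidation from PowerCLI than from the client, the managed object exposes a consolidate method; the VM name below is a placeholder.

# Trigger disk consolidation on the rebuilt VM.
(Get-VM -Name "RebuiltVM").ExtensionData.ConsolidateVMDisks()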

Resolution 2:

If your VM is not recoverable with Resolution 1, your alternate resolution is to use VMware vCenter Converter Standalone to convert a "powered-on Windows / Linux machine." So yes, you are doing a P2V-style migration of a virtual machine to a new virtual machine.

This isn't the best solution in the world, and it's essentially the same as taking and restoring an OS-based backup instead of a VM-based one. However, it allows your VM to stay online during the entire operation, which could be a consideration for those who have very small maintenance windows and may not be able to do a full consolidation in that window.