For a customer I placed a SQL server, which has a disk size of about 900 MB, on a “cheap” iSCSI SAN with a dual 10 Gb link. This was not the only virtual machine on the iSCSI hardware: there were about 13 LUNs defined, each with about 6 virtual machines. I doubted whether this machine was a candidate for this particular iSCSI solution, but measurements of the environment showed me that it was possible.
It seemed to go well, but after a few days users started complaining and I saw this graph (see below)! Something changed on January 17th. From that day on, the SQL server has been heavily loaded with import jobs that run every night. Because the customer wants to have the latest production data for testing and training, the virtual machine's characteristics changed dramatically due to this adjustment!
As you can see, the write latency of that LUN went sky high (an average above 200 ms) because of the import jobs on the SQL server. So it was time to move this virtual machine back to FC storage. The machine could not be powered off. The original move from FC to iSCSI took about 2 hours, but moving the machine back took 10 hours and 4 minutes, about 5 times longer. That is the longest live vMotion I have ever experienced! The machine was heavily used during the vMotion.
The SQL server is now running from FC storage with the same workload it had on iSCSI when things went wrong, and even during the storage vMotion the write latency did not reach 20 ms. The average write latency went from above 200 ms to below 3 ms!
An average latency below 10 ms is very good most of the time. (Why most of the time? Because you should always listen to the users to hear what they are experiencing.) When you get an average between 10 and 20 ms, you will likely want to look into it because it can be a problem. Above 20 ms you will most likely experience problems! In this case the SQL server was almost not responding and users were complaining. So to prevent a situation like this, know what your systems are doing and what changes are being made. A harmless-looking import job can have a huge impact!
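The rule of thumb above can be written down as a small check. This is only a sketch in Python: the function name and the status labels are my own invention, and putting exactly 20 ms in the middle band is an assumption. The sample values are illustrative, picked to match the "below 3 ms" and "above 200 ms" averages mentioned in this post.

```python
def classify_write_latency(avg_ms: float) -> str:
    """Map an average write latency (in ms) to a rough status label.

    Thresholds follow the rule of thumb from this post:
    below 10 ms is good, 10 to 20 ms deserves a look, above 20 ms hurts.
    The labels and the handling of exactly 20 ms are assumptions.
    """
    if avg_ms < 0:
        raise ValueError("latency cannot be negative")
    if avg_ms < 10:
        return "good"
    if avg_ms <= 20:
        return "investigate"
    return "problem"


# Illustrative averages: FC after the move back, iSCSI during the import jobs.
for label, avg_ms in [("FC", 3), ("iSCSI", 200)]:
    print(f"{label}: {avg_ms} ms -> {classify_write_latency(avg_ms)}")
```

A check like this is no replacement for listening to the users, but it is an easy way to flag a LUN before the complaints start.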
The screenshots are from the free monitoring tool from Veeam.