Category: Performance



A standard tool of the *nix world is the top utility. The VMware world has esxtop.

This utility can offer insight into how your system is running. My first introduction to it came when I thought a performance problem was related to disk issues.

The utility runs on the host. You will need to SSH to the host to run it.

If you are using Windows, PuTTY can act as an SSH client. After you SSH into the host, simply enter:

esxtop

The display is similar to *nix top. It's a good idea to run it and explore to see what it can do. It's also a good way to get an idea of what your host is doing.

~ # esxtop
4:41:40am up 35 days 12:21, 224 worlds; CPU load average: 0.04, 0.04, 0.04
PCPU USED(%): 4.6 4.0 3.6 3.7 6.6 6.2 2.4 2.8 5.6 8.4 4.3 8.3 AVG: 5.0
PCPU UTIL(%): 4.9 4.4 4.0 4.1 7.0 6.5 2.8 3.2 5.9 8.7 4.6 8.5 AVG: 5.4

ID    GID NAME             NWLD   %USED    %RUN    %SYS   %WAIT    %RDY
1      1 idle               12 1151.31 1200.00    0.01    0.00 1200.00
1065893 1065893 win7-vm4        5   12.86   13.16    0.00  491.88    0.23
1107501 1107501 win7-vm8        5   16.42   16.37    0.03  483.32    0.23
1065893 1065893 win7-vm4        5    4.10    4.04    0.05  495.65    0.23
1188816 1188816  win7-vm10       5    3.61    3.57    0.03  496.07    0.28
288571 288571 win7-vm3        5    3.59    3.54    0.02  496.19    0.19
1107487 1107487 win7-vm7        5    3.53    3.49    0.01  496.20    0.23
877747 877747 win7-vm2        5    3.47    3.46    0.00  496.19    0.27
1218138 1218138 win7-vm11       5    3.41    3.38    0.00  496.28    0.26
1636794 1636794 esxtop.5132848      1    0.62    0.62    0.00   99.36    0.00
887    887 hostd.5077         17    0.21    0.21    0.00 1699.43    0.10
2      2 system              8    0.11    0.11    0.00  799.77    0.01
17     17 vmkapimod           9    0.05    0.05    0.00  899.81    0.00
1177   1177 vmware-usbarbit     2    0.03    0.03    0.00  199.92    0.01
1636790 1636790 dropbearmulti.5     1    0.01    0.01    0.00   99.97    0.00
7      7 helper             58    0.01    0.01    0.00 5799.12    0.01
1585   1585 sfcb-ProviderMa     4    0.01    0.01    0.00  399.92    0.00
610    610 vmkiscsid.4794      2    0.01    0.01    0.00  199.95    0.00
8      8 drivers            10    0.01    0.01    0.00  999.84    0.00

I wanted to see what storage was doing, so I simply pressed “d” to switch to the disk adapter view.

ADAPTR PATH                 NPTH   CMDS/s  READS/s WRITES/s MBREAD/s MBWRTN/s
vmhba0 -                       1     0.00     0.00     0.00     0.00     0.00
vmhba1 -                       2     0.40     0.00     0.40     0.00     0.00
vmhba32 -                       0     0.00     0.00     0.00     0.00     0.00
vmhba33 -                       0     0.00     0.00     0.00     0.00     0.00
vmhba34 -                       0     0.00     0.00     0.00     0.00     0.00
vmhba35 -                       0     0.00     0.00     0.00     0.00     0.00
vmhba36 -                       0     0.00     0.00     0.00     0.00     0.00
vmhba37 -                       0     0.00     0.00     0.00     0.00     0.00
vmhba38 -                       0     0.00     0.00     0.00     0.00     0.00

There was not much there, which was not what I expected, so I moved on to other areas.
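
Other views are a single keypress away: “c” for CPU, “m” for memory, “n” for network, “u” for disk devices, and “v” for per-VM disk activity; “h” brings up the help screen. esxtop can also run in batch mode to capture counters to a file for later analysis. A minimal sketch, where the delay, iteration count, and output file name are just example values:

~ # esxtop -b -d 5 -n 60 > /tmp/esxtop-capture.csv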

The VMware Knowledge Base has an article which goes into greater detail.

Jumbo Frames and Virtual Machines


One of the things I have been meaning to try with virtual machines is jumbo frames. It would be interesting to see what they do for performance, but my networks at work are not ready for them. They will be eventually, but not at the moment. This post is simply to log information for when we are ready to configure the network to use them.

NOTE: This post will be a work in progress. I started it with the intention of setting jumbo frames up on my network, but as with any job, schedules change and this will be done at a later time. For now it's meant to hold information I have found and offer something to discuss. I will update it from time to time.

Why even use Jumbo Frames?

The simplest explanation is that fewer frames are sent across a network, which in turn means fewer CPU cycles. When it's time to send data, a unit is assembled, and its TCP/IP headers are read by the sending and receiving devices. Each device touched along the way reads the frame and adds to the frames and packets to get them to the final receiver. This process consumes CPU cycles and bandwidth.

Using jumbo frames means fewer frames are sent across the network. A single jumbo frame is about 9 kilobytes; the default frame is about 1.5 kilobytes, so one jumbo frame eliminates the need for six default frames. Fewer frames mean fewer CPU cycles and less bandwidth overhead.

To fill a gigabit Ethernet pipe, you would need more than 80,000 standard frames per second. With jumbo frames, you need only around 14,000. This is a significant reduction in CPU cycles and overhead, so network performance increases.
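
The rough arithmetic behind those numbers (ignoring inter-frame gaps and header overhead):

1 Gbps ≈ 125,000,000 bytes per second
125,000,000 / 1,500 bytes per frame ≈ 83,000 frames per second
125,000,000 / 9,000 bytes per frame ≈ 14,000 frames per second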

What is needed for Jumbo Frames

Before you attempt to enable them, you have to verify your network is able to handle them. First and foremost, you have to be running gigabit Ethernet. NICs and switches have to support them as well. You will have to do the work or coordinate the effort, because you simply can't enable them on your hosts and VMs while the network switches are not configured to use them.

Another issue is how each vendor configured them. Are they the same size? Are all the devices in the middle using the same size? As you can see, this is not a task to take lightly.

Things to consider:

  • You need gigabit Ethernet.
  • Measure the devices on your network, as your frames will be limited by the device with the smallest supported size.
  • Fragmentation will only happen at Layer 3.
  • Layer 2 switches will not fragment frames. They either forward or will drop them.
  • Size adjustments will only happen at TCP level.
  • Know your applications. Low latency applications can be hurt by jumbo frames.
  • Know your devices. Jumbo frames will require computing power.

Check for jumbo frame capability

I don’t have access to my network devices. My network people report they are all capable. I would verify this first, because if your switches can’t handle jumbo frames, your whole effort ends there; it means new equipment, and network gear can be rather expensive.
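
Once the switches are supposedly ready, one way to sanity-check the path from an ESXi host is vmkping with the don't-fragment flag. The 8972-byte payload leaves room for the IP and ICMP headers of a 9000-byte frame, and the target address here is just a placeholder:

~ # vmkping -d -s 8972 <ip-of-another-host-or-storage-target>

If the replies come back, the path between those two points can carry jumbo frames; if you get fragmentation errors, something in the middle is still at the default size.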

Virtual Machine Host

Your virtual machines may be able to run them, but the host has to as well.

VMware ESXi:

The host's MTU setting can be checked over an SSH connection to the host.

login as: root
root@host's password:

You have activated Tech Support Mode.
The time and date of this activation have been sent to the system logs.

VMware offers supported, powerful system administration tools.  Please
see www.vmware.com/go/sysadmintools for details.

Tech Support Mode may be disabled by an administrative user.
Please consult the ESXi Configuration Guide for additional
important information.

~ # esxcfg-vswitch -l
Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0         64          9           64                1500    vmnic0,vmnic1

  PortGroup Name        VLAN ID  Used Ports  Uplinks
  VM Network            0        4           vmnic0,vmnic1
  VMkernel              0        1           vmnic0,vmnic1
  Management Network    0        1           vmnic0,vmnic1
~ #

As you can see, the MTU is set to the default size of 1500. To increase it, you can do the following:

esxcfg-vswitch -m 9000 vSwitch0

This will set the MTU to 9000. Verify the change by issuing:

~ # esxcfg-vswitch -l
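
If the change took, the MTU column should now read 9000 instead of 1500, along the lines of the earlier listing (abbreviated here):

Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
vSwitch0         64          9           64                9000    vmnic0,vmnic1

Keep in mind the vSwitch is only half of the host side; depending on the ESXi version, the VMkernel port may need its MTU raised separately (esxcfg-vmknic -l shows its current MTU).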

 

XenServer:

I have not had a chance to set this up yet.

 

Setting Jumbo Frames on a VM

The virtual machine will need to have jumbo frames set as well. It defaults to an MTU of 1500.

I do have a question about whether the VM itself should be set for jumbo frames, or whether setting only the host serves the virtual machines best. I do not know at this point, as I have not been able to test.

Windows 7

Windows 7 defaults to an MTU of 1500. You can use the netsh utility to change this. First, let's see what exists by entering:

netsh interface ipv4 show subinterfaces

Of course, this assumes you are using IPv4 rather than IPv6. The output will look something like:

C:\>netsh interface ipv4 show subinterfaces

   MTU  MediaSenseState   Bytes In  Bytes Out  Interface
------  ---------------  ---------  ---------  -------------
4294967295                1          0   26497571  Loopback Pseudo-Interface 1
  1500                5          0     854768  Local Area Connection 2

If jumbo frames are set on your switches and host, you can set them on the VM (using the interface from the example above) by entering:

netsh interface ipv4 set subinterface "Local Area Connection 2" mtu=9000 store=persistent

You can verify the change by using the show subinterfaces command again.

If all is in order, your network connection will still be available.

TEST THE SETUP!

Jumbo frames are not something to enable and “walk away.” You should test your setup and verify they work and they are giving you the desired increase in performance. If they are not working, you may have to disable them and return to the standard MTU of 1500 at the VM level.
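
From inside the Windows 7 VM, a quick test is ping with the don't-fragment flag and a jumbo-sized payload; 8972 bytes leaves room for the IP and ICMP headers, and the target address is a placeholder:

C:\>ping -f -l 8972 <ip-of-a-jumbo-capable-host>

If you get replies, jumbo frames are passing end to end. If you see "Packet needs to be fragmented but DF set," something in the path is still at 1500.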

Information Links

A nice small explanation of Jumbo Frames

VMWare entry for enabling Jumbo Frames on Host

Disable TCP Chimney offload


When it comes to the virtualization of Windows 7, there are performance considerations. There are options enabled by default which make sense for laptops and even desktops. A VM, however, may not benefit and may in fact see a performance loss.

One such feature is TCP Chimney offload. TCP Chimney offload is designed to reduce CPU utilization and increase network throughput by moving TCP processing tasks to the network hardware, freeing the server's CPU for other tasks. This makes sense if you consider the available processors, RAM, and types of tasks running. Chances are a VM is only performing a certain task, and it might even be network bound, considering how many VMs are running on the host.

If your VM has access to fast processors and enough RAM, you might see an increase in performance by disabling it.

You can verify whether it's running by using the netsh utility:

C:\>netsh int tcp show global
 Querying active state...
TCP Global Parameters
 ----------------------------------------------
 Receive-Side Scaling State          : enabled
 Chimney Offload State               : automatic
 NetDMA State                        : enabled
 Direct Cache Acess (DCA)            : disabled
 Receive Window Auto-Tuning Level    : disabled
 Add-On Congestion Control Provider  : none
 ECN Capability                      : disabled
 RFC 1323 Timestamps                 : disabled

To disable it, you would enter:

C:\> netsh int tcp set global chimney=disabled

You can verify the change by using the show global command again:

C:\>netsh int tcp show global
 Querying active state...
TCP Global Parameters
 ----------------------------------------------
 Receive-Side Scaling State          : enabled
 Chimney Offload State               : disabled
 NetDMA State                        : enabled
 Direct Cache Acess (DCA)            : enabled
 Receive Window Auto-Tuning Level    : disabled
 Add-On Congestion Control Provider  : ctcp
 ECN Capability                      : disabled
 RFC 1323 Timestamps                 : disabled

Monitor the change and see if you have an improvement. If you don't see one and want to return it to its original state (automatic, per the first output above), simply enter:

C:\> netsh int tcp set global chimney=automatic

As mentioned, it is on by default, but you may not need it.

If you would like to read more about it, the following link will give you more information. Granted, it mentions Windows 2008, but the information is still valid.

TCP Chimney and 2008

Windows 7 and Compound TCP


TCP/IP has experienced many changes due to its growth in use over the years. Some features help with networking in general while posing a hindrance to performance in a virtual machine.

One feature that is disabled by default in Windows 7 is “Compound TCP.” CTCP is more aggressive with the TCP send window; it attempts to increase throughput by monitoring delay and packet loss.

You might see an increase in performance if you enable this feature.

As always, test a change like this. Run a test job a couple of times first to get a baseline run time.

You will need administrative-level access, and the netsh utility is used to make this change:

netsh int tcp set global congestionprovider=ctcp
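
To confirm the change took, you can run show global again, or filter the output with findstr (the Windows built-in equivalent of grep); it should report something like:

C:\>netsh int tcp show global | findstr /i "Congestion"
 Add-On Congestion Control Provider  : ctcp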

If you find it doesn't do anything, you can go back to the system default (none) with the following command:

netsh int tcp set global congestionprovider=none

So far CTCP has helped, and in a couple of cases it has contributed to significantly reduced run times.

Disable Features in Windows 7


The easiest way to improve performance is to free up resources and disable features which are not needed. There are many things installed on Windows 7 which are not really needed on a virtual machine running in a lab or data center. A user-oriented virtual machine might need some of them, so define the purpose of the virtual machine before disabling features.

Items to consider:

  • Games – Internet Games – This is a no-brainer. Much as we like games, they serve no purpose here and take up resources. Disable them if they are not already disabled.
  • Indexing Service – There really isn't a need to run this in a lab or data center VM; disable it.
  • Internet Explorer – This is a tough one. Security might improve with its elimination, and it might also preserve resources by keeping users from running the browser just because it's available. However, you might need it for debugging purposes. I left mine in place for now.
  • Internet Information Services – FTP Server – If you find this running and don’t need it, disable it.
  • Internet Information Services Hostable Web Core – If you find this running and don’t need it, disable it.
  • Media Features – This is often overlooked. There's a pretty good chance this is not needed, so disable it.
  • Microsoft .NET Framework 3.5.1 – This one could be disabled if it turns out you don't need it. A lab or data center VM might need it, so check before you disable it.
  • Services for NFS (Network File System) – This is not a normal install; check with the owners before disabling it.
  • Subsystem for UNIX-based Applications – See above.
  • Windows Gadget Platform – This is good for users. Probably not needed for a lab or a data center. Disable it.
  • Windows Search – The search feature has mixed views. Some like it; some hate it. In the case of a VM, disable it.
  • XPS Services – This feature is good for the users. Probably not needed for a back room VM.
  • XPS Viewer – Might or might not be needed. Some reporting systems use it.

To disable OS features from the Control Panel (a command-line alternative is sketched after the list):

  1. Start the Programs and Features Control Panel
  2. Off to the left, you will see “Turn Windows features on or off.”
  3. Select it to bring up the list of options.
  4. Deselect the options you want to disable.
  5. Select OK and wait for the uninstall process to finish.
  6. Reboot the system when you get a chance.
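
If you prefer to script this instead of clicking through the Control Panel, the built-in DISM tool can list and disable the same features from an elevated command prompt. The exact feature names vary by edition, so confirm them with the first command; WindowsGadgetPlatform below is just an example:

C:\>dism /online /get-features /format:table
C:\>dism /online /disable-feature /featurename:WindowsGadgetPlatform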

Performance for virtual machines can be improved through many changes. An often overlooked consideration is networking.

Network slowdown can occur if your router doesn’t support TCP Window Scaling. This is easy to check.

1) You will need a command window with administrative access, either through logon privilege or by starting it with administrative access (Run as administrator).

Netsh is a command-line scripting utility which allows you to modify the network configuration while the system is running. This will help you determine whether window auto-tuning is in use.

2) Enter: netsh interface tcp show global

(The output is the same TCP Global Parameters listing shown in the TCP Chimney offload section above.)

Look at Receive Window Auto-Tuning Level. This value can be changed, and there are a few options. If you don't know what to use, start with disabled; you can change it later on. There are five choices in all:

  • disabled – Fix the receive window at its default value.
  • highlyrestricted – Allow the receive window to grow beyond the default, but very conservatively.
  • restricted – Allow the receive window to grow somewhat beyond the default.
  • normal – Allow the receive window to grow to accommodate most scenarios.
  • experimental – Allow the receive window to grow for extreme scenarios.

3) Enter: netsh interface tcp set global autotuninglevel=disabled

4) Reboot the VM and see if it improves. If not, try a different setting.


An often overlooked aspect of virtualization is performance. Many times people set up a host, virtualize a couple of machines, and walk away. You know you are guilty of it. Maybe not all the time, but it happens, and we usually only go back when the users complain about the performance.

As we have become more power conscious, one of the ways to conserve power has been to create processors with a core parking ability, which basically consolidates work onto the fewest number of cores and then shuts down the unused ones. In many cases this is not an issue, as it centers on threaded applications; if your applications are not threaded, then you won't gain much. However, for virtualization this could help with making use of cores. I observed a few of the cores at 100% while others were not in use. After disabling CPU parking, I noticed the workload was spread out.

This is what I did to disable CPU parking on my Windows 7 VMs.

You will need regedit to change a few keys. There will be one for each power plan on the system, so remember to use the F3 key to continue the search after you update each key.

You will want to search for the key: 0cc5b647-c1df-4637-891a-dec35c318583

Once the search finds the key, look for a value called ValueMax.

This is what tells the computer that all cores can be parked.

Change the value to 0.

You will need to continue searching through the registry for other instances. On average, you will only need to change two entries.

After you are finished and exit regedit, give the VM a reboot.
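
If you would rather avoid regedit, powercfg can usually reach the same setting. This is a sketch that assumes the GUID above corresponds to the processor power sub-group's minimum-cores setting; setting it to 100 tells Windows to keep 100% of the cores unparked. Verify the behavior on your own build:

C:\>powercfg -setacvalueindex scheme_current sub_processor 0cc5b647-c1df-4637-891a-dec35c318583 100
C:\>powercfg -setactive scheme_current

As with the registry method, reboot afterwards and check Resource Monitor to confirm that cores no longer show as parked.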