Wednesday, June 16, 2010

Intermittent ping timeouts on a Hyper-V cluster running on Dell R710

For those who have worked with the Dell R710 before, you will know that it comes with an integrated quad-port Broadcom 5709C NIC. If you have faced a problem similar to the one I hit in my project, please read the workaround below.

Scenario:-
1. Network adapter teaming is configured using Broadcom Advanced Control Suite (BACS) on the Broadcom 5709C NIC.

Problem:-
Intermittent ping results when running a continuous ping (ping X.X.X.X -t) against a highly available virtual machine running in a Hyper-V cluster.

You will encounter "Request timed out" when performing a live migration between the 2 nodes, while at other times the connection is up. This is not acceptable in a production environment.
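
To put numbers on how intermittent the connection is, the output of the continuous ping can be summarized with a short script. The following is a minimal sketch, not part of the original troubleshooting: it assumes the English Windows ping output format ("Reply from ..." and "Request timed out."), and the function name summarize_ping and the sample address 192.0.2.10 are hypothetical.

```python
import re

# Assumption: English-language Windows ping output.
# A successful line looks like "Reply from <ip>: bytes=32 time<1ms TTL=128".
REPLY = re.compile(r"^Reply from \S+: bytes=\d+ time[=<]\d+ms TTL=\d+", re.IGNORECASE)
TIMEOUT_PREFIX = "request timed out"


def summarize_ping(lines):
    """Count replies, timeouts, and the longest run of consecutive timeouts."""
    replies = 0
    timeouts = 0
    longest_outage = 0
    current_outage = 0
    for line in lines:
        text = line.strip()
        if REPLY.match(text):
            replies += 1
            current_outage = 0  # a reply ends the current outage
        elif text.lower().startswith(TIMEOUT_PREFIX):
            timeouts += 1
            current_outage += 1
            longest_outage = max(longest_outage, current_outage)
        # Any other line (headers, statistics, other errors) is ignored.
    return {"replies": replies, "timeouts": timeouts, "longest_outage": longest_outage}


# Illustrative sample only; 192.0.2.10 is a documentation address, not real data.
sample = [
    "Pinging 192.0.2.10 with 32 bytes of data:",
    "Reply from 192.0.2.10: bytes=32 time<1ms TTL=128",
    "Request timed out.",
    "Request timed out.",
    "Reply from 192.0.2.10: bytes=32 time=2ms TTL=128",
    "Request timed out.",
]

if __name__ == "__main__":
    print(summarize_ping(sample))
```

One way to use it would be to redirect the ping output to a file during a live migration (for example, ping X.X.X.X -t > ping.log) and pass the file's lines to summarize_ping. The longest_outage value counts consecutive lost pings, which makes it easier to compare runs with and without teaming.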

Steps taken to troubleshoot
1. Remove the existing virtual switch and recreate it.
2. Upgrade the drivers to the versions provided on the Dell website and by Broadcom.
3. Refresh the settings using Failover Cluster.
4. Apply the latest patches for the operating system.
5. Disable the following features on the NIC and the teaming adapter:
  • Receive Side Scaling (RSS Offload)
  • TCP Offload (IPv4 and IPv6)
  • IP Checksum Offload
6. Apply related hotfixes such as KB974909 (for Win2k8 R2) and KB981836 (for Win2k3 VM).


I still had the same problem until I tried the workaround below...

Workaround
As the final step, remove the adapter from the network adapter team and create a virtual switch. Once the adapter was removed from the team, the connection to the virtual machine was stable, with no intermittent ping timeouts.

I repeated the final step at 3 different sites, each configured with a 2-node cluster. Problem solved.

Strangely, another standalone server, a Dell 610, works fine with network adapter teaming.

For those who have encountered the same problem with network adapter teaming, Microsoft recommends removing the teaming and testing again. For guidance, see the Microsoft Support Policy for NIC teaming with Hyper-V.