
Wednesday, June 16, 2010

Intermittent ping timeouts on a Hyper-V cluster running on Dell R710

If you have worked with the Dell R710 before, you will know that it comes with an integrated quad-port Broadcom 5709C NIC. If you have faced the same problem I did in my project, please read the workaround below.

Scenario:-
1. Network adapter teaming is configured using Broadcom Advanced Control Suite (BACS) on the Broadcom 5709C NIC.

Problem:-
Intermittent replies when running a continuous ping (ping X.X.X.X -t) to a highly available virtual machine running in the Hyper-V cluster.

You will see "Request timed out" when performing a live migration between the 2 nodes, and the connection is only sometimes up. This is not acceptable in a production environment.
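To put a number on "intermittent", the output of the continuous ping can be saved to a file and summarized. The following is a minimal sketch, not part of the original troubleshooting: it assumes the English-language Windows ping output format ("Reply from ..." and "Request timed out."), and the function name `summarize_ping` is made up for illustration.

```python
import re

# Assumed format of a successful reply line from the English Windows ping.exe,
# e.g. "Reply from 10.0.0.5: bytes=32 time<1ms TTL=128".
REPLY = re.compile(r"^Reply from \S+: bytes=\d+ time[=<]\d+ms TTL=\d+")
TIMEOUT = "Request timed out."


def summarize_ping(lines):
    """Count replies, timeouts, and the longest run of consecutive timeouts.

    Lines that are neither a normal reply nor a timeout (headers, statistics,
    "Destination host unreachable") are ignored.
    """
    replies = timeouts = run = longest = 0
    for raw in lines:
        line = raw.strip()
        if line == TIMEOUT:
            timeouts += 1
            run += 1
            longest = max(longest, run)
        elif REPLY.match(line):
            replies += 1
            run = 0  # a reply ends the current outage
    return {"replies": replies, "timeouts": timeouts, "longest_outage": longest}


if __name__ == "__main__":
    # Illustrative sample only; the address is a placeholder.
    sample = [
        "Reply from 10.0.0.5: bytes=32 time<1ms TTL=128",
        "Request timed out.",
        "Request timed out.",
        "Reply from 10.0.0.5: bytes=32 time=2ms TTL=128",
    ]
    print(summarize_ping(sample))
```

Running the ping before, during, and after a live migration and comparing the `longest_outage` values gives a rough way to tell a brief migration blip from a sustained loss of connectivity.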

Steps taken to troubleshoot
1. Remove the existing Virtual Switch and recreate it.
2. Upgrade to the latest drivers from the Dell website and Broadcom
3. Refresh the settings using Failover Cluster
4. Install the latest patches for the operating system
5. Disable the following features on the NIC and the teaming adapter
  • Receive Side Scaling (RSS Offload)
  • TCP Offload (IPv4 and IPv6)
  • IP Checksum Offload
6. Apply related hotfixes such as KB974909 (for Windows Server 2008 R2) and KB981836 (for Windows Server 2003 VMs)


I still had the same problem until I tried the workaround below.

Workaround
As a final step, remove the adapter from the network adapter team and create a virtual switch. Once the adapter was removed from the team, the connection to the virtual machine became stable, with no intermittent ping timeouts.

I repeated the final step at 3 different sites, each configured with a 2-node cluster. Problem solved.

Strangely, another standalone server, a Dell 610, works fine with network adapter teaming.

For those who have encountered the same problem with network adapter teaming, Microsoft recommends removing the teaming feature and testing again. Guidance is available in the Microsoft Support Policy for NIC teaming with Hyper-V.

11 comments:

  1. Depending on the team type, the Broadcom software replaces the MAC address Hyper-V assigns to a VM with its own teamed MAC address. Obviously that won't follow the machine across nodes. To test, ping your VM from another machine on the same subnet, and then look at that machine's ARP table: it will show the host's MAC, not the guest's.

  2. Tested ping from another machine on the same subnet. Having the same issue. ARP is showing the host MAC as well. The switch FDB table shows the same result. :(

  3. Just found a similar issue reported here: http://social.technet.microsoft.com/Forums/en/winserverhyperv/thread/c4223e6f-65c3-4c59-aa6b-5fb70f0e5abf

  4. Anyone know a fix for this? I have the same issue with an R710 and the latest Broadcom/Dell driver/BACS.

    Tried all possible configs, always something wrong. I'm really thinking about removing teaming...

  5. Hi,

    Removing teaming is the option for me, unless you have an Intel NIC that you can team with the Broadcom NIC.

  6. Well, that's what I have done. It's almost amazing that the Broadcom driver can't work as expected. After the systematic blue screens (resolved by disabling TCP offloading) when I had more than one team, now there is the ARP problem and intermittent loss of connectivity.
    Broadcom devices are not new, nor is Hyper-V. I lost several days on my project just because of buggy drivers... and they still are...
    Definitely, buy Intel NICs!!!!!

  7. Hi everybody
    Just had a response from Broadcom: using the latest basp.sys driver (1.3.23) (the one included in BACS) from the Broadcom web site, all is working like a charm.
    Just one thing to know: I also had lots of problems with my switch config. Because each NIC of my team (SLB with no failover) is connected to a different switch, I needed to disable spanning tree on these specific switch ports.
    For now, our Hyper-V rocks... thanks to everybody

  8. Cool. Found a nice write-up about this issue. http://www.confusedamused.com/notebook/broadcom-nic-teaming-and-hyper-v-on-server-2008-r2/

  9. Here is a great walk-through I put together of how we made this work on an IBM 3650 M3 and an HS22 and the rest of our environment. If you want to trunk the port, create one managed VLAN and use this for host access. Then create a Broadcom untagged VLAN, set it up to allow trunking on all VLANs through SCVMM, and set the tagging on the Hyper-V side, assigning the untagged VLAN to the guest. The MAC address no longer stays on the host when the guest moves, and you can change the IP back and forth with no hitch. We tested this between a few different types and it works. Enjoy.

    http://cid-d4bc1bf3c9a3f101.office.live.com/edit.aspx/Hyper-V

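The ARP check suggested in the first comment can be roughly automated. This is a minimal sketch under stated assumptions, not anything from the thread: it parses the text printed by `arp -a` on Windows and reports whether the entry for the VM's IP carries the 00-15-5D prefix that Hyper-V uses for dynamically assigned guest MACs. A VM configured with a static MAC outside that range would be flagged incorrectly. The function names and the sample addresses are made up for illustration.

```python
import re

# Prefix Hyper-V uses for dynamically assigned guest MAC addresses.
# Assumption: the VM under test uses a dynamic (not static) MAC.
HYPERV_PREFIX = "00-15-5d"

# One "arp -a" table row: IPv4 address, dash-separated MAC, entry type.
ENTRY = re.compile(
    r"^\s*(\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"((?:[0-9a-fA-F]{2}-){5}[0-9a-fA-F]{2})\s+\w+"
)


def parse_arp(text):
    """Return {ip: mac} from the text output of Windows 'arp -a'."""
    table = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            table[match.group(1)] = match.group(2).lower()
    return table


def looks_like_guest_mac(table, vm_ip):
    """True if the VM's ARP entry has the Hyper-V prefix, False if it has
    some other MAC (possibly the host or team MAC), None if there is no entry."""
    mac = table.get(vm_ip)
    if mac is None:
        return None
    return mac.startswith(HYPERV_PREFIX)


if __name__ == "__main__":
    # Made-up sample output; in practice, paste the real output of "arp -a".
    sample = """
Interface: 192.168.1.10 --- 0xb
  Internet Address      Physical Address      Type
  192.168.1.20          00-15-5d-01-02-03     dynamic
  192.168.1.21          00-22-19-aa-bb-cc     dynamic
"""
    table = parse_arp(sample)
    for ip in ("192.168.1.20", "192.168.1.21"):
        print(ip, table[ip], looks_like_guest_mac(table, ip))
```

Ping the VM first so the entry is populated, then run the check from a machine on the same subnet. A `False` result for the VM's IP is consistent with the teaming software substituting its own MAC, as described in the first comment.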