Resolving ESXi 7.0 NIC connection issues on Supermicro X10SDV-4C-TLN2F motherboards
I have a few Supermicro X10SDV-4C-TLN2F motherboards that I’ve been setting up to run VMware ESXi on. The first board was fine, but when setting up another I could never get ESXi to acquire an IP over DHCP. The hardware was happy the connection was valid, and both Debian 11 and Windows Server 2019 were fine …but not ESXi:
I’d been using a slightly older version of ESXi (7.0U3D/19482537), so I tried a newer version (7.0U3g/20328353) that I’ve upgraded to on one of the other boards, but this didn’t help. Both of my boards were running the same BIOS version (2.1) and build date (2019-11-22), so that seemed an unlikely cause.
After more head-scratching, I came across Workaround and fix for intermittent Intel X552/X557 10GbE/1GbE network link-down outages on Xeon D-1500 Series. I didn’t have the exact boards they were talking about in the start of the post nor the intermittent issues mentioned, but the one I have does have an Intel X552 so I thought this was promising. After going through the whole post (and most of the more recent comments), it seemed like it might be a NIC firmware difference between the two of my boards. I’m using a basic gigabit switch too, and not a 10GbE one.
Investigating the current NIC & firmware
Using the console, we can find out more about the hardware. When connected directly (or using IPMI), Alt+F1 will switch to the console.
$ esxcli network nic list Name PCI Device Driver Admin Status Link Status Speed Duplex MAC Address MTU Description ------ ------------ ------- ------------ ----------- ----- ------ ----------------- ---- ----------- vmnic0 0000:03:00.0 ixgben Up Down 0 Half ac:1f:6b:12:f8:6e 1500 Intel(R) Ethernet Connection X552/X557-AT 10GBASE-T vmnic1 0000:03:00.1 ixgben Up Down 0 Half ac:1f:6b:12:f8:6f 1500 Intel(R) Ethernet Connection X552/X557-AT 10GBASE-T $ esxcli network nic get -n vmnic0 Advertised Auto Negotiation: true Advertised Link Modes: Auto, 1000BaseT/Full, 10000BaseT/Full Auto Negotiation: false Cable Type: Twisted Pair Current Message Level: -1 Driver Info: Bus Info: 0000:03:00:0 Driver: ixgben Firmware Version: 0x800005ad Version: 18.104.22.168 Link Detected: true Link Status: Up Name: vmnice0 PHYAddress: 0 Pause Autonegotiate: true Pause RX: true Pause TX: true Supported Ports: TP Supports Auto Megotiation: true Supports Pause: true Supports Hakeon: false Transceiver: Virtual Address: 00:50:56:56:62:7 Hakeon: None
On the board with issues, the firmware version is
0x800005ad. But the working
0x800003e7, which is curious.
A Temporary Solution
I then tried the suggestion from Bruno Zeidan, who had the same board:
esxcli network nic set --speed 1000 --duplex full -n vmnic0
This worked — and continues to do so after a reboot. Unfortunately, if the network cable is unplugged it won’t come back up again.
Upgrading the BIOS
My next thought was to try the new BIOS release. Supermicro put out 2.3 on 2021-06-04 and upgrading the BIOS is enough of a faff I didn’t bother previously.
I wrote a FreeDOS 1.3 LiveCD using Rufus to a USB drive and dropped in the BIOS Zip contents. I’d originally tried to use the FreeDOS bundled with Rufus, but this gave me an out of memory error (like mentioned in this blog post).
After booting back into ESXi after flashing completed, it did immediately get
an IP, which was encouraging. Post upgrade, the firmware version was
0x800005ad still and unplugging the cable ended up with the same situation as
Trying to flash the NIC
Knowing that a difference between NIC firmware versions was the most likely explanation for the different behaviour on each board, I was curious to see what would happen if I tried to flash them. I also wasn’t convinced about the hex values of the version number — Intel didn’t have those listed anywhere.
I got a copy of the most recent driver CD and moved the Intel NIC
BOOTUTIL over to the FreeDOS USB drive from earlier. Calvin Bui’s post about
flashing the NIC firmware enabled me to figure out what was needed. From
reading around, it seems like Supermicro likely have their own firmware
specifically for the onboard NIC, so I stuck to the available (two)
On the problematic board, it turned out that the version was
C:\BOOTUTIL\DOS\bootutil.exe Intel(R) Ethernet Flash Firmware Utility BootUtil version 22.214.171.124 Copyright (C) 2003-2017 Intel Corporation Type BootUtil -? for help Port Network Address Location Series WOL Flash Firmware Version ==== =============== ======== ======= === ================== ======= 1 AC1F6B12F86E 3:00.0 10GbE YES UEFI,PXE Enabled 2.3.58 1 AC1F6B12F86F 3:00.0 10GbE YES UEFI,PXE Enabled 2.3.58
But the working board was
C:\BOOTUTIL\DOS\bootutil.exe Intel(R) Ethernet Flash Firmware Utility BootUtil version 126.96.36.199 Copyright (C) 2003-2017 Intel Corporation Type BootUtil -? for help Port Network Address Location Series WOL Flash Firmware Version ==== =============== ======== ======= === ================== ======= 1 OCC47A95BE18 3:00.0 10GbE YES UEFI,PXE Enabled 2.3.53 1 OCC47A95BE19 3:00.0 10GbE YES UEFI,PXE Enabled 2.3.53
NVMUpdate is the tool to flash the NIC firmware and the easiest way to run
this was to create a Debian Live USB drive and copy over the versions
from the driver ISO images.
Once you’re booted into Debian, you can:
- Find the drive:
sudo fdisk -l,
- Create a mount point:
sudo mkdir /media/usb-drive,
- Then mount it, e.g.:
sudo mount /dev/sdb1 /media/usb-drive)
./nvmupdate64e would always give a device access error on all
available Supermicro versions (and the most recent Intel version available
from their site):
$ sudo ./nvmupdate64e Intel(R) Ethernet NVM Update Tool NVMUpdate version 188.8.131.52 Copyright(C) 2013 - 2023 Intel Corporation. WARNING: To avoid damage to your device, do not stop the update or reboot or poweroff the system during this update. Inventory in process. Please wait [...|******] Num Description Ver.(hex) DevId S:B Status === ================================== ============ ===== ====== ============== 01) Intel(R) Ethernet Connection N/A(N/A) 15AD 00:003 Access error X552/X557-AT 10GBASE-T Tool execution completed with the following status: Device not found. Press any key to exit.
Now I was stumped. The next best option is to get in touch with Supermicro support, but I’d love to hear from anyone who gets further than me!
For now, I’m using the temporary fix mentioned above as I need to get on and get it setup.