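1. After deploying the server, we enabled LLDP on it so that we could check whether any network cables or fibers between the server and the switch were plugged into the wrong ports. However, the server's copper NIC ports (Intel 700 series, X722 in this case) showed no LLDP neighbors at all, so we suspected a NIC configuration problem.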
[root@BCONEST-X86-MON02 ~]# lspci |grep net
18:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
18:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
3d:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.1 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.2 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.3 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
5f:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
5f:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
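lspci lists PCI addresses, while the later steps use interface names such as enp61s0f1, so it helps to see which interface sits on which address and which driver it uses. The following is a minimal sketch that reads this from sysfs. The function name list_nic_drivers and its optional directory argument are our own illustrative choices, not part of any tool.

```shell
# Print "<interface> <driver> <bus address>" for every interface that is
# backed by a hardware device. The optional argument (default /sys/class/net)
# is hypothetical and exists only so the function can be tried on a fake tree.
list_nic_drivers() {
    net_dir="${1:-/sys/class/net}"
    for dev in "$net_dir"/*; do
        [ -e "$dev/device/driver" ] || continue    # skip lo, bonds, VLANs
        iface=$(basename "$dev")
        driver=$(basename "$(readlink -f "$dev/device/driver")")
        addr=$(basename "$(readlink -f "$dev/device")")
        printf '%s %s %s\n' "$iface" "$driver" "$addr"
    done
}
```

Running list_nic_drivers | grep i40e should then show only the Intel 700 series ports, which are the ones affected here.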
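2. To narrow the problem down, we ran tcpdump on the affected interface. It captured only the LLDP frames sent by the server and none of the frames sent by the switch. We then checked the switch configuration and enabled debugging on the switch, which showed that the switch port was both sending and receiving LLDP frames. That pointed to the way the server NIC handles incoming LLDP frames.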
[root@BCONEST-X86-MON02 ~]# tcpdump -i enp61s0f1 |grep -i LLDP
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp61s0f1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:27:38.357788 LLDP, length 262: BCONEST-X86-MON02
11:28:08.401968 LLDP, length 262: BCONEST-X86-MON02
11:28:38.445474 LLDP, length 262: BCONEST-X86-MON02
11:29:08.489210 LLDP, length 262: BCONEST-X86-MON02
11:29:38.533460 LLDP, length 262: BCONEST-X86-MON02
11:30:08.579707 LLDP, length 262: BCONEST-X86-MON02
11:30:38.624087 LLDP, length 262: BCONEST-X86-MON02
11:31:08.668239 LLDP, length 262: BCONEST-X86-MON02
11:31:38.712726 LLDP, length 262: BCONEST-X86-MON02
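In the capture above every frame carries the server's own system name, which is what shows that nothing from the switch reaches the OS. A small helper can make that check less error-prone than reading the lines by eye. This is only a sketch: lldp_senders is a name we made up, and it assumes the one-line tcpdump format shown above and system names without spaces.

```shell
# Read tcpdump's one-line output on stdin and print one
# "<count> <system name>" line per LLDP sender.
# Assumption: the system name is the last whitespace-separated field.
lldp_senders() {
    awk '/ LLDP, length [0-9]+: / { n[$NF]++ }
         END { for (s in n) print n[s], s }' | sort -k2
}
```

It could be fed from a bounded capture such as tcpdump -i enp61s0f1 -nn -c 4 ether proto 0x88cc | lldp_senders, where 0x88cc is the LLDP EtherType. If the only name printed is the local hostname, the switch's frames are not being delivered.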
3. After a lot of searching, we found the cause in the Red Hat Knowledgebase, in the article "Intel X710 series NICs (i40e) do not receive LLDP frames":
Intel 700 series NICs run an LLDP agent in firmware that will process and “absorb” any LLDPDU frames received from the switch. The frames are therefore never visible to the OS.
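In other words, the firmware agent consumes every LLDP frame sent by the switch before the driver sees it, so tools running on the host, such as tcpdump and lldptool, never receive those frames.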
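Solution:
Red Hat provides two solutions.
① When the kernel version is kernel-3.10.0-957.el7 or later, you can run ethtool --set-priv-flags eth0 disable-fw-lldp on (with eth0 replaced by the actual interface name) to tell the NIC driver to turn off the built-in LLDP agent.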
ethtool --set-priv-flags <NIC name> disable-fw-lldp on    # general form
ethtool --set-priv-flags enp61s0f1 disable-fw-lldp on     # on this server
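After setting the flag it is worth confirming that the driver accepted it. ethtool --show-priv-flags prints the private flags of an interface, and the sketch below extracts the disable-fw-lldp value from that output. The helper name fw_lldp_flag is ours, and the "name : value" layout it parses is an assumption based on how ethtool usually prints these flags.

```shell
# Read `ethtool --show-priv-flags <iface>` output on stdin and print the
# value of disable-fw-lldp ("on" or "off"), or "absent" if the driver does
# not expose that flag. Assumes lines of the form "flag-name   : value".
fw_lldp_flag() {
    awk -F: '
        { gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2) }
        $1 == "disable-fw-lldp" { print $2; found = 1 }
        END { if (!found) print "absent" }
    '
}
```

For example, ethtool --show-priv-flags enp61s0f1 | fw_lldp_flag should print on once the first solution has taken effect. A result of absent suggests the driver is too old for this flag, in which case the second solution applies.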
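② If the kernel is older than that, or the first solution has no effect, the agent can be stopped through debugfs as shown below. Note that this setting is lost after a reboot.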
echo "lldp stop" > /sys/kernel/debug/i40e/<pci bus address>/command
echo "lldp stop" > /sys/kernel/debug/i40e/0000\:3d\:00.0/command    # port 0
echo "lldp stop" > /sys/kernel/debug/i40e/0000\:3d\:00.1/command    # port 1
for i in $(find /sys/kernel/debug/i40e/ -name command); do echo 'lldp stop' > "$i"; done    # write "lldp stop" to every i40e port in one pass
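The one-line loop above fails silently if debugfs is not mounted or a write is rejected. The following sketch does the same job but reports what happened for each port. The function name stop_fw_lldp and its optional directory argument are hypothetical. If /sys/kernel/debug is empty, debugfs may first need to be mounted with mount -t debugfs none /sys/kernel/debug.

```shell
# Write "lldp stop" to the debugfs command file of every i40e port and
# report the result per port. The optional argument (default
# /sys/kernel/debug/i40e) only exists so the function can be tried on a
# scratch directory.
stop_fw_lldp() {
    root="${1:-/sys/kernel/debug/i40e}"
    if [ ! -d "$root" ]; then
        echo "no such directory: $root (is debugfs mounted and i40e loaded?)" >&2
        return 1
    fi
    for cmd in "$root"/*/command; do
        [ -e "$cmd" ] || continue               # glob matched nothing
        port=$(basename "$(dirname "$cmd")")
        if echo 'lldp stop' > "$cmd"; then
            echo "stopped: $port"
        else
            echo "failed:  $port" >&2
        fi
    done
}
```

Because the change does not survive a reboot, the function has to be run again after each boot, as root.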
4. Check whether the LLDP neighbor information is now displayed correctly.
[root@ZJNB-PSC-P10F2-SPOD3-PM-OS01-BCONEST-X86-MON02 ~]# echo "lldp stop" > /sys/kernel/debug/i40e/0000\:3d\:00.0/command
[root@ZJNB-PSC-P10F2-SPOD3-PM-OS01-BCONEST-X86-MON02 ~]# lldptool -t -n -i enp61s0f1
Chassis ID TLV
    MAC: 00:01:7a:6a:02:15
Port ID TLV
    Ifname: gigabitethernet2/0/44
Time to Live TLV
    120
Port Description TLV
    dT:[BCONEST-X86-MON02]-eno4-bond0-10.194.220.2
System Name TLV
    ZJNB-PSC-P10F2-POD3-M-JR-4320-3&4
System Description TLV
    MyPower (R) Operating System Software Copyright (C) 2020 Maipu Communication Technology Co.,Ltd.All Rights Reserved.
System Capabilities TLV
    System capabilities: Bridge, Router
    Enabled capabilities: Bridge, Router
Management Address TLV
    IPv4: 10.0.0.40
    Ifindex: 4
Port VLAN ID TLV
    PVID: 1
Port and Protocol VLAN ID TLV
    PVID: 0, supported, not enabled
VLAN Name TLV
    VID 1200: Name VLAN1200
MAC/PHY Configuration Status TLV
    Auto-negotiation supported and enabled
    PMD auto-negotiation capabilities: 0x009b
    MAU type: 1000 BaseTFD
Power via MDI TLV
    Port class PD
    PSE MDI power not supported
    PSE pairs not controllable
    PSE Power pair: unkwown [0]
    Power class 1
Link Aggregation TLV
    Aggregation capable
    Currently not aggregated
    Aggregated Port ID: 0
Maximum Frame Size TLV
    9216
End of LLDPDU TLV
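Since the original goal was to verify cabling, only two values from this output are usually needed: the switch's system name and the switch port. The sketch below condenses the lldptool output to those two fields. The name lldp_neighbor_summary is our own, and the parsing assumes the layout shown above, where each TLV heading is followed by an indented value line.

```shell
# Read `lldptool -t -n -i <iface>` output on stdin and print
# "<switch system name> <switch port>" on one line.
# Assumption: the value of each TLV is on the line right after its heading.
lldp_neighbor_summary() {
    awk '
        /^Port ID TLV/     { want = "port"; next }
        /^System Name TLV/ { want = "sys";  next }
        want != "" {
            sub(/^[ \t]+/, "")                   # drop leading indentation
            if (want == "port") { sub(/^[^:]*: */, ""); port = $0 }
            else                { sys = $0 }
            want = ""
        }
        END { if (sys != "" || port != "") printf "%s %s\n", sys, port }
    '
}
```

Running lldptool -t -n -i enp61s0f1 | lldp_neighbor_summary for each interface gives a compact list that can be compared against the cabling plan.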
References:
Red Hat Knowledgebase, "Intel X710 series NICs (i40e) do not receive LLDP frames"