It is no secret that part of the network-processing load can be moved into the NIC itself. The well-known examples are TCP Chimney Offload, Receive Side Scaling (RSS), and Network Direct Memory Access (NDMA),
as well as TCP Segmentation Offload (TSO) and Large Receive Offload (LRO).
Good starting points for reading about them:
* Information about the TCP Chimney Offload, Receive Side Scaling, and Network Direct Memory Access features in Windows Server 2008
https://support.microsoft.com/en-us/help/951037/information-about-the-tcp-chimney-offload-receive-side-scaling-and-net
* Poor network performance or high network latency on Windows virtual machines (2008925)
https://kb.vmware.com/s/article/2008925
* Understanding TCP Segmentation Offload (TSO) and Large Receive Offload (LRO) in a VMware environment (2055140)
https://kb.vmware.com/s/article/2055140
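On Windows, the current offload state can be inspected and toggled with `netsh` and the NetAdapter cmdlets. A minimal sketch for troubleshooting; the adapter name "Ethernet" is an assumption and should be replaced with the name from `Get-NetAdapter`:

```powershell
# Global TCP settings, including the Chimney Offload and RSS state
netsh int tcp show global

# Per-adapter Large Send Offload (the Windows name for TSO)
Get-NetAdapterLso -Name "Ethernet"

# Temporarily disable LSO while troubleshooting
# (re-enable afterwards with Enable-NetAdapterLso)
Disable-NetAdapterLso -Name "Ethernet" -IPv4 -IPv6
```

Disabling offloads is a diagnostic step, not a recommendation: if performance improves with an offload off, the NIC driver or firmware is the usual suspect.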
Networking was improved in Windows Server 2016:
Virtual Machine Multi-Queue (VMMQ): a single VM can now be served by several hardware queues, which improves performance.
https://www.vmgu.ru/news/microsoft-windows-server-2016-hyper-v
Virtual Machine Multiple Queues (VMMQ), formerly known as Hardware vRSS, is a NIC offload technology that provides scalability for processing network traffic of a VPort in the host (root partition) of a virtualized node. In essence, VMMQ extends the native RSS feature to the VPorts that are associated with the physical function (PF) of a NIC including the default VPort.
VMMQ is available for the VPorts exposed in the host (root partition) regardless of whether the NIC is operating in SR-IOV or VMQ mode. VMMQ is a feature available in Windows Server 2016.
https://docs.mellanox.com/pages/viewpage.action?pageId=12007112
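VMMQ is controlled per virtual network adapter. A hedged sketch, assuming a Hyper-V host on Windows Server 2016 or later with a VMMQ-capable NIC; the VM name "VM01" and the queue-pair count are assumptions:

```powershell
# Check whether VMMQ is enabled on the VM's network adapter
Get-VMNetworkAdapter -VMName "VM01" |
    Select-Object Name, VmmqEnabled, VmmqQueuePairs

# Enable VMMQ with several queue pairs for that VM
Set-VMNetworkAdapter -VMName "VM01" -VmmqEnabled $true -VmmqQueuePairs 4
```

Whether the setting takes effect also depends on the physical NIC driver; the Mellanox page above lists the driver-side requirements.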
That said, things do not always work out of the box the way you would like, so it is worth reading:
Setting the Number of RSS Processors
https://docs.microsoft.com/en-us/windows-hardware/drivers/network/setting-the-number-of-rss-processors
Performance tuning for low-latency packet processing
https://docs.microsoft.com/en-us/windows-server/networking/technologies/network-subsystem/net-sub-performance-tuning-nics
Conservative RSS Profile assigns 2 CPUs when 1 RSS Queue is chosen (link)
and Broadcom's RSS and VMQ Tuning on Windows Servers
https://www.broadcom.com/support/knowledgebase/1211161326328/rss-and-vmq-tuning-on-windows-servers
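The RSS processor set and profile from the articles above map onto the `Set-NetAdapterRss` cmdlet. A sketch under the usual assumptions (adapter name, processor numbers, and profile choice all depend on the host's NUMA topology):

```powershell
# Current RSS configuration: profile, base/max processors, queue count
Get-NetAdapterRss -Name "Ethernet"

# Pin RSS to a processor range and move off the Conservative profile;
# BaseProcessorNumber 2 keeps CPU 0 free for other interrupt work
Set-NetAdapterRss -Name "Ethernet" -Profile NUMAStatic `
    -BaseProcessorNumber 2 -MaxProcessors 4
```

On hyperthreaded hosts, processor numbers are logical CPUs, so even numbers are typically used to land queues on distinct physical cores.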
For Hyper-V you should additionally look into Dynamic Virtual Machine Queue (dVMQ) and Dynamic Virtual Machine Multi-Queue (dVMMQ).
https://github.com/microsoft/SDN/commit/749427c97f6abaf12ac4ebe191d62978857ae9f6
https://www.chelsio.com/wp-content/uploads/resources/t6-100g-dvmmq-windows.pdf
Synthetic Accelerations in a Nutshell – Windows Server 2019
https://techcommunity.microsoft.com/t5/networking-blog/synthetic-accelerations-in-a-nutshell-windows-server-2019/ba-p/653976
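VMQ on the physical adapters backing the vSwitch has its own cmdlets, analogous to the RSS ones. A hedged sketch; the adapter name and processor range are assumptions:

```powershell
# VMQ state of the physical adapters (enabled, processor range, queues)
Get-NetAdapterVmq

# Spread VMQ interrupts across a processor range away from CPU 0
Set-NetAdapterVmq -Name "Ethernet" -BaseProcessorNumber 2 -MaxProcessors 8
```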
For VMware the same tasks fall under NetQueue, for vSphere 5.5, 6.5 and later.
NetQueue takes advantage of the ability of some network adapters to deliver network traffic to the system in multiple receive queues that can be processed separately, allowing processing to be scaled to multiple CPUs, improving receive-side networking performance.
https://docs.vmware.com/en/VMware-vSphere/5.5/com.vmware.vsphere.networking.doc/GUID-6B708D13-145F-4DDA-BFB1-39BCC7CD0897.html
https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.networking.doc/GUID-6B708D13-145F-4DDA-BFB1-39BCC7CD0897.html
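On the ESXi side, NetQueue is a VMkernel setting toggled from the ESXi shell. A sketch following the VMware docs above; note that changing it requires a host reboot to take effect:

```shell
# Check whether NetQueue is enabled in the VMkernel
esxcli system settings kernel list -o netNetqueueEnabled

# Disable NetQueue for troubleshooting (reboot required), re-enable with TRUE
esxcli system settings kernel set --setting="netNetqueueEnabled" --value="FALSE"
```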
Its configuration (enabling and disabling) is documented separately, as are the problems that existed in the past, in particular:
March 23, 2017
Receive Side Scaling is not functional for vmxnet3 on Windows 8 and Windows 2012 Server or later. This issue is caused by an update for the vmxnet3 driver that addressed RSS features added in NDIS version 6.30 rendering the functionality unusable. It is observed in VMXNET3 driver versions from 1.6.6.0 to 1.7.3.0.
The Windows Receive Side Scaling (RSS) feature is not functional on virtual machines running VMware Tools versions 9.10.0 up to 10.1.5
https://blogs.vmware.com/apps/2017/03/rush-post-vmware-tools-rss-incompatibility-issues.html
Followed by the fix:
An update for the vmxnet3 driver that addressed RSS features added in NDIS version 6.30 rendered the functionality unusable. NDIS 6.30 is supported in Windows 8, Windows Server 2012 and later.
https://kb.vmware.com/s/article/2149587
This article is also worth a read:
VMware Tools 10.2.5: Changes to VMXNET3 driver settings
It was finally resolved in mid-2017 with the release of VMware Tools 10.1.7. However, only vmxnet3 driver version 1.7.3.7 in VMware Tools 10.2.0 was recommended by VMware for Windows and Microsoft Business Critical applications.
A few months later, VMware introduced the following changes in vmxnet3 driver version 1.7.3.8:
* Receive Side Scaling is enabled by default;
* the default value of the Receive Throttle is set to 30.
https://virtualnomadblog.com/2018/04/04/vmware-tools-10-2-5/
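Inside a Windows guest it is easy to check which side of these fixes you are on. A minimal sketch; the affected vmxnet3 driver range quoted above was 1.6.6.0 through 1.7.3.0:

```powershell
# Driver version of the guest's adapters (look for the vmxnet3 entry)
Get-NetAdapter | Select-Object Name, InterfaceDescription, DriverVersion

# Whether RSS is actually active on the adapter
Get-NetAdapterRss
```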