A technical paper describes how to troubleshoot throughput problems in unidirectional TCP data transfers. Resolving these problems when they appear on a vSphere/ESXi host can improve performance. The paper is aimed at developers, advanced administrators, and technical support specialists.
Taken from the introduction:
Data transfer over TCP is very common in vSphere environments. Examples include storage traffic between the VMware ESXi host and an NFS or iSCSI datastore, and various forms of vMotion traffic between vSphere datastores.
We have observed that even extremely infrequent TCP issues could have an outsized impact on overall transfer throughput. For example, in our experiments with ESXi NFS read traffic from an NFS datastore, a seemingly minor 0.02% packet loss resulted in an unexpected 35% decrease in NFS read throughput.
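To get a feel for why such a small loss rate can matter, the widely cited Mathis et al. approximation bounds steady-state TCP throughput by (MSS / RTT) · (C / √p), where p is the loss rate. The sketch below is not taken from the paper: the MSS and round-trip time are illustrative assumptions, and the model gives only a rough ceiling; it does not reproduce the 35% figure measured above.

```python
import math

def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate upper bound on TCP throughput in bits/s (Mathis et al. model)."""
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3 / 2)  # model constant for periodic loss, about 1.22
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))

# Assumed values for illustration: 1460-byte MSS and 1 ms RTT.
# The 0.02% loss rate is the one quoted in the text.
bound = mathis_throughput_bps(1460, 0.001, 0.0002)
print(f"{bound / 1e9:.2f} Gbit/s")
```

Under these assumed numbers the bound lands at roughly 1 Gbit/s, so on a faster link even a loss rate this small could become the limiting factor.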
In this paper, we describe a methodology for identifying TCP issues commonly responsible for poor transfer throughput. We capture the network traffic of a data transfer into a packet trace file for offline analysis. This packet trace is analyzed for signatures of common TCP issues that may have a significant impact on transfer throughput.
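As a rough illustration of the offline analysis step, the sketch below builds tshark commands that list the packets matching a few standard Wireshark `tcp.analysis.*` display filters. These filters are a generic starting point and are not necessarily the ones in the paper's profile; the trace file name `transfer.pcap` and the helper `tshark_command` are hypothetical.

```python
import shlex

# Standard Wireshark display filters for common TCP symptoms.
# This selection is an assumption, not the paper's own filter set.
SIGNATURES = {
    "retransmission": "tcp.analysis.retransmission",
    "fast_retransmission": "tcp.analysis.fast_retransmission",
    "duplicate_ack": "tcp.analysis.duplicate_ack",
    "zero_window": "tcp.analysis.zero_window",
}

def tshark_command(trace_path: str, display_filter: str) -> list[str]:
    """Build a tshark command that prints the frame number of each matching packet."""
    return ["tshark", "-r", trace_path, "-Y", display_filter,
            "-T", "fields", "-e", "frame.number"]

# Print one command per signature; "transfer.pcap" is a hypothetical trace file.
for name, display_filter in SIGNATURES.items():
    print(name, "->", shlex.join(tshark_command("transfer.pcap", display_filter)))
```

Each printed command can be run in a shell where tshark is installed; counting the output lines gives the number of packets that match that signature.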
The TCP issues considered include packet loss and retransmission, long pauses due to TCP timers, and Bandwidth Delay Product (BDP) issues. We use Wireshark to perform the analysis and provide a Wireshark profile to simplify the analysis workflow. We describe a systematic approach to identifying common TCP issues with a significant impact on transfer throughput, and recommend that engineers troubleshooting data transfer throughput include this methodology as a standard part of their workflow.
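For readers unfamiliar with the term, the bandwidth delay product is the amount of data that must be in flight to keep a link full: bandwidth multiplied by round-trip time. The sketch below is a minimal illustration using assumed link speed, RTT, and window values that do not come from the paper. It also shows the related limit that a window smaller than the BDP caps throughput at window / RTT.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth delay product: bytes in flight needed to fill the link."""
    return bandwidth_bps * rtt_s / 8

def window_limited_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Maximum throughput when at most one window of data is sent per round trip."""
    return window_bytes * 8 / rtt_s

# Assumed values for illustration: 10 Gbit/s link, 1 ms RTT, 65535-byte window.
link_bps = 10e9
rtt_s = 0.001
window = 65535

print(f"BDP: {bdp_bytes(link_bps, rtt_s) / 1e6:.2f} MB")
print(f"Window-limited throughput: "
      f"{window_limited_throughput_bps(window, rtt_s) / 1e6:.0f} Mbit/s")
```

With these assumed numbers the BDP is about 1.25 MB, while a 65535-byte window without window scaling would limit the transfer to roughly 524 Mbit/s.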
The paper includes troubleshooting workflows, reference tables, and steps to guide you through the analysis. The author also created a Wireshark profile with predefined display filters and I/O graphs to make the steps easier to follow.
For more information, please read Troubleshooting TCP Unidirectional Data Transfer Throughput: Packet Trace Analysis Using Wireshark.