Thursday, September 22, 2016

iperf3 and microbursts, part II

Previously, I talked about iPerf3's microbursts, and how changing the frequency of the timer could smoothen things out.

One of the first pieces of feedback I got, which lined up with some musings of my own, was that it might be better to calculate the timer interval from the smoothest possible packet rate:

PPS = Rate / Packet Size

This is, btw, what iPerf3 does automatically on Linux when the fq socket scheduler is available.  So this is really just seeing if we can fake it from user-land on systems that don't have it (like my OSX system).
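To make the arithmetic concrete, here is a minimal Python sketch of the calculation (not iPerf3's actual code, which is C). The function names are made up for illustration, and it assumes "1K" means a 1024-byte payload and that the rate counts payload bits only.

```python
def packets_per_second(rate_bps: float, packet_bytes: int) -> float:
    """PPS = Rate / Packet Size, with the packet size converted to bits."""
    return rate_bps / (packet_bytes * 8)


def pacing_interval_us(rate_bps: float, packet_bytes: int) -> float:
    """Timer interval, in microseconds, that sends one packet per firing."""
    return 1_000_000 / packets_per_second(rate_bps, packet_bytes)


if __name__ == "__main__":
    # 500Mbps with 1K (assumed 1024-byte) UDP payloads.
    print(round(pacing_interval_us(500_000_000, 1024), 3))
```

Under those assumptions, 500Mbps works out to roughly 61,000 packets per second, or an interval of about 16µs.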

Luckily, adding this to iPerf's code is pretty easy.

100Mbps


To recap, using a 1ms timer and 100Mbps with a 1K udp packet size results in the following:
Zooming in:

Switching from 1ms pacing to 819µs pacing (the results of the calculated pacing), nets:
And zooming in:

I should be careful here, because I'm quantizing this into buckets that are the same size as the timer.  I should probably be subdividing more, or much less, to get a better view of what's going on.  But I'm going to stick with the 1ms buckets for this analysis (for data presentation consistency).

500Mbps

I've only been showing 100Mbps results, but I really should document how it works at higher speeds, especially something closer to the line rate.  So here's what 500Mbps looks like through all these changes.

Stock:

1ms timer:

And then using a calculated 16µs timer:


Much, much smoother pacing.

However, there are still some big overshoots, and those are due to how the iperf3 red-light/green-light algorithm reacts to stumbles (or late-firing timers).  It sends more packets until it catches back up.  At the micro-scale, this isn't a big deal, but it can cause the tool to stick in "green-light" mode when testing through congested links where it can't actually maintain the desired rate.
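The catch-up behavior can be illustrated with a toy model. This is a simplified sketch, not iperf3's implementation: it assumes the sender gets a "green light" whenever its average rate so far is below the target, and keeps sending until it is no longer behind. The function name and the timer firing times are made up, and the 1250-byte packet size is chosen only so that the ideal interval at 100Mbps is a round 100µs.

```python
def packets_per_firing(rate_bps, packet_bytes, firing_times_us):
    """Count packets sent at each timer firing under a simple catch-up rule.

    At each firing the sender keeps sending while the bits sent so far are
    below what the target rate allows for the elapsed time (green light),
    and stops as soon as it has caught up (red light).
    """
    packet_bits = packet_bytes * 8
    bits_sent = 0
    sent_per_firing = []
    for t_us in firing_times_us:
        count = 0
        # Integer comparison of bits_sent against rate_bps * elapsed seconds.
        while bits_sent * 1_000_000 < rate_bps * t_us:
            bits_sent += packet_bits
            count += 1
        sent_per_firing.append(count)
    return sent_per_firing


if __name__ == "__main__":
    # The timer fires on schedule three times, then stumbles and fires
    # 300 microseconds late (at 700 instead of 400), then recovers.
    print(packets_per_firing(100_000_000, 1250, [100, 200, 300, 700, 800]))
    # prints [1, 1, 1, 4, 1]
```

The single late firing turns into a burst of four back-to-back packets, which is the kind of overshoot that shows up in the captures.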

I've set up a new branch on my GitHub fork to play around with this.  That includes capping the maximum timer frequency (with a command-line param to change it).  The cap is specified in µs, as that's what the POSIX API lets you use.
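As a rough illustration of what a cap like that implies (a sketch under assumptions, not the code in the branch): a cap on the timer frequency is the same thing as a minimum interval, and once the interval is clamped, each firing has to carry more than one packet to hold the rate. The function names and the 100µs cap below are hypothetical, and the interval is rounded to whole microseconds on the assumption that the timer only takes integer µs.

```python
def capped_interval_us(rate_bps, packet_bytes, min_interval_us):
    """Ideal one-packet-per-firing interval, clamped to a minimum (in µs)."""
    ideal_us = packet_bytes * 8 * 1_000_000 / rate_bps
    # Round to whole microseconds, then never go below the cap.
    return max(round(ideal_us), min_interval_us)


def avg_packets_per_firing(rate_bps, packet_bytes, interval_us):
    """Average number of packets each firing must send to hold the rate."""
    return rate_bps * interval_us / (packet_bytes * 8 * 1_000_000)


if __name__ == "__main__":
    # 500Mbps, 1024-byte packets, hypothetical 100µs minimum interval.
    interval = capped_interval_us(500_000_000, 1024, 100)
    print(interval, avg_packets_per_firing(500_000_000, 1024, interval))
```

With those made-up numbers the timer runs at 100µs instead of 16µs, and each firing sends about six packets on average, so the cap trades some smoothness for fewer timer wakeups.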

Now to get some captures from a Linux box with fq pacing, and show how well it performs.