Ostinato throughput on a 40G NIC
Update (2021): Ostinato now supports 10G, 25G and 40Gbps line-rate traffic using the Turbo Transmit add-on.
A few weeks ago I attended the DPDK Summit in Bangalore, where M Jay from Intel loaned me a dual-port 40G interface to enable Ostinato to become a high-speed packet generator using DPDK. Thanks M Jay and Intel!
If you’ve been following Ostinato for a long time, you might recall I had done a prototype of Ostinato with DPDK back in 2014 which won the 6Wind DPDK Design contest. That work was done on a VM using VNICs instead of physical NICs and the lack of the latter (amongst other things) meant I never productized that code.
Before I (re)start the DPDK work on Ostinato, I wanted to get some baseline measurements with the existing libpcap-based code. This blog post presents those results.
You can also look at the results for a 1G port for Linux, MacOS, Live ISO.
I configured the same single stream as in previous tests -
- Protocols: Mac, Ethernet II, IPv4, UDP, Pattern
- Source and Destination Mac addresses populated with the actual Mac addresses of the source and sink
- Source and Destination IP addresses populated with the actual IP addresses of the source and sink
- Packets/sec: 0
- Number of packets: 10 (default)
- After this stream: Goto first
One difference from the previous tests is that I’m using a single host with a dual-port NIC with the ports connected back to back. Ostinato packets sent on port 1 are received back on port 2 on the same host.
Also, I’m adding two jumbo packet sizes to this test - 4096 bytes and 9018 bytes.
Performance figures
Here are the numbers -
Packet Size | Send Rate (Kpps) | Recv Rate (Kpps) | Send Rate (Mbps) | Recv Rate (Mbps) |
---|---|---|---|---|
64 | 755 | 630 | 507 | 423 |
128 | 753 | 640 | 892 | 758 |
256 | 738 | 635 | 1,630 | 1,402 |
512 | 732 | 636 | 3,115 | 2,707 |
1024 | 725 | 642 | 6,055 | 5,362 |
1518 | 725 | 639 | 8,920 | 7,862 |
4096 | 574 | 566 | 18,901 | 18,637 |
9018 | 225 | 225 | 16,268 | 16,268 |
The Mbps rate was calculated using the formula below, which accounts for the Ethernet line overhead of 20 bytes (7-byte preamble + 1-byte SFD + 12-byte inter-packet gap) -
Rate (in Mbps) = (PacketSize + 20) * KppsRate * 8 / 1000
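As a quick sanity check, the formula above can be written as a small Python helper. This is only a sketch: the function and constant names are my own for illustration, and the sample inputs are rows taken from the table.

```python
# 7-byte preamble + 1-byte SFD + 12-byte inter-packet gap
ETHERNET_OVERHEAD_BYTES = 20


def wire_rate_mbps(packet_size: int, kpps: float) -> float:
    """Return the on-the-wire rate in Mbps for a packet size (bytes) and a rate in Kpps."""
    return (packet_size + ETHERNET_OVERHEAD_BYTES) * kpps * 8 / 1000


if __name__ == "__main__":
    # (packet size in bytes, send rate in Kpps) taken from the results table
    for size, kpps in [(64, 755), (1518, 725), (9018, 225)]:
        print(f"{size:>5} bytes @ {kpps} Kpps = {wire_rate_mbps(size, kpps):,.0f} Mbps")
```

Rounding the result to the nearest integer reproduces the Mbps columns in the table.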
Specs
Software
- Ostinato 0.9 - for Linux (Ubuntu 16.04, 64-bit)
- Ubuntu 16.04.3 LTS
Hardware
- Processor: Xeon E5-2620v4 @ 2.10GHz (8-core)
- RAM: 32GB
- NIC: Intel XL710 40Gbps dual-port PCIe 3.0, x8
Observations
- The first thing that jumped out was the large fluctuation in the send rate while transmitting, e.g. for 256-byte packets, I saw rates ranging from 635Kpps to 728Kpps. This is a very wide range and I don’t have an explanation for it. For all packet sizes, the results above use the lower end of the range seen.
- The second thing that jumps out from the results table is that the receive rate gets capped at ~640Kpps even if the send rate is higher. Again, I don’t have an explanation for this.
- The next surprising thing is that the maximum bit rate is achieved with 4096-byte packets instead of 9018-byte packets.
- Although it’s an 8-core CPU, only two cores reach 100% CPU during the test - this is expected since for a single port, Ostinato doesn’t use more than one core and we have both the transmit and the receive port on the same host. The first core with 100% CPU utilization (presumably the one handling the tx port) has 15% user and 85% system usage - indicating that most of the time is spent in the kernel driver. The second core had 100% utilization in softirq - presumably the one handling the receive port.
- RAM usage not measured
- One heartening observation is that Ostinato (even without DPDK) can be used as a 10G traffic generator - if you can use jumbo frames.
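
To put these observations in context, here is a rough Python sketch that compares the measured send rates with the theoretical 40G line rate and lists the packet sizes that reach 10G. The measured values are copied from the results table; the function names, the 40,000 Mbps link speed constant and the 10,000 Mbps threshold are my own choices for illustration.

```python
# Send-side rate in Kpps for each packet size (bytes), copied from the results table.
MEASURED_SEND_KPPS = {
    64: 755, 128: 753, 256: 738, 512: 732,
    1024: 725, 1518: 725, 4096: 574, 9018: 225,
}

OVERHEAD_BYTES = 20   # preamble + SFD + inter-packet gap
LINK_MBPS = 40_000    # nominal 40G link speed (assumed constant for this sketch)


def wire_mbps(size, kpps):
    """On-the-wire rate in Mbps, using the same formula as the results table."""
    return (size + OVERHEAD_BYTES) * kpps * 8 / 1000


def line_rate_kpps(size, link_mbps=LINK_MBPS):
    """Theoretical maximum packet rate in Kpps for a given packet size."""
    return link_mbps * 1000 / ((size + OVERHEAD_BYTES) * 8)


def sizes_reaching(target_mbps):
    """Packet sizes whose measured send rate meets or exceeds target_mbps."""
    return [
        size
        for size, kpps in sorted(MEASURED_SEND_KPPS.items())
        if wire_mbps(size, kpps) >= target_mbps
    ]


if __name__ == "__main__":
    for size, kpps in sorted(MEASURED_SEND_KPPS.items()):
        pct = 100 * kpps / line_rate_kpps(size)
        print(f"{size:>5} bytes: {kpps} Kpps measured, {pct:.1f}% of 40G line rate")
    print("Sizes reaching 10G:", sizes_reaching(10_000))
```

With the numbers from the table, only the two jumbo sizes (4096 and 9018 bytes) clear 10,000 Mbps; the 1518-byte result falls a little short at 8,920 Mbps.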