Howdy,
- I have an ARM Cortex A9 CPU, the NXP iMX6, dual core,
- on which runs a bare bones Linux (kernel 4.1.44).
- I'm trying to receive data over its Gigabit ethernet interface.
- goal throughput is ~ 40 MByte/s (320 Mbit/s)
- my default UDP payload is ~ 1350 bytes, I've experimented with sizes from 300 to 16k Bytes
- the ARM board is connected, via one Gigabit switch, to my PC
- nothing else on the switch
- on the PC runs the UDP client, sending data continuously
The ARM runs the server, which binds a socket to a port and then basically only does
- while(true) { poll( socket, ...); recv( socket, ...); }
- there is some other stuff going on, like a loop-time jitter histogram and a crude data integrity check, but commenting that out makes no difference to the problematic behavior
I.e. I'm using the socket library as available on Linux.
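A minimal sketch of such a poll-then-recv loop body, under assumptions: the helper names `open_udp_socket` and `recv_with_timeout` and the -2 timeout return code are made up for illustration, and only the standard socket and poll calls are used.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to the given port on all
 * interfaces. Returns the file descriptor, or -1 on error. */
int open_udp_socket(uint16_t port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: wait up to timeout_ms for one datagram.
 * Returns the datagram size, -2 on timeout, or -1 on error. */
ssize_t recv_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    int ready = poll(&pfd, 1, timeout_ms); /* sleeps until data or timeout */
    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;

    return recv(fd, buf, len, 0); /* one datagram per call */
}
```

Calling `recv_with_timeout(fd, buf, sizeof buf, 1000)` inside `while (true)` gives the pattern described above; the 1 s timeout is an arbitrary placeholder.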
Now, according to what I've read, poll(..) blocks (in a non-busy manner) until there is data. I use it because it has timeout functionality.
Given that, I would not expect the CPU to be nearly maxed out (90..99 % for my process), which is unfortunately what I see. Unless that throughput really is too much for this ARM CPU.
But I doubt that, as iperf (2.0.5) eats "only" 50..56% CPU at a data rate of ~ 40 MB/s. I'm not sure what exactly it does, though; I looked briefly at the code and saw some fiddling with raw sockets, and I'm not sure I want to go that route... I have hardly any experience with network stuff.
Also, there is this strange effect of high packet loss, despite the short connection, while a second PC used as UDP receiver sees no packet loss.
I'd understand that if the CPU were at 100% all the time, but my process stays slightly below that, and iperf, which uses much less CPU, can still see 2..4% loss.
I then found some Linux settings, net.core.rmem_max and net.core.rmem_default, which I set to 8 MB each via sysctl, instead of the some KB they had before.
Then the packet loss went to zero (still around 90+ % CPU load).
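The sysctl values are system-wide. A single socket can also request its own receive buffer with SO_RCVBUF, and that request is capped by net.core.rmem_max. A minimal sketch, assuming a hypothetical helper name `set_rcvbuf`, that sets the size and reads back what the kernel actually granted:

```c
#include <sys/socket.h>

/* Hypothetical helper: request a receive buffer of requested_bytes on fd.
 * Returns the size reported by the kernel afterwards, or -1 on error.
 * On Linux the reported value is about twice the granted request, because
 * the kernel doubles it to leave room for bookkeeping overhead. */
int set_rcvbuf(int fd, int requested_bytes)
{
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested_bytes, sizeof requested_bytes) < 0)
        return -1;

    int granted = 0;
    socklen_t len = sizeof granted;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
        return -1;

    return granted;
}
```

Comparing the return value against the requested size is a quick way to see whether rmem_max clipped the request.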
Today I tried to replicate this, but I have packet loss again. I had also moved the cheap plastic switch somewhere else, and now I'm suspecting that that thing may be unreliable... yet to test.
Can switches be a problem source like that?
Any ideas about why the CPU load is so high?
Or, in other words, am I doing this totally wrong? What would a proper implementation of relatively high speed UDP reception look like?
Regards,
- UB