The hardware is based on the SAMA5D41 with the KSZ8081RNB as PHY.
- For a variable time (from hours to weeks), the network interface runs without any issue or RX/TX errors.
Tested at 10/100 Mbps, half and full duplex.
- After this random time, the network interface becomes unusable:
  - No more usable RX packets are received (TX not tested)
  - An ifdown/ifup fixes the issue
  - No more IRQs are generated in the Cadence macb driver
  - Most of all, the RX Resource Errors counter increments on every received packet
- The "MTBF" is very long on Linux kernels 4.4, 4.9 and 4.14 (3-6 weeks without issue) but dropped to 1-4 hours when I recently tried kernels 4.19 and 5.4, which at least gives a much better "chance" to debug the issue.
If this error (RX resource error) happens, an RX complete interrupt is generated and the BNA bit of the Receive Status Register is supposed to be set so that the error can be handled: software frees the input buffer and releases the pointer to let the MAC do its job.
If bit zero of the receive buffer descriptor is already set when the receive buffer manager reads the location of the receive AHB buffer, then the buffer has been already used and cannot be used again until software has processed the frame and cleared bit zero. In this case, the “buffer not available” bit in the receive status register is set and an interrupt triggered. The receive resource error statistics register is also incremented.
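
To make the mechanism described above concrete, here is a minimal user-space model of it in C. It is only a sketch, not hardware or driver code: the ring size, the struct and the function names are hypothetical, and the BNA bit position is an assumption for illustration.

```c
#include <stdint.h>

#define RX_RING_SIZE 4          /* hypothetical ring size for this model */
#define RX_USED      (1u << 0)  /* bit zero of the descriptor address word */
#define RSR_BNA      (1u << 0)  /* "buffer not available"; position assumed */

struct rx_model {
    uint32_t desc_addr[RX_RING_SIZE]; /* word 0 of each RX descriptor */
    unsigned int head;                /* next descriptor the MAC will read */
    uint32_t rsr;                     /* modelled receive status register */
    unsigned int rx_resource_errors;  /* modelled statistics register */
};

/* Model of the MAC receiving one frame: returns 1 if stored, 0 if dropped. */
static int mac_receive_frame(struct rx_model *m)
{
    if (m->desc_addr[m->head] & RX_USED) {
        /* Buffer still owned by software: flag BNA and count the error. */
        m->rsr |= RSR_BNA;
        m->rx_resource_errors++;
        return 0;
    }
    /* Buffer was free: the MAC fills it and marks it as used. */
    m->desc_addr[m->head] |= RX_USED;
    m->head = (m->head + 1) % RX_RING_SIZE;
    return 1;
}

/* Model of software processing all frames and handing the buffers back. */
static void sw_release_all(struct rx_model *m)
{
    for (unsigned int i = 0; i < RX_RING_SIZE; i++)
        m->desc_addr[i] &= ~RX_USED;
    m->rsr &= ~RSR_BNA;
}
```

In this model, feeding ten frames into the four-entry ring without ever calling `sw_release_all()` stores four frames and drops the other six, each drop incrementing the error counter. That is the same pattern as the counter incrementing on every received packet.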
After some research in the Cadence MAC driver used for the SAMA5D4 (drivers/net/ethernet/cadence/macb_main.c), I found the define for the BNA bit, but it is not used anywhere in the driver code, which makes me think this error (which can only happen in rare situations) is not handled.
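
For discussion, this is roughly what I would expect a handler for the BNA condition to look like. It is a hypothetical, self-contained sketch and not code from macb_main.c: the helper names are invented, the RSR bit positions (BNA = bit 0, REC = bit 1, OVR = bit 2) are my assumption, and a real driver would also have to deal with register access, locking and NAPI.

```c
#include <stdint.h>

/* Assumed receive status register bits (not verified against the datasheet). */
#define RSR_BNA (1u << 0)  /* buffer not available */
#define RSR_REC (1u << 1)  /* frame received */
#define RSR_OVR (1u << 2)  /* receive overrun */
#define RX_USED (1u << 0)  /* bit zero of the descriptor address word */

enum rx_action { RX_NONE, RX_PROCESS, RX_RECOVER };

/* Decide what to do for a given RSR value; error bits take priority. */
static enum rx_action rx_classify(uint32_t rsr)
{
    if (rsr & (RSR_BNA | RSR_OVR))
        return RX_RECOVER;
    if (rsr & RSR_REC)
        return RX_PROCESS;
    return RX_NONE;
}

/* Hand every used descriptor back to the MAC; returns how many were freed. */
static unsigned int rx_release_used(uint32_t *desc_addr, unsigned int n)
{
    unsigned int freed = 0;

    for (unsigned int i = 0; i < n; i++) {
        if (desc_addr[i] & RX_USED) {
            desc_addr[i] &= ~RX_USED;
            freed++;
        }
    }
    return freed;
}
```

The idea is simply that a status value with BNA set leads to a recovery path that clears bit zero of the stuck descriptors, instead of being silently ignored.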
Putting all of this together, I don't think I'm facing a hardware error but rather a driver bug, although I'm still surprised to be the only one seeing it...
Any help or comment would be very much appreciated. I'm not sure about my assumptions, and I don't really know how to reach the driver maintainers if this conclusion turns out to be plausible.