Network interface freeze after random time on SAMA5D4


ylanz
Posts: 24
Joined: Mon May 05, 2014 12:57 pm

Network interface freeze after random time on SAMA5D4

Wed May 27, 2020 11:42 pm

We have a custom board heavily based on the SAMA5D4-Xplained and after some time, the network interface becomes unusable.
The hardware is based on the SAMA5D41 with the KSZ8081RNB as PHY.

The symptoms:
  • For a variable time (from hours to weeks), the network interface runs without any issue and without RX/TX errors
    (tested at 10/100 Mbps, half and full duplex)
  • After this random time, the network interface becomes unusable:
    • No more usable RX packets are received (TX not tested)
    • An ifdown/ifup fixes the issue
    • No more IRQs are generated in the Cadence macb driver
    • Most importantly, the RX Resource Errors counter increments on every received packet
  • The "MTBF" is really long on Linux kernels 4.4, 4.9 and 4.14 (3-6 weeks without issue) and recently dropped to 1-4 hours when trying kernels 4.19 and 5.4, which at least gives a much better "chance" to debug the issue
My first guess was a hardware problem, since the macb driver has been used for a very long time on a lot of hardware. But after struggling for days, I found the following text in the SAMA5D4 datasheet (Chapter 36.6.3.1, Receive AHB Buffers):
If bit zero of the receive buffer descriptor is already set when the receive buffer manager reads the location of the
receive AHB buffer, then the buffer has been already used and cannot be used again until software has processed
the frame and cleared bit zero. In this case, the “buffer not available” bit in the receive status register is set and an
interrupt triggered. The receive resource error statistics register is also incremented.
So when this error happens, an RX complete interrupt is generated and the BNA bit of the Receive Status Register is supposed to be set, so that software can handle the error: free the input buffers and release the descriptors to let the MAC do its job again.
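To make the datasheet behaviour concrete, here is a small user-space model of the receive ring. It is only a sketch of the mechanism quoted above, not driver code: the ring size, the struct layout and the function names (`model_hw_receive`, `model_sw_poll`) are made up for illustration. The only thing taken from the datasheet text is that bit zero of the descriptor marks a buffer as already used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RX_RING_SIZE 4u        /* hypothetical, tiny on purpose */
#define RX_USED      (1u << 0) /* bit zero of descriptor word 0: buffer already used */

/* Hypothetical model of the RX ring shared by the MAC and the driver. */
struct rx_ring_model {
    uint32_t desc_addr[RX_RING_SIZE]; /* word 0 of each descriptor */
    unsigned int hw_head;             /* next descriptor the MAC will try */
    unsigned int sw_tail;             /* next descriptor software will process */
    bool rsr_bna;                     /* models "buffer not available" in the RSR */
    unsigned long rx_resource_errors; /* models the statistics counter */
};

/* MAC side: one frame arrives. Returns true if the frame was stored. */
static bool model_hw_receive(struct rx_ring_model *r)
{
    assert(r->hw_head < RX_RING_SIZE);

    if (r->desc_addr[r->hw_head] & RX_USED) {
        /* Buffer still owned by software: the frame is dropped. */
        r->rsr_bna = true;
        r->rx_resource_errors++;
        return false;
    }

    r->desc_addr[r->hw_head] |= RX_USED; /* hand the buffer to software */
    r->hw_head = (r->hw_head + 1) % RX_RING_SIZE;
    return true;
}

/* Software side: process up to `budget` frames and give the buffers back. */
static unsigned int model_sw_poll(struct rx_ring_model *r, unsigned int budget)
{
    unsigned int done = 0;

    while (done < budget && (r->desc_addr[r->sw_tail] & RX_USED)) {
        r->desc_addr[r->sw_tail] &= ~RX_USED; /* clear bit zero */
        r->sw_tail = (r->sw_tail + 1) % RX_RING_SIZE;
        done++;
    }

    r->rsr_bna = false; /* in this model, polling acknowledges the condition */
    return done;
}
```

If `model_sw_poll` is never called again (for example because no interrupt schedules it), every further `model_hw_receive` call fails and only bumps `rx_resource_errors`, which matches the symptom in the list above.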

After some research in the Cadence MAC driver used for the SAMA5D4 (drivers/net/ethernet/cadence/macb_main.c), I found the define for the BNA bit, but it is not used anywhere in the driver code. That makes me think this error (which can only happen in rare situations) is not handled.

Putting all of that together, I think I'm not facing a hardware error but a driver bug, although I'm still surprised to be the only one hitting it...

Any help or comment would be much appreciated. I'm not sure about my assumptions, and I don't really know how to reach the maintainers of the driver if this conclusion turns out to be plausible.
ylanz
Posts: 24
Joined: Mon May 05, 2014 12:57 pm

Re: Network interface freeze after random time on SAMA5D4

Thu May 28, 2020 11:19 am

By adding a check of the BNA bit in the macb_poll and macb_interrupt functions, I have been able to confirm my assumptions.
After some time, the Ethernet interface goes into the frozen state as usual, rx_resource_errors keeps incrementing, and I get an error about the BNA bit in dmesg, as expected.
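
For illustration, the kind of check described here could look roughly like the following. This is a stand-alone sketch, not the actual patch: it decodes a raw Receive Status Register value in user space, and the helper names (`rsr_needs_recovery`, `rsr_describe`) are hypothetical. The bit positions (BNA = bit 0, REC = bit 1, OVR = bit 2) are an assumption based on the RSR bitfield layout, so please check them against the datasheet and macb.h before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed RSR bit layout; verify against the datasheet and macb.h. */
#define RSR_BNA (1u << 0) /* buffer not available */
#define RSR_REC (1u << 1) /* frame received */
#define RSR_OVR (1u << 2) /* receive overrun */

/* True if the MAC reported that it found no free RX buffer. */
static bool rsr_needs_recovery(uint32_t rsr)
{
    return (rsr & RSR_BNA) != 0;
}

/* Format a short, log-friendly description of the status word.
 * Returns the snprintf() result (length the full text needs). */
static int rsr_describe(uint32_t rsr, char *buf, size_t len)
{
    assert(buf != NULL);

    return snprintf(buf, len, "RSR=0x%08x%s%s%s",
                    (unsigned int)rsr,
                    (rsr & RSR_BNA) ? " BNA" : "",
                    (rsr & RSR_REC) ? " REC" : "",
                    (rsr & RSR_OVR) ? " OVR" : "");
}
```

In the driver itself, the equivalent test would sit where the status register is read in the interrupt or poll path, with a log message and whatever recovery is chosen (for example re-initialising the RX ring) in the BNA branch.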

I will try to write a patch for that, but does anybody know how to reach the maintainers?
CleberPeter
Location: Brazil
Posts: 20
Joined: Tue May 14, 2019 7:57 pm

Re: Network interface freeze after random time on SAMA5D4

Thu May 28, 2020 1:33 pm

When I make improvements, I submit a pull request to the Atmel kernel repository:

https://github.com/linux4sam/linux-at91

Coincidentally, the last change in the repository was in the macb_main.c file to ensure that the macb is not suspended:

https://github.com/linux4sam/linux-at91 ... ffe56a8ecd
ylanz
Posts: 24
Joined: Mon May 05, 2014 12:57 pm

Re: Network interface freeze after random time on SAMA5D4

Thu May 28, 2020 6:10 pm

Thanks! I was thinking about the mailing list, but GitHub is probably simpler if it's OK to do it this way.

By the way, the other patch is unfortunately not related, even though the commit message looks similar...
