Message fragmentation over serial port

This forum is for users of Microchip MPUs and who are interested in using Linux OS.

Moderator: nferre

jscottbosch
Posts: 2
Joined: Fri May 31, 2019 9:59 pm

Message fragmentation over serial port

Tue Jun 04, 2019 9:19 pm

We're having an issue where messages are being split by the serial driver. We're suspecting the atmel-serial driver, but I want to run this by the community to see if anyone's seen this before. I did find one somewhat similar post, but that situation was a little different in that the application was pumping a continuous stream of data through the serial connection, whereas we are sending discrete messages.

I've tried to provide as much info as possible to describe the problem and what I've done to try to fix it. Any help would be greatly appreciated.


Configuration:
  • Atmel ATSAMA5D27
  • Linux 4.14.73 (see uname output below)
  • Communications daemon sits between an external communication interface and a TI DSP.
  • Using UART0 115200/n/8/1 with DMA enabled to exchange messages with the TI DSP.
  • Using termios.
  • Using select() then read if data is ready.
  • Message length varies between 12 bytes and 28 bytes, depending on message type.
  • Messages begin with a start of frame delimiter (SoF) and an address but these are not currently used to mark start of messages.
  • RX DMA: atmel_usart_serial atmel_usart_serial.0.auto: using dma0chan8 for rx DMA transfers
  • TX DMA: atmel_usart_serial atmel_usart_serial.0.auto: using dma0chan9 for tx DMA transfers

Description of Problem:
  1. Message processing works fine until fragmentation occurs.
  2. System recovers from fragmentation.
  3. Messages are being fragmented (split) in a very consistent manner:
    - Each splitting of a message always occurs after a multiple of 2048 bytes are received.
    - The split seems to occur when a message straddles the 2k boundary.
    - Messages are usually split in two, but sometimes three segments. The values of the bytes in the segments is correct for their position in a valid message.
    - A split message fragment will often have the SoF and address bytes of the next message, along with some of the payload.
    - Logic analyzer traces reveal no interbyte idle times long enough to signal end-of-message (they vary between 8 and 9uS).

What I've Tried (beyond reading every relevant post I can find):
  1. Disable canonical mode
  2. Set minimum bytes to the smallest message size (VTIME set to 12)
  3. Various settings of timeout (VTIME)
  4. Disable/enable O_NDELAY
  5. Captured traffic in logic analyzer. No apparent large interbyte gaps to trigger end of message.
  6. Disabled DMA (to see if error went away; it didn't, but the split point changed from the 2k boundary to the same byte in one of the longer messages, despite no apparent gap in the serial transmission capture)
  7. Varying interbyte and intermessage idle periods. I can cause end of message but only after adding a long delay, long enough to exceed the timeout period (0.1s, for example)

uname Output:

Code: Select all

Linux radian 4.14.73-linux4sam_6.0 #1 Mon Jun 3 09:12:16 PDT 2019 armv7l GNU/Linux

stty Output:

Code: Select all

# stty -a < /dev/serial-dsp
speed 115200 baud;stty: standard input
 line = 0;
intr = ^C; quit = ^\; erase = ^?; kill = ^U; eof = ^D; eol = <undef>;
eol2 = <undef>; swtch = <undef>; start = ^Q; stop = ^S; susp = ^Z; rprnt = ^R;
werase = ^W; lnext = ^V; flush = ^O; min = 12; time = 2;
-parenb -parodd cs8 hupcl -cstopb cread -clocal -crtscts
-ignbrk -brkint -ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon
-ixoff -iuclc -ixany -imaxbel -iutf8
-opost -olcuc -ocrnl onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0
ff0
-isig -icanon -iexten -echo echoe echok -echonl -noflsh -xcase -tostop -echoprt
echoctl echoke

Sample Log Output Excerpt (DMA enabled):
NOTES
1. Split occurs where bytes between last message fragment and next error are multiples of 2048 (10 + 8182 = 8192 = 2048 * 4).
2. After last fragment received, messages will be processed correctly for times up to 21:20 (I can control this somewhat with termios settings).
3. By examining the length "f" of the last fragments, I can predict with 100% accuracy the length of the interval between errors will be (n * 2048 - f).

Code: Select all

Jan 11 21:56:56  process_dsp: packet CRC invalid: Received=0x0019, Calculated=0x66AD; packet length=10
Jan 11 21:56:56  _dsp_log_error_stats: ****************************************************************************************
Jan 11 21:56:56  _dsp_log_error_stats: * SINCE LAST ERROR:
Jan 11 21:56:56  _dsp_log_error_stats: *    : ticks: 0; seconds: 1; bytes: 10; messages: 0
Jan 11 21:56:56  _dsp_log_error_stats: ****************************************************************************************
Jan 11 21:56:56     packet[0] = 0xDC
Jan 11 21:56:56     packet[1] = 0x05
Jan 11 21:56:56     packet[2] = 0x19
Jan 11 21:56:56     packet[3] = 0x00
Jan 11 21:56:56     packet[4] = 0x14
Jan 11 21:56:56     packet[5] = 0x00
Jan 11 21:56:56     packet[6] = 0xE4
Jan 11 21:56:56     packet[7] = 0x0C
Jan 11 21:56:56     packet[8] = 0xFF
Jan 11 21:56:56     packet[9] = 0x00
Jan 11 21:56:56  process_dsp: Invalid command code received: 0x0ce4 
Jan 11 22:04:40  process_dsp: packet CRC invalid: Received=0x908B, Calculated=0x6B26; packet length=26
Jan 11 22:04:40  _dsp_log_error_stats: ****************************************************************************************
Jan 11 22:04:40  _dsp_log_error_stats: * SINCE LAST ERROR:
Jan 11 22:04:40  _dsp_log_error_stats: *    : ticks: 18529; seconds: 464; bytes: 8182; messages: 557
Jan 11 22:04:40  _dsp_log_error_stats: ****************************************************************************************


Sample Log Output Excerpt (DMA disabled):

With DMA disabled, the longer message (status message sent every five seconds) is always split at the same place with the same number of bytes between errors:

Code: Select all

Jan  1 01:30:54 _dsp_log_error_stats: ****************************************************************************************
Jan  1 01:30:54 _dsp_log_error_stats: * SINCE LAST ERROR:
Jan  1 01:30:54 _dsp_log_error_stats: *    : ticks: 192; seconds: 5; bytes: 78; messages: 5
Jan  1 01:30:54 _dsp_log_error_stats: ****************************************************************************************
Jan  1 01:30:54 process_dsp: ***** CRC PACKET CAPTURE ***** (len = 18)
<snip>
Jan  1 01:30:54 process_dsp: packet CRC invalid: Received=0x0019, Calculated=0x66AD; packet length=10
Jan  1 01:30:54 _dsp_log_error_stats: ****************************************************************************************
Jan  1 01:30:54 _dsp_log_error_stats: * SINCE LAST ERROR:
Jan  1 01:30:54 _dsp_log_error_stats: *    : ticks: 0; seconds: 0; bytes: 10; messages: 0
Jan  1 01:30:54 _dsp_log_error_stats: ****************************************************************************************
Jan  1 01:30:54 process_dsp: ***** CRC PACKET CAPTURE ***** (len = 10)
<snip>
Jan  1 01:30:59 process_dsp: packet CRC invalid: Received=0x908B, Calculated=0x2C04; packet length=18
Jan  1 01:30:59 _dsp_log_error_stats: ****************************************************************************************
Jan  1 01:30:59 _dsp_log_error_stats: * SINCE LAST ERROR:
Jan  1 01:30:59 _dsp_log_error_stats: *    : ticks: 191; seconds: 5; bytes: 78; messages: 5
Jan  1 01:30:59 _dsp_log_error_stats: ****************************************************************************************
Jan  1 01:30:59 process_dsp: ***** CRC PACKET CAPTURE ***** (len = 18)
<snip>
Jan  1 01:30:59 process_dsp: packet CRC invalid: Received=0x0019, Calculated=0x66AD; packet length=10
Jan  1 01:30:59 _dsp_log_error_stats: ****************************************************************************************
Jan  1 01:30:59 _dsp_log_error_stats: * SINCE LAST ERROR:
Jan  1 01:30:59 _dsp_log_error_stats: *    : ticks: 0; seconds: 0; bytes: 10; messages: 0
Jan  1 01:30:59 _dsp_log_error_stats: ****************************************************************************************
Jan  1 01:30:59 process_dsp: ***** CRC PACKET CAPTURE ***** (len = 10)
blue_z
Location: USA
Posts: 1988
Joined: Thu Apr 19, 2007 10:15 pm

Re: Message fragmentation over serial port

Tue Jun 04, 2019 11:37 pm

jscottbosch wrote: stty Output:

Code: Select all

# stty -a < /dev/serial-dsp
That's not a standard device node. What does it represent?
That's not standard syntax for that command.
jscottbosch
Posts: 2
Joined: Fri May 31, 2019 9:59 pm

Re: Message fragmentation over serial port

Wed Jun 05, 2019 2:53 am

blue_z wrote:That's not a standard device node. What does it represent?
Exactly what is described in my original post.

blue_z wrote:That's not standard syntax for that command.
It is according to Stevens (Advanced Programming in the UNIX Environment, Second Edition, p.652).
blue_z
Location: USA
Posts: 1988
Joined: Thu Apr 19, 2007 10:15 pm

Re: Message fragmentation over serial port

Fri Jun 07, 2019 3:26 am

jscottbosch wrote: Exactly what is described in my original post.
Apparently you think that your post has a through and complete description.
Your post is lengthy, but is ambiguous and omits many salient details.

For instance you have provided a list of what you have "tried". There's no mention of what you "use".
When you present two logs, the only description of the operatiing conditions are "DMA enabled" or "DMA disabled".
There's no mention of the termios configuration (is it the previous stty output?) or if blocking or nonblocking I/O is in use.

One item in your list of what you "tried" is contradicted by the stty command output.
Stty indicates non-canonical mode.
If you tried to "disable canonical mode", that would mean you "tried" non-canonical mode, but are using (or reverted back to) canonical mode.
Your writing is not as exact as you assume.

You repeatedly refer to "fragments" which is an application perspective.
Presumably these "fragments" are your description of the returned data from a read() syscall.
Referring to a normal syscall response as an "error" is misleading.

All I asked for was a clarification of a nonstandard device node in a post that is IMO full of ambiguity.

jscottbosch wrote: We're having an issue where messages are being split by the serial driver. We're suspecting the atmel-serial driver, but I want to run this by the community to see if anyone's seen this before.
Yes, I'm very familiar with this type of behavior from serial terminals.
No, your suspicion of the serial driver is incorrect.
If your /dev/serial-dsp is simply a rename or link to an ordinary serial terminal device node, then the issue is your misunderstanding and incorrect configuration.
Good luck solving your problem.


Regards

Return to “LINUX”

Who is online

Users browsing this forum: No registered users and 6 guests