
@timokroeger
Created October 29, 2024 19:39
Data link layer protocol

Designed for asynchronous serial links (UART). Suitable for storage.

Target message Hamming distance: 4. Any 3 bit flips within a message are detected. This includes bit flips before and after the message that might break synchronization.

Assumed serial format: 8n1, i.e. start bit (low level = 0), 8 data bits, no parity bit, stop bit (high level = 1).

Frame Structure

No byte within a message may have the value 0. When transmitted over a serial link, there must be no idle gap between the bytes of a message.

Data message length: 4 + data length. The frame delimiter (+2 bytes) and the overhead of the data encoding that removes 0 bytes are not included in the message size.

Header

Length: 2 bytes

  • 1 bit fixed
  • 11 bit length of message
  • 4 bit CRC

HD=3 or better. TODO: figure out values

Fixed Data Message

┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐
│STR│D0 │D1 │D2 │D3 │D4 │D5 │D6 │D7 │STP│
├───┼───┼───┼───┼───┼───┼───┼───┼───┼───┤
├───┼───┬───────────────────────────┼───┼─── Header
│ 0 │ 1 │       LEN bits 0-6        │ 1 │ length field is the number of data bytes XOR 0x7FF (invert)
├───┼───┴───────────┬───────────────┼───┤ allowed message lengths 0..=1020 (to guarantee HD4 for payload)
│ 0 │ LEN bits 7-10 │     CRC4      │ 1 │ CRC4 of LEN field, poly: 0x3, init: 0, refin/refout: true, xorout: 0 (cannot be zero when LEN<=1934)
├───┼───────────────┴───────────────┼───┼─── Payload
│ 0 │         DATA byte 0           │ 1 │ must not contain 0 bytes, e.g. COBS encoded
├───┼───────────────────────────────┼───┤
│ 0 │         DATA byte 1           │ 1 │
├───┼───────────────────────────────┼───┤
│ 0 │         DATA byte ...         │ 1 │
├───┼───────────────────────────────┼───┤
│ 0 │         DATA byte n-1         │ 1 │
├───┼───┬───────────────────────────┼───┼─── 
│ 0 │ 1 │     CRC14 bits 0-6        │ 1 │ poly: 0x6E57, init: 0, refin/refout: true, xorout: 0
├───┼───┴───────────────────────┬───┼───┤ header bytes and high bit D0 included in the calculation
│ 0 │        CRC14 bits 7-13    │ 1 │ 1 │ padding bits to ensure nonzero bytes
├───┼───────────────────────────┴───┼───┼─── SYNC Option A
│ 0 │             0x00              │ 0 │ Break character
├───┼───────────────────────────────┼───┼─── SYNC Option B
│ 0 │             0x00              │ 1 │
├───┼───────────────────────────────┼───┤ Two 0 bytes
│ 0 │             0x00              │ 1 │
└───┴───────────────────────────────┴───┴───

Length: 4 + n where n is the number of data bytes
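The header layout in the diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `crc4`, `encode_header` and `decode_header` are hypothetical, the CRC4 is assumed to be computed over the inverted 11 bit LEN value as transmitted, and the LEN bits are assumed to be fed into the CRC LSB first. The document does not pin down either of the last two points.

```python
MAX_DATA_LEN = 1020  # allowed message lengths 0..=1020


def crc4(value: int, nbits: int) -> int:
    """Bitwise reflected CRC-4, poly 0x3 (x^4 + x + 1), init 0, xorout 0.

    Assumption: the bits of `value` are fed LSB first.
    """
    crc = 0
    for i in range(nbits):
        crc ^= value >> i & 1
        # 0xC is poly 0x3 with its 4 bits reversed (reflected form).
        crc = (crc >> 1) ^ 0xC if crc & 1 else crc >> 1
    return crc


def encode_header(n: int) -> bytes:
    """Build the two header bytes for a payload of n data bytes."""
    if not 0 <= n <= MAX_DATA_LEN:
        raise ValueError("data length out of range")
    length = n ^ 0x7FF                       # LEN field is inverted
    b0 = 0x01 | (length & 0x7F) << 1         # D0 fixed to 1, LEN bits 0-6 in D1-D7
    b1 = (length >> 7) | crc4(length, 11) << 4  # LEN bits 7-10 in D0-D3, CRC4 in D4-D7
    return bytes([b0, b1])


def decode_header(hdr: bytes):
    """Return the data length, or None if the header is invalid."""
    b0, b1 = hdr[0], hdr[1]
    if not b0 & 1:                           # fixed bit must be 1
        return None
    length = (b0 >> 1) | (b1 & 0x0F) << 7
    if crc4(length, 11) != b1 >> 4:
        return None
    n = length ^ 0x7FF
    return n if n <= MAX_DATA_LEN else None
```

With this layout the first byte is nonzero because of the fixed bit, and for n <= 1020 the inverted length always has bit 10 set, so the second byte is nonzero regardless of the CRC4 value.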

Data

Length: fixed (5 or 6 predefined message lengths encoded in the header) OR an additional length indicator, maybe with varint encoding.
-> Define fixed message lengths as required by the upper layer. But there should be one with len=7, for which the 14 bit CRC has HD=6.

Data must not contain any 0 byte.

Hint: the upper layer protocol should use an escaping protocol like COBS.

COBS! Ideas:

  • Reduced variant: max payload length of 254 -> always exactly 1 byte overhead, header len%256!=0 in all cases.
  • Full variant: always assume max COBS overhead (e.g. ceil(len/254)) and add 0xFF padding bytes at the end. The decoder has to become a little smarter, but it is possible to encode everything in one pass. header len%256!=0 can be enforced with padding bytes.
  • TODO: Idea, two spare bits per COBS link are available, use them for version indication.
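As an illustration of the reduced COBS variant (payload of at most 254 bytes, always exactly 1 byte of overhead), here is a minimal Python sketch. The function names `cobs_encode_short` and `cobs_decode_short` are hypothetical, and the padding scheme of the full variant is not covered.

```python
def cobs_encode_short(data: bytes) -> bytes:
    """COBS-encode up to 254 bytes; the output never contains a 0 byte."""
    if len(data) > 254:
        raise ValueError("reduced variant supports at most 254 bytes")
    out = bytearray()
    # Each zero-delimited segment is prefixed with (segment length + 1).
    for segment in data.split(b"\x00"):
        out.append(len(segment) + 1)
        out += segment
    return bytes(out)


def cobs_decode_short(encoded: bytes) -> bytes:
    """Inverse of cobs_encode_short; raises ValueError on malformed input."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        code = encoded[i]
        block = encoded[i + 1:i + code]
        if code == 0 or len(block) != code - 1 or 0 in block:
            raise ValueError("malformed COBS data")
        out += block
        i += code
        # A code below 0xFF stands for a removed 0 byte, except at the end.
        if i < len(encoded) and code != 0xFF:
            out.append(0)
    return bytes(out)
```

Because every segment contributes its own bytes plus one code byte, and there is one more segment than there are 0 bytes, the encoded length is always `len(data) + 1` in this variant.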

CRC

Length: 2 bytes

14 bit CRC with polynomial 0x6E57 for a Hamming distance of 4 for up to 8176 data bits (covers all specified message sizes). The CRC is calculated over all bytes in the message, including the header byte and the data, with an initial value of 0x0000. The resulting CRC value is split into two 7 bit parts, which are placed in the LSBs (bit positions 0..6) of each CRC byte. The MSB of each CRC byte (bit position 7) is set to 1 to ensure that none of the CRC bytes can become 0.
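A possible Python sketch of the CRC calculation and of the byte packing described in this section follows. It rests on assumptions that the document does not confirm: 0x6E57 does not fit in 14 bits, so it is read here as the full polynomial with an explicit x^14 term (which gives 0x2E57 once that term is dropped), and the CRC is processed LSB first as suggested by refin/refout: true. The packing follows the text of this section (7 bits in positions 0..6, bit 7 set); the frame diagram instead puts the fixed 1 of the first CRC byte in D0. All function names are hypothetical.

```python
CRC14_FULL_POLY = 0x6E57                 # as written; assumed to include the x^14 term
CRC14_POLY = CRC14_FULL_POLY & 0x3FFF    # assumption: drop x^14 -> 0x2E57


def reflect(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of value."""
    out = 0
    for i in range(width):
        if value >> i & 1:
            out |= 1 << (width - 1 - i)
    return out


CRC14_POLY_REFLECTED = reflect(CRC14_POLY, 14)


def crc14(data: bytes) -> int:
    """Bitwise reflected 14 bit CRC, init 0, xorout 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC14_POLY_REFLECTED if crc & 1 else crc >> 1
    return crc


def pack_crc14(crc: int) -> bytes:
    """Split into two 7 bit parts and set the MSB so neither byte is 0."""
    return bytes([0x80 | (crc & 0x7F), 0x80 | (crc >> 7 & 0x7F)])


def unpack_crc14(two: bytes):
    """Return the 14 bit CRC, or None if a padding bit is not set."""
    if not (two[0] & 0x80 and two[1] & 0x80):
        return None
    return (two[0] & 0x7F) | (two[1] & 0x7F) << 7
```

A sender would append `pack_crc14(crc14(header + data))` to the message; a receiver would recompute the CRC over the same bytes and compare it with the unpacked value.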

SYNC (frame delimiter)

Length: 2 bytes
Value: 0x00 0x00

The value 0 cannot appear within the message, which means at least two bits must be flipped for a false SYNC to be detected. The next two bytes in the stream are then evaluated as header. For random data there is a 1/36 chance that the value is valid and the decoding continues. Can it happen that decoding hits another SYNC delimiter exactly? If yes -> HD=3 fail; if no, what is the actual HD for the full message?

--- side note ---

example data (hex): xx 01 04 xx xx xx xx xx xx xx 00 00
original message: hdr ---------data------- -CRC- 00 00
corrupted message: 00 00 hdr ---data--- -CRC- 00 00

Two bit flips in the original message create the two bytes 00 00, which the decoder sees as SYNC. The third byte of the original message is a valid header byte which indicates message length 4. The boundaries of the corrupted and the original message are the same. Is it possible that the corrupted message has the same CRC as the original message?

Problem description: Given a message of length N that is protected by an M bit CRC checksum: when removing up to N bytes from the start of the message, is it possible for the CRC value to stay the same? My intuition says this is not possible with COBS encoding.

Trivial answer: Yes, when the CRC initial value is 0 and the original message started with 0 bytes. When the CRC initial value is non-zero, is the CRC guaranteed to change for the shorter messages?
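The trivial case can be checked quickly in Python. Note that this sketch does not use the protocol's 14 bit CRC: it swaps in the 16 bit CRC-CCITT from the standard library (`binascii.crc_hqx`), only because that function accepts the initial value as a parameter. The helper name `crc_ignores_leading_zeros` is hypothetical. It shows the effect of leading 0 bytes only and does not answer the open question about removing arbitrary bytes.

```python
import binascii


def crc_ignores_leading_zeros(message: bytes, init: int, zeros: int = 2) -> bool:
    """True if prepending `zeros` 0 bytes leaves the CRC unchanged."""
    padded = bytes(zeros) + message
    return binascii.crc_hqx(padded, init) == binascii.crc_hqx(message, init)


message = bytes([0x03, 0x11, 0x22])

# Init 0: the CRC register stays 0 while only 0 bytes are fed in,
# so the shorter and the longer message share the same CRC.
zero_init_collides = crc_ignores_leading_zeros(message, 0x0000)

# Non-zero init: for this CRC the leading 0 bytes do change the result.
nonzero_init_collides = crc_ignores_leading_zeros(message, 0xFFFF)
```

Since the protocol forbids 0 bytes inside a message, this exact collision cannot occur in a valid frame; it only illustrates why an initial value of 0 makes the CRC blind to the message start.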

-> Possible solution: When a corrupted message is observed, skip the next valid message.

To improve synchronization capabilities, implementations using a serial link can additionally use idle frame detection or break characters after the SYNC bytes. If all participants on the serial link agree, the SYNC bytes can be replaced by a UART break.

Decoding

Check the start byte; if it is not valid, discard data until the next 0 byte is received (or a break is detected).

Data message:

  • Check if frame delimiter matches the indicated payload length.
  • Validate reserved bits in CRC field and check CRC
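The first step of the decoder, finding message candidates in a byte stream, might look like the following Python sketch. It assumes SYNC option B (two 0 bytes) and only performs the framing step; the header, length and CRC checks listed above would then be applied to each candidate. The name `candidate_messages` is hypothetical.

```python
from typing import Iterator


def candidate_messages(stream: bytes) -> Iterator[bytes]:
    """Yield every run of non-zero bytes that is followed by 0x00 0x00.

    A run followed by a single 0 byte is discarded, since a valid
    message never contains a 0 byte and must end with the full SYNC.
    """
    run = bytearray()
    zeros = 0
    for byte in stream:
        if byte == 0:
            zeros += 1
            if zeros == 2 and run:
                yield bytes(run)
                run.clear()
        else:
            if zeros:
                # A new run starts; drop anything not closed by a full SYNC.
                run.clear()
                zeros = 0
            run.append(byte)
```

A receiver would then verify that `len(candidate)` equals 4 + n for the n decoded from the header before checking the CRC, which is the "frame delimiter matches the indicated payload length" check.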

Sync Loss

When the receiver misses the high->low transition of a start bit, it misinterprets the next 0 bit in the byte as the start bit and we receive shifted (i.e. effectively random) data. This means a single bit error actually triggers an 8 bit burst error. When the transmitter sends data with 2 stop bits, it is possible for the receiver to observe this condition without a framing error. That means that two bit flips can make the message pass the CRC check, because the CRC cannot always detect two 8 bit burst errors.

This means the transmitter must send the bytes within a message back to back (i.e. start bit immediately after the stop bit of the previous byte). When a receiver loses synchronization, it will re-sync after 9 high bits. Those can either be part of a single transmitted 0xFF byte or be the sum of all leading 1 bits of multiple bytes. In any case the number of received bytes is less than the number of sent bytes. This condition is handled by the length field of the header.
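The effect of a missed start bit can be reproduced with a small bit-level simulation in Python. This is a simplified model with hypothetical names (`to_bits`, `receive`): the receiver samples ideal bits, treats the first 0 bit it sees as a start bit, and does not check the stop bit.

```python
def to_bits(data: bytes) -> list:
    """Serialize bytes as back-to-back 8n1 frames: start=0, LSB first, stop=1."""
    bits = []
    for byte in data:
        bits.append(0)
        bits.extend(byte >> k & 1 for k in range(8))
        bits.append(1)
    return bits


def receive(bits: list, start: int = 0) -> list:
    """Decode 8n1 frames, beginning to listen at bit index `start`."""
    out = []
    i = start
    while i + 10 <= len(bits):
        if bits[i] == 1:          # line high: keep waiting for a start bit
            i += 1
            continue
        byte = sum(bits[i + 1 + k] << k for k in range(8))
        out.append(byte)          # simplification: stop bit bits[i + 9] not checked
        i += 10
    return out


sent = bytes([0x81, 0xFF, 0x42])
in_sync = receive(to_bits(sent))             # [0x81, 0xFF, 0x42]
desynced = receive(to_bits(sent), start=1)   # [0x60, 0x42]
```

In the second call the receiver misses the first start bit, locks onto a data bit of 0x81 and produces 0x60. The following 9 high bits (the tail of the misread frame plus the 0xFF byte) let it re-synchronize in time for 0x42, so only two bytes arrive for three bytes sent.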

TODO: de-sync in length field
