Added address encoding definition.

pull/3/head
Mark Smith 2019-12-08 12:22:25 -08:00
parent dc36147bce
commit 673eefd8c7
1 changed file with 118 additions and 2 deletions


@ -41,8 +41,8 @@ The M17 Packet format borrows heavily from Ethernet, except the Preamble and Syn
* **TODO** Something long and arbitrary, like e or Pi.
* Packet Indicator: 1 byte
* A value to indicate this is a Packet, not a Stream.
* Destination address: 6 bytes
* Source address: 6 bytes
* Destination address: 6 bytes (See below for address encoding.)
* Source address: 6 bytes (See below for address encoding.)
* Length: 2 bytes (Number of bytes in payload)
* Payload: N bytes
* CRC: 4 bytes (32-bit CRC of the entire frame, not including the preamble or sync byte. Includes Destination, Source, Length, and Payload.)
@ -89,3 +89,119 @@ CODEC2 3200 needs to send a 64 bit, 4 byte, CODEC frame every 20ms. Stream fram
### File Transfer Stream
# Address Encoding
M17 addresses are 48 bits, 6 bytes long. Callsigns (and other addresses) are encoded into these 6 bytes in the following ways:
* An address of 0 is invalid.
  * **TODO** Do we want to use zero as a flag value of some kind?
* Address values between 1 and 262143999999999 (which is (40^9)-1) hold up to 9 characters of text, encoded using base40 as described below.
* Address values between 262144000000000 (40^9) and 281474976710654 ((2^48)-2) are invalid.
  * **TODO** Can we think of something to do with these 19330976710655 addresses?
* An address of 0xFFFFFFFFFFFF ((2^48)-1) is a broadcast. All stations should receive and listen to this message.
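
The ranges above can be sketched as a small classifier. This is only an illustration: the enum and function names are hypothetical and not defined by this document, and the sketch assumes the broadcast value is checked before the invalid range that would otherwise contain it.

```c
#include <stdint.h>

// Hypothetical names for illustration; the spec does not define them.
typedef enum {
    M17_ADDR_INVALID,    // 0, or above (40^9)-1 and not the broadcast value
    M17_ADDR_CALLSIGN,   // 1 .. (40^9)-1, base40-encoded text
    M17_ADDR_BROADCAST   // 0xFFFFFFFFFFFF
} m17_addr_kind;

m17_addr_kind m17_classify_address(uint64_t addr) {
    const uint64_t base40_limit = 262144000000000ULL;  // 40^9
    const uint64_t broadcast    = 0xFFFFFFFFFFFFULL;   // (2^48)-1

    // Assumption: broadcast takes precedence over the invalid range.
    if (addr == broadcast)
        return M17_ADDR_BROADCAST;
    if (addr >= 1 && addr < base40_limit)
        return M17_ADDR_CALLSIGN;
    // Covers 0, the unassigned range, and anything wider than 48 bits.
    return M17_ADDR_INVALID;
}
```

A receiver could call `m17_classify_address()` on the destination field before deciding whether to attempt a base40 decode.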
## Callsign Encoding: base40
9 characters from an alphabet of 40 possible characters can be encoded into 48 bits, 6 bytes. The base40 alphabet is:
* 0: An invalid character, something not in the alphabet was provided.
* 1-26: 'A' through 'Z'
* 27-36: '0' through '9'
* 37: '-'
* 38: '/'
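* 39: TBD (the example code below uses '.' as a placeholder)
Encoding is little endian. That is, the first (leftmost) character of the callsign occupies the least significant base40 digit of the result, and the last (rightmost) character occupies the most significant.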
### Example code: encode_callsign_base40()
```
#include <stdint.h>
#include <string.h>

uint64_t encode_callsign_base40(const char *callsign) {
    uint64_t encoded = 0;
    // Walk from the last character to the first, so that the first character
    // lands in the least significant base40 digit.
    for (const char *p = callsign + strlen(callsign); p > callsign; ) {
        p--;
        encoded *= 40;
        // If speed is more important than code space, you can replace this with a lookup into a 256 byte array.
        if (*p >= 'A' && *p <= 'Z')       // 1-26
            encoded += *p - 'A' + 1;
        else if (*p >= '0' && *p <= '9')  // 27-36
            encoded += *p - '0' + 27;
        else if (*p == '-')               // 37
            encoded += 37;
        // These are just place holders. If other characters make more sense, change these.
        // Be sure to change them in the decode array below too.
        else if (*p == '/')               // 38
            encoded += 38;
        else if (*p == '.')               // 39
            encoded += 39;
        // Any other character is invalid and is represented by 0.
    }
    return encoded;
}
```
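
The comment in the encoder mentions trading code space for speed with a 256 byte lookup array. A minimal sketch of that variant follows. The names `base40_build_table` and `encode_callsign_base40_lut` are hypothetical, and the table is rebuilt on every call only to keep the example self-contained.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helper: fills a 256-entry table mapping each byte value to its
// base40 digit. Entries left at 0 mean "not in the alphabet".
static void base40_build_table(uint8_t table[256]) {
    // The index into this string is the base40 digit. 'x' at index 0 is only
    // a filler, and '.' at 39 mirrors the placeholder in the example encoder.
    static const char alphabet[] = "xABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-/.";
    memset(table, 0, 256);
    for (int digit = 1; digit < 40; digit++)
        table[(unsigned char)alphabet[digit]] = (uint8_t)digit;
}

uint64_t encode_callsign_base40_lut(const char *callsign) {
    uint8_t table[256];
    base40_build_table(table);  // a real implementation would build this once

    uint64_t encoded = 0;
    // Last character first, so the first character is least significant.
    for (size_t i = strlen(callsign); i > 0; i--)
        encoded = encoded * 40 + table[(unsigned char)callsign[i - 1]];
    return encoded;
}
```

For example, `encode_callsign_base40_lut("KR6ZY")` evaluates to 11 + 18*40 + 33*40^2 + 26*40^3 + 25*40^4 = 65717531, since K=11, R=18, 6=33, Z=26 and Y=25 in the alphabet above.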
### Example code: decode_callsign_base40()
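```
#include <stdint.h>

// callsign must point to a buffer of at least 10 bytes
// (9 characters plus the terminating 0).
char *decode_callsign_base40(uint64_t encoded, char *callsign) {
    if (encoded >= 262144000000000ULL) {  // 40^9
        *callsign = 0;
        return callsign;
    }
    char *p = callsign;
    for (; encoded > 0; p++) {
        // Index 0 ('x') stands in for an invalid character.
        *p = "xABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-/."[encoded % 40];
        encoded /= 40;
    }
    *p = 0;
    return callsign;
}
```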
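
Because digit 0 marks an invalid character and decodes to the `x` filler, a receiver might want to reject such addresses before decoding them. The check below is a sketch of that idea. The function name `base40_is_clean` is hypothetical, and this document does not require such a check.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: true when `encoded` lies in the base40 range and none
// of its base40 digits is 0 (the "invalid character" marker).
bool base40_is_clean(uint64_t encoded) {
    if (encoded == 0 || encoded >= 262144000000000ULL)  // 0, or >= 40^9
        return false;
    while (encoded > 0) {
        if (encoded % 40 == 0)
            return false;  // this digit would decode to the 'x' filler
        encoded /= 40;
    }
    return true;
}
```

For instance, the value 40 has base40 digits 0 and 1, so it would decode to "xA" and is rejected here.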
### Why base40?
The longest commonly assigned callsign from the FCC is 6 characters. The minimum alphabet of A-Z, 0-9, and a "done" character means the most compact encoding of an American callsign could be: log2(37^6)=31.26 bits, or 4 bytes.
But I'm not convinced that 6 characters is a global maximum. Also, we want to extend our callsigns (see below). So we want more than 6 characters. How many bits do we need to represent more characters?
* 7 characters: log2(37^7)=36.47 bits, 5 bytes
* 8 characters: log2(37^8)=41.68 bits, 6 bytes
* 9 characters: log2(37^9)=46.89 bits, 6 bytes
* 10 characters: log2(37^10)=52.09 bits, 7 bytes.
Of these, 9 characters into 6 bytes seems to be the sweet spot. Given 9 characters, how large can we make the alphabet without using more than 6 bytes?
* 37 alphabet: log2(37^9)=46.89 bits, 6 bytes
* 38 alphabet: log2(38^9)=47.23 bits, 6 bytes
* 39 alphabet: log2(39^9)=47.57 bits, 6 bytes
* 40 alphabet: log2(40^9)=47.90 bits, 6 bytes
* 41 alphabet: log2(41^9)=48.22 bits, 7 bytes
Given this, 9 characters from an alphabet of 40 possible characters makes maximal use of 6 bytes.
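
The same comparison can be made with exact integer arithmetic instead of logarithms. This is a minimal sketch; the helper name `base_fits` is hypothetical, and it assumes `bits` is less than 64.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: true when every string of `chars` symbols drawn from
// an alphabet of `alphabet` symbols fits in `bits` bits, that is, when
// alphabet^chars <= 2^bits. Assumes bits < 64.
bool base_fits(unsigned alphabet, unsigned chars, unsigned bits) {
    if (alphabet == 0)
        return false;  // an empty alphabet cannot encode anything
    const uint64_t limit = (uint64_t)1 << bits;  // 2^bits
    uint64_t combos = 1;
    for (unsigned i = 0; i < chars; i++) {
        if (combos > limit / alphabet)
            return false;  // the next multiply would exceed the limit
        combos *= alphabet;
    }
    return true;
}
```

With this helper, `base_fits(40, 9, 48)` is true while `base_fits(41, 9, 48)` is false, which matches the 47.90 and 48.22 bit figures in the list above.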
## Callsign Formats
Government-issued callsigns should encode directly with no changes.
### Multiple Stations
To allow for multiple stations by the same operator, we borrow the use of the '-' character from AX.25. A callsign such as "KR6ZY-1" is considered a different station than "KR6ZY-2" or even "KR6ZY", but it is understood that these all belong to the same operator, "KR6ZY."
### Temporary Modifiers
Similarly, suffixes are often added to callsigns to indicate temporary changes of status, such as "KR6ZY/M" for a mobile station, or "KR6ZY/AE" to signify that I have Amateur Extra operating privileges even though the FCC database may not yet be updated. So the '/' is included in the base40 alphabet.
The difference between '-' and '/' is that callsigns differing by a '-' suffix are considered different stations, but callsigns differing by a '/' suffix are NOT. A '/' suffix is considered to be a temporary modification to the same station. **TODO** I'm not sure what impact this actually has.
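
One possible reading of these rules, sketched in C: two callsigns name the same station when they match up to the first '/', and the same operator when they match up to the first '-' or '/'. The function names are hypothetical, and the sketch assumes modifiers appear only as suffixes, as in the examples above.

```c
#include <stdbool.h>
#include <string.h>

// Length of the station part: everything before the first '/'.
static size_t station_len(const char *callsign) {
    return strcspn(callsign, "/");
}

// Length of the operator part: everything before the first '-' or '/'.
static size_t operator_len(const char *callsign) {
    return strcspn(callsign, "-/");
}

// Hypothetical: same station once any '/' modifier is dropped.
bool same_station(const char *a, const char *b) {
    size_t la = station_len(a), lb = station_len(b);
    return la == lb && strncmp(a, b, la) == 0;
}

// Hypothetical: same operator once '-' and '/' suffixes are dropped.
bool same_operator(const char *a, const char *b) {
    size_t la = operator_len(a), lb = operator_len(b);
    return la == lb && strncmp(a, b, la) == 0;
}
```

Under this reading, "KR6ZY/M" and "KR6ZY" are the same station, while "KR6ZY-1" and "KR6ZY-2" are different stations belonging to the same operator.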
### Interoperability
It may be desirable to bridge information between M17 and other networks. The 9 character base40 encoding allows for this:
**TODO** Define more interoperability standards here. System Fusion? P25? IRLP? AllStar?
#### DMR
DMR unfortunately doesn't have a guaranteed single name space. Individual IDs are reasonably well recognized to be managed by https://www.radioid.net/database/search#! but Talk Groups are much less well managed. Talk Group XYZ on Brandmeister may be (and often is) different than Talk Group XYZ on a private cBridge system.
* DMR IDs are encoded as `D<number>`, e.g. `D3106728` for KR6ZY
* DMR Talk Groups are encoded by their network. Currently, the following networks are defined:
  * Brandmeister: `BM<number>`, e.g. `BM31075`
  * More networks to be defined here.
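
The prefix scheme above could be applied with a small formatter that also checks the 9 character limit of a base40 address. This is a sketch only: the function names and the `M17_CALLSIGN_MAX` constant are hypothetical, and the resulting string would still need to be passed through the base40 encoder.

```c
#include <stdint.h>
#include <stdio.h>

#define M17_CALLSIGN_MAX 9  // a base40 address holds at most 9 characters

// Hypothetical helper: writes "<prefix><number>" into out, which must hold at
// least M17_CALLSIGN_MAX + 1 bytes. Returns 0 on success, or -1 if the text
// would be longer than 9 characters (out is then truncated and unusable).
static int format_prefixed_id(char *out, const char *prefix, uint32_t number) {
    int n = snprintf(out, M17_CALLSIGN_MAX + 1, "%s%lu",
                     prefix, (unsigned long)number);
    return (n > 0 && n <= M17_CALLSIGN_MAX) ? 0 : -1;
}

// DMR individual ID: "D<number>"
int format_dmr_id(char *out, uint32_t dmr_id) {
    return format_prefixed_id(out, "D", dmr_id);
}

// Brandmeister talk group: "BM<number>"
int format_bm_talkgroup(char *out, uint32_t talkgroup) {
    return format_prefixed_id(out, "BM", talkgroup);
}
```

Given a 10 byte buffer, `format_dmr_id(buf, 3106728)` produces `D3106728` and `format_bm_talkgroup(buf, 31075)` produces `BM31075`, the two examples in the list above.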
#### D-Star
D-Star reflectors have well-defined names of the form REFxxxY, which are encoded directly into base40.
**TODO** Individuals? Just callsigns?
#### Interoperability Challenges
* We'll need to provide a source ID on the other network. Not sure how to do that, and it'll probably be unique for each network we want to interoperate with. Maybe write the DMR/BM gateway to automatically look up a callsign in the DMR database and map it to a DMR ID? Just thinking out loud.
* We will have to transcode CODEC2 to whatever the other network uses (pretty much AMBE of one flavor or another). I'd be curious to see how that sounds.