kopia lustrzana https://github.com/mobilinkd/m17-cxx-demod
Add Jupyter notebook to compute OPV numerology
rodzic
354504b496
commit
50510b0eae
|
@ -0,0 +1,303 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Opulent Voice Numerology\n",
|
||||
"\n",
|
||||
"For now, the opv-cxx-demod package supports a single fixed set of parameter choices (one \"numerology\"). This notebook computes them and documents the process of choosing them.\n",
|
||||
"\n",
|
||||
"## Nomenclature\n",
|
||||
"`Opus` is the chosen vocoder; `OPV` is short for ***Opulent Voice*** (see what we did there?) and refers to the overall framing, coding, and modulation scheme.\n",
|
||||
"\n",
|
||||
"Following M17, we define the following types of bits:\n",
|
||||
"| Type | Description |\n",
|
||||
"|-------|-----------------------------------------------|\n",
|
||||
"| type1 | initial data bits |\n",
|
||||
"| type2 | bits after encoding |\n",
|
||||
"| type3 | encoded bits after puncturing (if applicable) |\n",
|
||||
"| type4 | decorrelated and interleaved type3 bits |\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import math\n",
|
||||
"import sympy"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Vocoder Bit Rate\n",
|
||||
"\n",
|
||||
"`opus_bitrate` is the fixed bit rate the vocoder is configured to use. For now, we are operating Opus in CBR (constant bit rate) mode, so it always uses exactly this bit rate. It would also be possible to operate Opus in VBR (variable bit rate) mode, in which case this becomes the maximum allowed bit rate (and any unused bits can be made available for other purposes).\n",
|
||||
"The bit rate is chosen to achieve the desired voice quality. Opus recommends a bit rate of 16k to 20k for \"Wideband\" audio (8 kHz bandwidth)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"opus_bitrate = 16000 # bits/second"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Vocoder Frame Duration\n",
|
||||
"\n",
|
||||
"`opus_frame_duration` is the length of the vocoder frame in seconds. This must be chosen from Opus's list of supported frame sizes: 2.5, 5, 10, 20, 40, or 60 ms."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"opus_frame_duration = 20 * 0.001\n",
|
||||
"opus_frame_type1_bytes = int(opus_frame_duration * opus_bitrate) / 8\n",
|
||||
"assert opus_frame_type1_bytes == int(opus_frame_type1_bytes)\n",
|
||||
"opus_frame_type1_bytes = int(opus_frame_type1_bytes)\n",
|
||||
"opus_frame_duration, opus_frame_type1_bytes"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Vocoder Frames per Channel Frame\n",
|
||||
"\n",
|
||||
"`opus_frames_per_opv_frame` is the number of vocoder frames packed into each stream-mode frame defined for Opulent Voice."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"opus_frames_per_opv_frame = 2\n",
|
||||
"opv_frame_duration = opus_frames_per_opv_frame * opus_frame_duration\n",
|
||||
"opv_frame_duration"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### PCM Sample Rate\n",
|
||||
"The input to the voice encoder and the output from the voice decoder are raw streams of 16-bit unsigned integer samples, single channel, at a predetermined sample rate. This rate might vary from implementation to implementation, but 8000 samples/second is a typical value for speech. The rest of the numerology does not depend on the PCM sample rate."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"audio_sample_rate = 8000 # 16-bit samples/second\n",
|
||||
"audio_samples_per_frame = int(audio_sample_rate * opus_frame_duration)\n",
|
||||
"audio_samples_per_frame"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Physical Layer Header\n",
|
||||
"For now, the physical header is a black box of a certain size."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"plheader_type1_bytes = 15 # multiples of 3 fill the Golay code\n",
|
||||
"plheader_type1_bits = plheader_type1_bytes * 8\n",
|
||||
"plheader_type2_bits = plheader_type1_bits * 2 # 12,24 Golay codes, rate 1/2\n",
|
||||
"plheader_type3_bits = plheader_type2_bits # no puncturing for plheader\n",
|
||||
"plheader_type3_bits"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Payload (Vocoder Data in Stream Mode)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"raw_payload_bits = opus_bitrate * opv_frame_duration\n",
|
||||
"payload_type1_bits = raw_payload_bits + 4 # convolutional encoder tail\n",
|
||||
"payload_type2_bits = payload_type1_bits * 2 # Rate 1/2 convolutional code\n",
|
||||
"\n",
|
||||
"# Puncturing is 11/12, in this pattern: 111111111110, so we remove one bit\n",
|
||||
"# for every whole multiple of 12 type2 bits.\n",
|
||||
"num_punctured_bits = math.floor(payload_type2_bits/12)\n",
|
||||
"\n",
|
||||
"# The result may be an inconvenient number. We'd like it to be a multiple of 8.\n",
|
||||
"# So, after puncturing, we may add a few zero bits of padding.\n",
|
||||
"payload_type3_bits = math.ceil((payload_type2_bits - num_punctured_bits)/8)*8\n",
|
||||
"payload_type3_padding = int(payload_type3_bits - (payload_type2_bits - num_punctured_bits))\n",
|
||||
"payload_type3_bits, payload_type3_padding"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Combined Type 3 Frame"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"opv_frame_type3_bits = plheader_type3_bits + payload_type3_bits\n",
|
||||
"opv_frame_type3_bytes = opv_frame_type3_bits / 8\n",
|
||||
"assert opv_frame_type3_bits == opv_frame_type3_bytes * 8\n",
|
||||
"opv_frame_type3_bytes"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Transmitted Symbol Rate for Voice Stream Mode\n",
|
||||
"The combined type3 frame is prefixed with a 16-bit sync word. The transmitted symbol rate is chosen so that the prefixed frames come out at the correct frame rate."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"opv_bit_rate = (16 + opv_frame_type3_bits) / opv_frame_duration\n",
|
||||
"opv_symbol_rate = opv_bit_rate / 2 # 4FSK, so 2 bits per symbol\n",
|
||||
"opv_symbol_rate"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## BERT Mode\n",
|
||||
"To facilitate testing, we have a Bit Error Rate Testing (BERT) mode. A fixed pseudorandom sequence is broken up into frames, with no header added. Thus, the BERT data takes up the space that would be allocated to the plheader as well as the payload in a stream-mode voice frame. A prime number is chosen as the number of bits in each frame, so that each frame is unique over a relatively long period of time.\n",
|
||||
"\n",
|
||||
"The BERT data is convolutionally encoded and punctured in the same way as payload data in stream-mode voice. In order to make this convenient, up to three additional encoder tail bits may be added to bring the encoder input to an integer number of bytes. Then, after puncturing, further padding bits may be needed to bring the BERT-mode frame size to exactly match the stream-mode voice frame size.\n",
|
||||
"\n",
|
||||
"So, the prime number chosen is is the largest prime that will result in a type3 frame size less than or equal to that of a stream-mode voice frame."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"desired_type3_size = opv_frame_type3_bits\n",
|
||||
"\n",
|
||||
"# approximate the type2 size by undoing the puncture ratio\n",
|
||||
"est_type2_size = math.floor((12/11) * desired_type3_size)\n",
|
||||
"\n",
|
||||
"# the actual type2 size must be a multiple of 16, since we want the\n",
|
||||
"# type1 size (including the encoder tail) to be an integer number\n",
|
||||
"# of bytes and then the encoder doubles the length\n",
|
||||
"bert_type2_bits = est_type2_size - (est_type2_size % 16)\n",
|
||||
"\n",
|
||||
"# Puncturing is 11/12, in this pattern: 111111111110, so we remove one bit\n",
|
||||
"# for every whole multiple of 12 type2 bits.\n",
|
||||
"num_punctured_bits = math.floor(bert_type2_bits/12)\n",
|
||||
"bert_punctured_size = bert_type2_bits - num_punctured_bits\n",
|
||||
"bert_postpuncture_padding = desired_type3_size - bert_punctured_size\n",
|
||||
"\n",
|
||||
"type1_size = int(est_type2_size/2) - 4 # un-encode and remove tail bits\n",
|
||||
"bert_preencode_padding = 0\n",
|
||||
"while not sympy.isprime(type1_size):\n",
|
||||
" type1_size -= 1\n",
|
||||
" bert_preencode_padding += 1\n",
|
||||
"bert_bits_per_frame = type1_size\n",
|
||||
"\n",
|
||||
"(bert_bits_per_frame, bert_preencode_padding, bert_postpuncture_padding, bert_type2_bits)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Creating Numerology.h\n",
|
||||
"The following can be copied and pasted into source file `Numerology.h` to define everything we've calculated. Names have been kept from the current draft source code, though they are not wholely consistent."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"print(f' const int opus_bitrate = {opus_bitrate};')\n",
|
||||
"print(f' const int encoded_plheader_size = {plheader_type3_bits};')\n",
|
||||
"print(f' const int punctured_payload_size = {payload_type3_bits};')\n",
|
||||
"print(f' const int frame_size_bits = {opv_frame_type3_bits};')\n",
|
||||
"print(f' const int frame_size_bytes = {opv_frame_type3_bytes};')\n",
|
||||
"print(f' const int audio_sample_rate = {audio_sample_rate};')\n",
|
||||
"print(f' const int audio_frame_size = {audio_samples_per_frame};')\n",
|
||||
"print(f' const int opus_frame_size_bytes = {opus_frame_type1_bytes};')\n",
|
||||
"# wrong in C++ code! print(f' const int audio_payload_bits = {};')\n",
|
||||
"print(f' const int symbol_rate = {opv_symbol_rate};')\n",
|
||||
"print(f' const int bert_bits_per_frame = {bert_bits_per_frame};')\n",
|
||||
"print(f' const int bert_extra_bits = {bert_preencode_padding};')\n",
|
||||
"print(f' const int bert_encoded_size = {bert_type2_bits};')\n",
|
||||
"print(f' const int bert_punctured_size = {bert_punctured_size};')\n",
|
||||
"print(f' const int bert_postpuncture_padding = {bert_postpuncture_padding};')"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3.8.9 64-bit",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.8.10"
|
||||
},
|
||||
"orig_nbformat": 4,
|
||||
"vscode": {
|
||||
"interpreter": {
|
||||
"hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
|
||||
}
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
Ładowanie…
Reference in New Issue