2020-02-10 17:52:22 +00:00
|
|
|
# Serialisation
|
|
|
|
|
|
|
|
These notes are a discussion of the serialisation libraries available to
|
|
|
|
MicroPython plus a tutorial on the use of a library supporting Google Protocol
|
|
|
|
Buffers (here abbreviated to `protobuf`). The aim is not to replace official
|
|
|
|
documentation but to illustrate the relative merits and drawbacks of the
|
|
|
|
various approaches.
|
|
|
|
|
|
|
|
##### [Main readme](./README.md)
|
|
|
|
|
|
|
|
# 1. The problem
|
|
|
|
|
|
|
|
The need for serialisation arises whenever data must be stored on disk or
|
|
|
|
communicated over an interface such as a socket, a UART or such interfaces as
|
|
|
|
I2C or SPI. All these require the data to be presented as linear sequences of
|
|
|
|
bytes. The problem is how to convert an arbitrary Python object to such a
|
|
|
|
sequence, and how subsequently to restore the object.
|
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
There are numerous standards for achieving this, five of which are readily
|
|
|
|
available to MicroPython. Each has its own advantages and drawbacks. In two
|
|
|
|
cases the encoded strings aim to be human readable and comprise ASCII
|
|
|
|
characters. In the others they comprise binary `bytes` objects where bytes can
|
|
|
|
take all possible values. The following are the formats with MicroPython
|
|
|
|
support:
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
1. ujson (ASCII, official)
|
|
|
|
2. pickle (ASCII, official)
|
|
|
|
3. ustruct (binary, official)
|
2021-07-29 10:53:52 +00:00
|
|
|
4. MessagePack [binary, unofficial](https://github.com/peterhinch/micropython-msgpack)
|
|
|
|
5. protobuf [binary, unofficial](https://github.com/dogtopus/minipb)
|
|
|
|
|
|
|
|
The `ujson` and `pickle` formats produce human-readable byte sequences. These
|
|
|
|
aid debugging. The use of ASCII data means that a delimiter can be used to
|
|
|
|
identify the end of a message. This is because it is possible to guarantee that
|
|
|
|
the delimiter will never occur within a message. A delimiter cannot be used
|
|
|
|
with binary formats because a message byte can take all possible values
|
|
|
|
including that of the delimiter. The drawback of ASCII formats is inefficiency:
|
|
|
|
the byte sequences are relatively long.
|
|
|
|
|
|
|
|
Numbers 1, 2 and 4 are self-describing: the format includes a definition of its
|
2020-02-10 17:52:22 +00:00
|
|
|
structure. This means that the decoding process can re-create the object in the
|
|
|
|
absence of information on its structure, which may therefore change at runtime.
|
2021-07-29 11:04:18 +00:00
|
|
|
Self describing formats inevitably are variable length. This is no problem
|
|
|
|
where data is being saved to file, but if it is being communicated across a
|
|
|
|
link the receiving process needs a means to determine when a complete message
|
|
|
|
has been received. In the case of ASCII formats a delimiter may be used but in
|
|
|
|
the case of `MessagePack` this presents something of a challenge.
|
2021-07-29 10:53:52 +00:00
|
|
|
|
|
|
|
The `ustruct` format is binary: the byte sequence comprises binary data which
|
|
|
|
is neither human readable nor self-describing. The problem of message framing
|
|
|
|
is solved by hard coding a fixed message structure and length which is known to
|
|
|
|
transmitter and receiver. In simple cases of fixed format data, `ustruct`
|
|
|
|
provides a simple, efficient solution.
|
|
|
|
|
|
|
|
In `protobuf` and `MessagePack` messages are variable length; both can handle
|
|
|
|
data whose length varies at runtime. `MessagePack` also allows the message
|
|
|
|
structure to change at runtime. It is also extensible to enable the efficient
|
|
|
|
coding of additional Python types or instances of user defined classes.
|
|
|
|
|
|
|
|
The `protobuf` standard requires transmitter and receiver to share a schema
|
|
|
|
which defines the message structure. Message length may change at runtime, but
|
|
|
|
structure may not.
|
|
|
|
|
|
|
|
## 1.1 Transmission over unreliable links
|
|
|
|
|
|
|
|
Consider a system where a transmitter periodically sends messages to a receiver
|
|
|
|
over a communication link. An aspect of the message framing problem arises if
|
|
|
|
that link is unreliable, meaning that bytes may be lost or corrupted in
|
|
|
|
transit. In the case of ASCII formats with a delimiter the receiver, once it
|
|
|
|
has detected the problem, can discard characters until the delimiter is
|
|
|
|
received and then wait for a complete message.
|
|
|
|
|
|
|
|
In the case of binary formats it is generally impossible to re-synchronise to a
|
|
|
|
continuous stream of data. In the case of regular bursts of data a timeout can
|
|
|
|
be used. Otherwise "out of band" signalling is required where the receiver
|
|
|
|
signals the transmitter to request retransmission.
|
|
|
|
|
|
|
|
## 1.2 Concurrency
|
|
|
|
|
|
|
|
In `uasyncio` systems the transmitter presents no problem. A message is created
|
|
|
|
using synchronous code, then transmitted using asynchronous code typically with
|
|
|
|
a `StreamWriter`. In the case of ASCII protocols a delimiter - usually `b"\n"`
|
|
|
|
is appended.
|
|
|
|
|
|
|
|
In the case of ASCII protocols the receiver can use `StreamReader.readline()`
|
|
|
|
to await a complete message.
|
2020-02-10 17:52:22 +00:00
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
`ustruct` also presents a simple case in that the number of expected bytes is
|
|
|
|
known to the receiver which simply awaits that number.
|
2020-02-10 17:52:22 +00:00
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
The variable length binary protocols present a difficulty in that the message
|
|
|
|
length is unknown in advance. A solution is available for `MessagePack`.
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
# 2. ujson and pickle
|
|
|
|
|
|
|
|
These are very similar. `ujson` is documented
|
|
|
|
[here](http://docs.micropython.org/en/latest/library/ujson.html). `pickle` has
|
|
|
|
identical methods so this doc may be used for both.
|
|
|
|
|
|
|
|
The advantage of `ujson` is that JSON strings can be accepted by CPython and by
|
|
|
|
other languages. The drawback is that only a subset of Python object types can
|
|
|
|
be converted to legal JSON strings; this is a limitation of the
|
|
|
|
[JSON specification](http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf).
|
|
|
|
|
|
|
|
The advantage of `pickle` is that it will accept any Python object except for
|
|
|
|
instances of user defined classes. The extremely simple source may be found in
|
2021-09-28 06:16:12 +00:00
|
|
|
[the official library](https://github.com/micropython/micropython-lib/tree/master/python-stdlib/pickle).
|
2020-02-10 17:52:22 +00:00
|
|
|
The strings produced are incompatible with CPython's `pickle`, but can be
|
|
|
|
decoded in CPython by using the MicroPython decoder. There is a
|
|
|
|
[bug](https://github.com/micropython/micropython/issues/2280) in the
|
|
|
|
MicroPython implementation when running under MicroPython. A workround consists
|
|
|
|
of never encoding short strings which change repeatedly.
|
|
|
|
|
|
|
|
## 2.1 Usage examples
|
|
|
|
|
|
|
|
These may be copy-pasted to the MicroPython REPL.
|
|
|
|
Pickle:
|
|
|
|
```python
|
|
|
|
import pickle
|
|
|
|
data = {1:'test', 2:1.414, 3: [11, 12, 13]}
|
|
|
|
s = pickle.dumps(data)
|
|
|
|
print('Human readable data:', s)
|
|
|
|
v = pickle.loads(s)
|
|
|
|
print('Decoded data (partial):', v[3])
|
|
|
|
```
|
|
|
|
JSON. Note that dictionary keys must be strings:
|
|
|
|
```python
|
|
|
|
import ujson
|
|
|
|
data = {'1':'test', '2':1.414, '3': [11, 12, 13]}
|
|
|
|
s = ujson.dumps(data)
|
|
|
|
print('Human readable data:', s)
|
|
|
|
v = ujson.loads(s)
|
|
|
|
print('Decoded data (partial):', v['3'])
|
|
|
|
```
|
|
|
|
|
|
|
|
## 2.2 Strings are variable length
|
|
|
|
|
|
|
|
In real applications the data, and hence the string length, vary at runtime.
|
|
|
|
The receiving process needs to know when a complete string has been received or
|
|
|
|
read from a file. In practice `ujson` and `pickle` do not include newline
|
|
|
|
characters in encoded strings. If the data being encoded includes a newline, it
|
|
|
|
is escaped in the string:
|
|
|
|
```python
|
|
|
|
import ujson
|
|
|
|
data = {'1':b'test\nmore', '2':1.414, '3': [11, 12, 13]}
|
|
|
|
s = ujson.dumps(data)
|
|
|
|
print('Human readable data:', s)
|
|
|
|
v = ujson.loads(s)
|
|
|
|
print('Decoded data (partial):', v['1'])
|
|
|
|
```
|
|
|
|
If this is pasted at the REPL you will observe that the human readable data
|
|
|
|
does not have a line break (while the decoded data does). Output:
|
|
|
|
```
|
|
|
|
Human readable data: {"2": 1.414, "3": [11, 12, 13], "1": "test\nmore"}
|
|
|
|
Decoded data (partial): test
|
|
|
|
more
|
|
|
|
```
|
|
|
|
Consequently encoded strings may be separated with `'\n'` before saving and
|
2020-02-10 18:17:37 +00:00
|
|
|
reading may be done using `readline` methods as in this code fragment where
|
|
|
|
`u` is a UART instance:
|
|
|
|
|
|
|
|
```python
|
|
|
|
s = ujson.dumps(data)
|
|
|
|
# pickle produces a bytes object whereas ujson produces a string
|
|
|
|
# In the case of ujson it is probably wise to convert to bytes
|
|
|
|
u.write(s.encode())
|
|
|
|
# Pickle:
|
|
|
|
# u.write(s)
|
|
|
|
u.write(b'\n')
|
|
|
|
|
|
|
|
# receiver
|
|
|
|
s = u.readline()
|
|
|
|
v = ujson.loads(s) # ujson can cope with bytes object
|
|
|
|
# ujson and pickle can cope with trailing newline.
|
|
|
|
```
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
# 3. ustruct
|
|
|
|
|
|
|
|
This is documented
|
|
|
|
[here](http://docs.micropython.org/en/latest/library/ustruct.html). The binary
|
|
|
|
format is efficient, but the format of a sequence cannot change at runtime and
|
|
|
|
must be "known" to the decoding process. Records are of fixed length. If data
|
|
|
|
is to be stored in a binary random access file, the fixed record size means
|
|
|
|
that the offset of a given record may readily be calculated.
|
|
|
|
|
|
|
|
Write a 100 record file. Each record comprises three 32-bit integers:
|
|
|
|
```python
|
|
|
|
import ustruct
|
|
|
|
fmt = 'iii' # Record format: 3 signed ints
|
|
|
|
rlen = ustruct.calcsize(fmt) # Record length
|
|
|
|
buf = bytearray(rlen)
|
|
|
|
with open('myfile', 'wb') as f:
|
|
|
|
for x in range(100):
|
|
|
|
y = x * x
|
|
|
|
z = x * 10
|
|
|
|
ustruct.pack_into(fmt, buf, 0, x, y, z)
|
|
|
|
f.write(buf)
|
|
|
|
```
|
|
|
|
Read record no. 10 from that file:
|
|
|
|
```python
|
|
|
|
import ustruct
|
|
|
|
fmt = 'iii'
|
|
|
|
rlen = ustruct.calcsize(fmt) # Record length
|
|
|
|
buf = bytearray(rlen)
|
|
|
|
rnum = 10 # Record no.
|
|
|
|
with open('myfile', 'rb') as f:
|
|
|
|
f.seek(rnum * rlen)
|
|
|
|
f.readinto(buf)
|
|
|
|
result = ustruct.unpack_from(fmt, buf)
|
|
|
|
print(result)
|
|
|
|
```
|
|
|
|
Owing to the fixed record length, integers must be constrained to fit the
|
|
|
|
length declared in the format string.
|
|
|
|
|
|
|
|
Binary formats cannot use delimiters as any delimiter character may be present
|
|
|
|
in the data - however the fixed length of `ustruct` records means that this is
|
|
|
|
not a problem.
|
|
|
|
|
|
|
|
For performance oriented applications, `ustruct` is the only serialisation
|
|
|
|
approach which can be used in a non-allocating fashion, by using pre-allocated
|
|
|
|
buffers as in the above example.
|
|
|
|
|
|
|
|
## 3.1 Strings
|
|
|
|
|
|
|
|
In `ustruct` the `s` data type is normally prefixed by a length (defaulting to
|
|
|
|
1). This ensures that records are of fixed size, but is potentially inefficient
|
|
|
|
as shorter strings will still occupy the same amount of space. Longer strings
|
|
|
|
will silently be truncated. Short strings are packed with zeros.
|
|
|
|
|
|
|
|
```python
|
|
|
|
import ustruct
|
|
|
|
fmt = 'ii30s'
|
|
|
|
rlen = ustruct.calcsize(fmt) # Record length
|
|
|
|
buf = bytearray(rlen)
|
|
|
|
ustruct.pack_into(fmt, buf, 0, 11, 22, 'the quick brown fox')
|
|
|
|
ustruct.unpack_from(fmt, buf)
|
|
|
|
ustruct.pack_into(fmt, buf, 0, 11, 22, 'rats')
|
|
|
|
ustruct.unpack_from(fmt, buf) # Packed with zeros
|
|
|
|
ustruct.pack_into(fmt, buf, 0, 11, 22, 'the quick brown fox jumps over the lazy dog')
|
|
|
|
ustruct.unpack_from(fmt, buf) # Truncation
|
|
|
|
```
|
|
|
|
Output:
|
|
|
|
```python
|
|
|
|
(11, 22, b'the quick brown fox\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
|
|
|
|
(11, 22, b'rats\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
|
|
|
|
(11, 22, b'the quick brown fox jumps over')
|
|
|
|
```
|
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
# 4. MessagePack
|
|
|
|
|
|
|
|
Of the binary formats this is the easiest to use and can be a "drop in"
|
|
|
|
replacement for `ujson` as it supports the same four methods `dump`, `dumps`,
|
|
|
|
`load` and `loads`. An application might initially be developed with `ujson`,
|
|
|
|
the protocol being changed to `MessagePack` later. Creation of a `MessagePack`
|
|
|
|
string can be done with:
|
|
|
|
```python
|
|
|
|
import umsgpack
|
|
|
|
obj = [1.23, 2.56, 89000]
|
|
|
|
msg = umsgpack.dumps(obj) # msg is a bytes object
|
|
|
|
```
|
|
|
|
Retrieval of the object is as follows:
|
|
|
|
```python
|
|
|
|
import umsgpack
|
|
|
|
# Retrieve the message msg
|
|
|
|
obj = umsgpack.dumps(msg)
|
|
|
|
```
|
|
|
|
An ingenious feature of the standard is its extensibility. This can be used to
|
|
|
|
add support for additional Python types or user defined classes. This example
|
|
|
|
shows `complex` data being supported as if it were a native type:
|
|
|
|
```python
|
|
|
|
import umsgpack
|
|
|
|
from umsgpack_ext import mpext
|
|
|
|
with open('data', 'wb') as f:
|
|
|
|
umsgpack.dump(mpext(1 + 4j), f) # mpext() handles extension type
|
|
|
|
```
|
|
|
|
Reading back:
|
|
|
|
```python
|
|
|
|
import umsgpack
|
|
|
|
import umsgpack_ext # Decoder only needs access to this module
|
|
|
|
with open('data', 'rb') as f:
|
|
|
|
z = umsgpack.load(f)
|
|
|
|
print(z) # z is complex
|
|
|
|
```
|
|
|
|
Please see [this repo](https://github.com/peterhinch/micropython-msgpack). The
|
|
|
|
docs include references to the standard and to other implementations. The repo
|
|
|
|
includes an asynchronous receiver which enables incoming messages to be decoded
|
|
|
|
as they arrive while allowing other tasks to run concurrently.
|
|
|
|
|
|
|
|
# 5. Protocol Buffers
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
This is a [Google standard](https://developers.google.com/protocol-buffers/)
|
|
|
|
described in [this Wikipedia article](https://en.wikipedia.org/wiki/Protocol_Buffers).
|
|
|
|
The aim is to provide a language independent, efficient, binary data interface.
|
|
|
|
Records are variable length, and strings and integers of arbitrary size may be
|
|
|
|
accommodated. The
|
|
|
|
[implementation compatible with MicroPython](https://github.com/dogtopus/minipb)
|
|
|
|
is a "micro" implementation: `.proto` files are not supported. However the data
|
|
|
|
format aims to be a subset of the Google standard and claims compatibility with
|
|
|
|
other platforms and languages.
|
|
|
|
|
|
|
|
The principal benefit to developers using only CPython/MicroPython is its
|
|
|
|
efficient support for fields whose length varies at runtime. To my knowledge it
|
|
|
|
is the sole solution for encoding such data in a compact binary format.
|
|
|
|
|
|
|
|
The following notes should be read in conjunction with the official docs. The
|
|
|
|
notes aim to reduce the learning curve which I found a little challenging.
|
|
|
|
|
|
|
|
In normal use the object transmitted by `minipb` will be a `dict` with entries
|
|
|
|
having various predefined data types. Entries may be objects of variable length
|
|
|
|
including strings, lists and other `dict` instances. The structure of the
|
|
|
|
`dict` is defined using a `schema`. Sender and receiver share the `schema` with
|
|
|
|
each script using it to instantiate the `Wire` class. The `Wire` instance is
|
|
|
|
then repeatedly invoked to encode or decode the data.
|
|
|
|
|
|
|
|
The `schema` is a `tuple` defining the structure of the data `dict`. Each
|
|
|
|
element declares a key and its data type in an inner `tuple`. Elements of this
|
|
|
|
inner `tuple` are strings, with element 0 defining the field's key. Subsequent
|
|
|
|
elements define the field's data type; in most cases the data type is defined
|
|
|
|
by a single string.
|
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
## 5.1 Installation
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
The library comprises a single file `minipb.py`. It has a dependency, the
|
2023-03-06 04:30:45 +00:00
|
|
|
`logging` module `logging.py` and `bisect` module `bisect.py` which may be found in
|
2020-02-10 17:52:22 +00:00
|
|
|
[micropython-lib](https://github.com/micropython/micropython-lib/tree/master/logging).
|
|
|
|
On RAM constrained platforms `minipb.py` may be cross-compiled or frozen as
|
|
|
|
bytecode for even lower RAM consumption.
|
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
## 5.2 Data types
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
These are listed in
|
|
|
|
[the docs](https://github.com/dogtopus/minipb/wiki/Schema-Representations).
|
|
|
|
Many of these are intended to maximise compatibility with the native data types
|
|
|
|
of other languages. Where data will only be accessed by CPython or MicroPython,
|
|
|
|
a subset may be used which maps onto Python data types:
|
|
|
|
|
|
|
|
1. 'U' A UTF8 encoded string.
|
|
|
|
2. 'a' A `bytes` object.
|
|
|
|
3. 'b' A `bool`.
|
|
|
|
4. 'f' A `float` A 32-bit float: the usual MicroPython default.
|
|
|
|
5. 'z' An `int`: a signed arbitrary length integer. Efficiently encoded with
|
|
|
|
an ingenious algorithm.
|
|
|
|
6. 'd' A double precision 64-bit float. The default on Pyboard D SF6. Also on
|
|
|
|
other platforms with special firmware builds.
|
2023-03-06 04:30:45 +00:00
|
|
|
7. 'x' An empty field.
|
2020-02-10 17:52:22 +00:00
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
## 5.2.1 Required and Optional fields
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
If a field is prefixed with `*` it is a `required` field, otherwise it is
|
|
|
|
optional. The field must still exist in the data: the only difference is that
|
|
|
|
a `required` field cannot be set to `None`. Optional fields can be useful,
|
|
|
|
notably for boolean types which can then represent three states.
|
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
## 5.3 Application design
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
The following is a minimal example which can be pasted at the REPL:
|
|
|
|
```python
|
|
|
|
import minipb
|
|
|
|
|
|
|
|
schema = (('value', 'z'),) # Dict will hold a single integer
|
|
|
|
w = minipb.Wire(schema)
|
|
|
|
|
|
|
|
data = {'value': 0}
|
|
|
|
data['value'] = 150
|
|
|
|
tx = w.encode(data)
|
|
|
|
rx = w.decode(tx) # received data
|
|
|
|
print(rx)
|
|
|
|
```
|
|
|
|
This example glosses over the fact that in a real application the data will
|
|
|
|
change and the length of the transmitted string `tx` will vary. The receiving
|
|
|
|
process needs to know the length of each string. Note that a consequence of the
|
|
|
|
binary format is that delimiters cannot be used. The length of each record must
|
|
|
|
be established and made available to the receiver. In the case where data is
|
|
|
|
being saved to a binary file, the file will need an index. Where data is to
|
|
|
|
be transmitted over and interface each string should be prepended with a fixed
|
|
|
|
length "size" field. The following example illustrates this.
|
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
## 5.4 Transmitter/Receiver example
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
These examples can't be cut and pasted at the REPL as they assume `send(n)` and
|
|
|
|
`receive(n)` functions which access the interface.
|
|
|
|
|
|
|
|
Sender example:
|
|
|
|
```python
|
|
|
|
import minipb
|
|
|
|
schema = (('value', 'z'),
|
|
|
|
('float', 'f'),
|
|
|
|
('signed', 'z'),)
|
|
|
|
w = minipb.Wire(schema)
|
|
|
|
# Create a dict to hold the data
|
|
|
|
data = {'value': 0,
|
|
|
|
'float': 0.0,
|
|
|
|
'signed' : 0,}
|
|
|
|
while True:
|
|
|
|
# Update values then encode and transmit them, e.g.
|
|
|
|
# data['signed'] = get_signed_value()
|
|
|
|
tx = w.encode(data)
|
|
|
|
# Data lengths may change on each iteration
|
|
|
|
# here we encode the length in a single byte
|
|
|
|
dlen = len(tx).to_bytes(1, 'little')
|
|
|
|
send(dlen)
|
|
|
|
send(tx)
|
|
|
|
```
|
|
|
|
Receiver example:
|
|
|
|
```python
|
|
|
|
import minipb
|
|
|
|
# schema must match transmitter. Typically both would import this.
|
|
|
|
schema = (('value', 'z'),
|
|
|
|
('float', 'f'),
|
|
|
|
('signed', 'z'),)
|
|
|
|
|
|
|
|
w = minipb.Wire(schema)
|
|
|
|
while True:
|
|
|
|
dlen = receive(1) # Data length stored in 1 byte
|
|
|
|
data = receive(dlen) # Retrieve actual data
|
|
|
|
rx = w.decode(data)
|
|
|
|
# Do something with the received dict
|
|
|
|
```
|
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
## 5.5 Repeating fields
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
This feature enables variable length lists to be encoded. List elements must
|
|
|
|
all be of the same (declared) data type. In this example the `value` and `txt`
|
|
|
|
fields are variable length lists denoted by the `'+'` prefix. The `value` field
|
|
|
|
holds a list of `int` values and `txt` holds strings:
|
|
|
|
```python
|
|
|
|
import minipb
|
|
|
|
schema = (('value', '+z'),
|
|
|
|
('float', 'f'),
|
|
|
|
('txt', '+U'),
|
|
|
|
)
|
|
|
|
w = minipb.Wire(schema)
|
|
|
|
|
|
|
|
data = {'value': [150, 123, 456],
|
|
|
|
'float': 1.23,
|
|
|
|
'txt' : ['abc', 'def', 'ghi'],
|
|
|
|
}
|
|
|
|
tx = w.encode(data)
|
|
|
|
rx = w.decode(tx)
|
|
|
|
print(rx)
|
|
|
|
data['txt'][1] = 'the quick brown fox' # Strings have variable length
|
|
|
|
data['txt'].append('the end') # List has variable length
|
|
|
|
data['value'].append(999) # Variable length
|
|
|
|
tx = w.encode(data)
|
|
|
|
rx = w.decode(tx)
|
|
|
|
print(rx)
|
|
|
|
```
|
2021-07-29 10:53:52 +00:00
|
|
|
### 5.5.1 Packed repeating fields
|
2020-02-10 17:52:22 +00:00
|
|
|
|
2023-03-06 04:30:45 +00:00
|
|
|
This feature reduces some space overhead of encoded message caused by repeatedly
|
|
|
|
emitting field headers as seen in regular repeated fields.
|
|
|
|
|
|
|
|
```python
|
|
|
|
>>> import minipb
|
|
|
|
>>> normal=minipb.Wire('+z')
|
|
|
|
>>> len(normal.encode(range(10000)))
|
|
|
|
31744
|
|
|
|
>>> packed=minipb.Wire('#z')
|
|
|
|
>>> len(packed.encode(range(10000)))
|
|
|
|
21748
|
|
|
|
>>>
|
|
|
|
```
|
|
|
|
|
2020-02-10 17:52:22 +00:00
|
|
|
The author of `minipb` [does not recommend](https://github.com/dogtopus/minipb/issues/6)
|
2023-03-06 04:30:45 +00:00
|
|
|
their use for strings, bytes and nested messages due to compatibility concerns with the
|
|
|
|
official Google Protobuf standards, which disallows such use.
|
2020-02-10 17:52:22 +00:00
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
## 5.6 Message fields (nested dicts)
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
The concept of message fields is a Protocol Buffer notion. In MicroPython
|
|
|
|
terminology a message field contains a `dict` whose contents are defined by
|
|
|
|
another schema. This enables nested dictionaries whose entries may be any valid
|
|
|
|
`protobuf` data type.
|
|
|
|
|
|
|
|
This is illustrated below. The example extends this by making the field a
|
|
|
|
variable length list of `dict` objects (with the `'+['` specifier):
|
|
|
|
```python
|
|
|
|
import minipb
|
|
|
|
# Schema for the nested dictionary instances
|
|
|
|
nested_schema = (('str2', 'U'),
|
|
|
|
('num2', 'z'),)
|
|
|
|
# Outer schema
|
|
|
|
schema = (('number', 'z'),
|
|
|
|
('string', 'U'),
|
2023-03-06 04:30:45 +00:00
|
|
|
('nested', '+[', nested_schema),
|
2020-02-10 17:52:22 +00:00
|
|
|
('num', 'z'),)
|
|
|
|
w = minipb.Wire(schema)
|
|
|
|
|
|
|
|
data = {
|
|
|
|
'number': 123,
|
|
|
|
'string': 'test',
|
|
|
|
'nested': [{'str2': 'string','num2': 888,},
|
|
|
|
{'str2': 'another_string', 'num2': 12345,}, ],
|
|
|
|
'num' : 42
|
|
|
|
}
|
|
|
|
tx = w.encode(data)
|
|
|
|
rx = w.decode(tx)
|
|
|
|
print(rx)
|
|
|
|
print(rx['nested'][0]['str2']) # Access inner dict instances
|
|
|
|
print(rx['nested'][1]['str2'])
|
|
|
|
# Appending to the nested list of dicts
|
|
|
|
data['nested'].append({'str2': 'rats', 'num2':999})
|
|
|
|
tx = w.encode(data)
|
|
|
|
rx = w.decode(tx)
|
|
|
|
print(rx)
|
|
|
|
print(rx['nested'][2]['str2']) # Access inner dict instances
|
|
|
|
```
|
|
|
|
|
2021-07-29 10:53:52 +00:00
|
|
|
### 5.6.1 Recursion
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
This is surely overkill in most MicroPython applications, but for the sake of
|
|
|
|
completeness message fields can be recursive:
|
|
|
|
```python
|
|
|
|
import minipb
|
|
|
|
inner_schema = (('str2', 'U'),
|
|
|
|
('num2', 'z'),)
|
|
|
|
|
2023-03-06 04:30:45 +00:00
|
|
|
nested_schema = (('inner', '+[', inner_schema),)
|
2020-02-10 17:52:22 +00:00
|
|
|
|
|
|
|
schema = (('number', 'z'),
|
|
|
|
('string', 'U'),
|
2023-03-06 04:30:45 +00:00
|
|
|
('nested', '[', nested_schema),
|
2020-02-10 17:52:22 +00:00
|
|
|
('num', 'z'),)
|
|
|
|
|
|
|
|
w = minipb.Wire(schema)
|
|
|
|
|
|
|
|
data = {
|
|
|
|
'number': 123,
|
|
|
|
'string': 'test',
|
|
|
|
'nested': {'inner':({'str2': 'string', 'num2': 888,},
|
|
|
|
{'str2': 'another_string','num2': 12345,}, ),},
|
|
|
|
'num' : 42
|
|
|
|
}
|
|
|
|
tx = w.encode(data)
|
|
|
|
rx = w.decode(tx)
|
|
|
|
print(rx)
|
|
|
|
print(rx['nested']['inner'][0]['str2'])
|
|
|
|
```
|