V0.3 Sparse index implemented.

2019-09-08 17:25:51 +01:00 · 2019-09-08 17:25:51 +01:00 · 52d060737d
commit 52d060737d
--- a/FONT_TO_PY.md
+++ b/FONT_TO_PY.md
@@ -1,18 +1,28 @@
 # font_to_py.py

-Convert a font file to Python source code. The principal reason for doing this
-is to save RAM on resource-limited targets: the font file may be incorporated
-into a firmware build such that it occupies flash memory rather than scarce
-RAM. Python code built into firmware is known as frozen bytecode.
+Convert a font file to Python source code. Python font files provide a much
+faster way to access glyphs than the principal alternative which is a random
+access file on the filesystem.
+
+Another benefit is that they can save large amounts of RAM on resource-limited
+targets: the font file may be incorporated into a firmware build such that it
+occupies flash memory rather than scarce RAM. Python code built into firmware
+is known as frozen bytecode.

 ## V0.3 notes

 8 Sept 2019

-Remove redundancy from index file. Emit extra index for sparse fonts, reducing
-code size. Add comment field in the output file showing creation command line.
-Repo includes the file `extended`. This facilitates creating fonts comprising
-the printable ASCII set plus `°μπωϕθαβγδλΩ`. Improvements to `font_test.py`.
+ 1. Reduced output file size for sparse fonts. These result from large gaps
+ between ordinal values of Unicode characters not in the standard ASCII set.
+ 2. Output file has comment showing creation command line.
+ 3. Repo includes the file `extended`. Using `-k extended` creates fonts
+ comprising the printable ASCII set plus `°μπωϕθαβγδλΩ`. Such a font has 95
+ chars having ordinal values from 32-981.
+ 4. Improvements to `font_test.py`.
+
+Python files produced are interchangeable with those from prior versions: the
+API is unchanged.

 ###### [Main README](./README.md)

@@ -44,9 +54,11 @@ Further arguments ensure that the byte contents and layout are correct for the
 target display hardware. Their usage should be specified in the documentation
 for the device driver.

-Example usage to produce a file `myfont.py` with height of 23 pixels:  
-`font_to_py.py FreeSans.ttf 23 myfont.py`
-
+Examples of usage to produce font files with a height of 23 pixels:
+```shell
+$ font_to_py.py FreeSans.ttf 23 myfont.py
+$ font_to_py.py -k extended FreeSans.ttf 23 my_extended_font.py
+```
 ## Arguments

 ### Mandatory positional arguments:
@@ -72,9 +84,10 @@ Example usage to produce a file `myfont.py` with height of 23 pixels:
 * -k or --charset_file Obtain the character set from a file. Typical use is
 for alternative character sets such as Cyrillic: the file must contain the
 character set to be included. An example file is `cyrillic`. Another is 
- `extended` which adds unicode characters "° μ π ω ϕ θ α β γ δ λ Ω" to those
- with `ord` values from 32-126. Such files will only produce useful results if
- the source font file includes those glyphs.
+ `extended` which adds unicode characters `°μπωϕθαβγδλΩ` to those in the
+ original ASCII set of printable characters. At risk of stating the obvious
+ this will only produce useful results if the source font file includes all
+ specified glyphs.

 The -c option may be used to reduce the size of the font file by limiting the
 character set. If the font file is frozen as bytecode this will not reduce RAM
@@ -194,13 +207,15 @@ print(len(freesans20._font) + len(freesans20._index))

 The memory used was 1712, 2032, 2384 and 2416 bytes. As increments over the
 prior state this corresponds to 320, 352 and 32 bytes. The `print` statement
-shows the RAM which would be consumed by the data arrays: this was 3956 bytes
-for `freesans20`.
+shows the RAM which would be consumed by the data arrays if they were not
+frozen: this was 3956 bytes for `freesans20`.

 The `foo()` function emulates the behaviour of a device driver in rendering a
 character to a display. The local variables constitute memory which is
 reclaimed on exit from the function. Its additional RAM use was 16 bytes.

+Similar figures were found in recent (2019) testing on a Pyboard D.
+
 ## Conclusion

 With a font of height 20 pixels RAM saving was an order of magnitude. The
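
Reviewer note on the sparse index described in the V0.3 notes: the sketch below is illustrative only and is not the code this commit emits. It assumes the `extended` character set is exactly printable ASCII (32-126) plus the twelve characters listed in the notes, written as escapes so the code points are explicit. It uses a made-up fixed glyph size (`glyph_size`) in place of real glyph data. It packs one 4-byte entry per populated character (2-byte little-endian ordinal, 2-byte little-endian offset), as the source comments describe, and looks entries up with an ordinary iterative binary search rather than the recursive `bs()` in the emitted code. The names `build_sparse` and `lookup` are hypothetical.

```python
# Illustrative sketch: NOT the code generated by font_to_py.py.
# Assumption: charset is printable ASCII plus the twelve extra characters
# named in the V0.3 notes (code points given explicitly as escapes).
EXTRA = "\u00b0\u03bc\u03c0\u03c9\u03d5\u03b8\u03b1\u03b2\u03b3\u03b4\u03bb\u03a9"
charset = sorted(set(chr(c) for c in range(32, 127)) | set(EXTRA))


def build_sparse(chars, glyph_size=10):
    """Pack (ordinal, offset) pairs as 2-byte little-endian values.

    chars must be sorted by ordinal. glyph_size is a hypothetical constant
    standing in for the variable-length glyph data of a real font.
    """
    sparse = bytearray()
    offset = 0
    for ch in chars:
        sparse += ord(ch).to_bytes(2, "little")
        sparse += offset.to_bytes(2, "little")
        offset += glyph_size
    return bytes(sparse)


def lookup(sparse, ordinal):
    """Binary search over 4-byte entries; return the offset or None."""
    lo, hi = 0, len(sparse) // 4
    while lo < hi:
        mid = (lo + hi) // 2
        key = int.from_bytes(sparse[4 * mid : 4 * mid + 2], "little")
        if key == ordinal:
            return int.from_bytes(sparse[4 * mid + 2 : 4 * mid + 4], "little")
        if key < ordinal:
            lo = mid + 1
        else:
            hi = mid
    return None


sparse = build_sparse(charset)
span = ord(charset[-1]) - ord(charset[0]) + 1
dense_bytes = 2 * span      # one 2-byte slot per ordinal in the range
sparse_bytes = len(sparse)  # one 4-byte entry per populated character
print(len(charset), span, dense_bytes, sparse_bytes)
```

The comparison of `dense_bytes` with `sparse_bytes` shows why a contiguous index becomes wasteful once the ordinal range is much larger than the number of populated characters. The exact figures depend on the assumed charset and ignore the default character and the end-of-data entry that the real index carries.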
--- a/font_to_py.py
+++ b/font_to_py.py
@@ -36,8 +36,6 @@ import freetype

 MINCHAR = 32  # Ordinal values of default printable ASCII set
 MAXCHAR = 126  # 95 chars
-# By default there will be 94 ASCII characters + the default char in element[0]
-ASSUME_SPARSE = MAXCHAR - MINCHAR + 1

 # UTILITIES FOR WRITING PYTHON SOURCECODE TO A FILE

@@ -356,23 +354,27 @@ class Font(dict):
            data += (width).to_bytes(2, byteorder='little')
            data += bytearray(self.stream_char(char, hmap, reverse))

-        for n, char in enumerate(self.charset):
-            # n = 1 + ord(char) - ord(smallest char in set)
-            # Build normal index for default char + 1st 94 chars. Efficient for
-            # ASCII set.
-            if n <= ASSUME_SPARSE:
+        # self.charset is contiguous with chars having ordinal values in the
+        # inclusive range specified. Where the specified character set has gaps
+        # missing characters are empty strings.
+        # Charset includes default char and both max and min chars, hence +2.
+        if len(self.charset) <= MAXCHAR - MINCHAR + 2:
+            # Build normal index. Efficient for ASCII set and smaller as
+            # entries are 2 bytes.
+            for char in self.charset:
                if char == '':
                    index += bytearray((0, 0))
                else:
                    index += (len(data)).to_bytes(2, byteorder='little')  # Start
                    append_data(data, char)
-            elif char != '':
-                # Build sparse index. Entries are 4 bytes but only populated if
-                # the char is in the charset.
+            index += (len(data)).to_bytes(2, byteorder='little')  # End
+        else:
+            # Sparse index. Entries are 4 bytes but only populated if the char
+            # is in the charset.
+            for char in [c for c in self.charset if c]:
                sparse += ord(char).to_bytes(2, byteorder='little')
                sparse += (len(data)).to_bytes(2, byteorder='little')  # Start
                append_data(data, char)
-        index += (len(data)).to_bytes(2, byteorder='little')  # End
        return data, index, sparse

    def build_binary_array(self, hmap, reverse, sig):
@@ -384,8 +386,8 @@ class Font(dict):
        return data

 # PYTHON FILE WRITING
-# Owing to sparse charsets and an index which only holds the atart of data,
-# can't read next_offset but must calculate it
+# The index only holds the start of data so can't read next_offset but must
+# calculate it.

 STR01 = """# Code generated by font-to-py.py.
 # Font: {}{}
@@ -394,49 +396,43 @@ version = '0.3'

 """

-# Code emitted for charsets comprising <= 95 chars (including default)
+# Code emitted for charsets spanning a small range of ordinal values
 STR02 = """_mvfont = memoryview(_font)

 def get_ch(ch):
-    ordch = ord(ch)
-    if ordch >= {0} and ordch <= {1}:
-        idx_offs = 2 * (ordch - {0} + 1)
-    else:
-        idx_offs = 0
+    oc = ord(ch)
+    idx_offs = 2 * (oc - {0} + 1) if oc >= {0} and oc <= {1} else 0
    offset = int.from_bytes(_index[idx_offs : idx_offs + 2], 'little')
    width = int.from_bytes(_font[offset:offset + 2], 'little')
 """

 # Code emitted for large charsets, assumed by build_arrays() to be sparse
+# Binary search of sparse index.
 STRSP = """_mvfont = memoryview(_font)
 _mvsp = memoryview(_sparse)

-def bins(lst, val):
+def bs(lst, val):
    n = len(lst) // 4
    if n == 1:
        v = int.from_bytes(lst[: 2], 'little')
        return int.from_bytes(lst[2 : 4], 'little') if v == val else 0
    sp = (n // 2) * 4
-    res = bins(lst[: sp], val)
-    return res if res else bins(lst[sp :], val)
+    res = bs(lst[: sp], val)
+    return res if res else bs(lst[sp :], val)

 def get_ch(ch):
-    ordch = ord(ch)
-    if ordch < {1}:
-        idx_offs = 2 * (ordch - {0} + 1) if ordch >= {0} else 0
-        offset = int.from_bytes(_index[idx_offs : idx_offs + 2], 'little')
-    else:
-        offset = bins(_mvsp, ordch)
+    offset = bs(_mvsp, ord(ch))
    width = int.from_bytes(_font[offset : offset + 2], 'little')
 """

-
+# Code emitted for horizontally mapped fonts.
 STR02H ="""
    next_offs = offset + 2 + ((width - 1)//8 + 1) * {0}
    return _mvfont[offset + 2:next_offs], {0}, width
 
 """

+# Code emitted for vertically mapped fonts.
 STR02V ="""
    next_offs = offset + 2 + (({0} - 1)//8 + 1) * width
    return _mvfont[offset + 2:next_offs], {0}, width
@@ -460,6 +456,7 @@ def write_font(op_path, font_path, height, monospaced, hmap, reverse, minchar, m
        return False
    return True

+# Extra code emitted where -i is specified.
 STR03 = '''
 def glyphs():
    for c in """{}""":
@@ -489,15 +486,15 @@ def write_data(stream, fnt, font_path, hmap, reverse, iterate):
    bw_font = ByteWriter(stream, '_font')
    bw_font.odata(data)
    bw_font.eot()
-    bw_index = ByteWriter(stream, '_index')
-    bw_index.odata(index)
-    bw_index.eot()
    if sparse:  # build_arrays() has returned a sparse index
        bw_sparse = ByteWriter(stream, '_sparse')
        bw_sparse.odata(sparse)
        bw_sparse.eot()
-        stream.write(STRSP.format(minchar, minchar + ASSUME_SPARSE, len(sparse)))
+        stream.write(STRSP)
    else:
+        bw_index = ByteWriter(stream, '_index')
+        bw_index.odata(index)
+        bw_index.eot()
        stream.write(STR02.format(minchar, maxchar))
    if hmap:
        stream.write(STR02H.format(height))