Damien George
7c85c7c210
py/unicode: Fix check for valid utf8 being stricter about contn chars.
2018-11-26 16:13:08 +11:00
Damien George
19aee9438a
py/unicode: Clean up utf8 funcs and provide non-utf8 inline versions.
...
This patch provides inline versions of the utf8 helper functions for the
case when unicode is disabled (MICROPY_PY_BUILTINS_STR_UNICODE set to 0).
This saves code size.
The unichar_charlen function is also renamed to utf8_charlen to match the
other utf8 helper functions, and the signature of this function is adjusted
for consistency (const char* -> const byte*, mp_uint_t -> size_t).
2018-02-14 18:19:22 +11:00
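
As a rough illustration of commit 19aee9438a above, the following sketch shows how a character-count helper can collapse to a trivial inline function when unicode support is compiled out. The macro `SKETCH_STR_UNICODE` and the function `sketch_utf8_charlen` are hypothetical stand-ins for `MICROPY_PY_BUILTINS_STR_UNICODE` and `utf8_charlen`; this is not the actual MicroPython source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical config switch, standing in for MICROPY_PY_BUILTINS_STR_UNICODE. */
#ifndef SKETCH_STR_UNICODE
#define SKETCH_STR_UNICODE 1
#endif

#if SKETCH_STR_UNICODE
/* Unicode build: count characters by skipping UTF-8 continuation bytes (10xxxxxx). */
size_t sketch_utf8_charlen(const uint8_t *str, size_t len) {
    size_t charlen = 0;
    for (size_t i = 0; i < len; i++) {
        if ((str[i] & 0xC0) != 0x80) {
            charlen++;
        }
    }
    return charlen;
}
#else
/* 8-bit build: every byte is one character, so the helper can be a trivial inline. */
static inline size_t sketch_utf8_charlen(const uint8_t *str, size_t len) {
    (void)str;
    return len;
}
#endif
```

In the 8-bit case the compiler can fold the call away entirely, which is one plausible source of the code-size saving the commit message mentions.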
tll
68c28174d0
py/objstr: Add check for valid UTF-8 when making a str from bytes.
...
This patch adds a function utf8_check() to check for a valid UTF-8 encoded
string, and calls it when constructing a str from raw bytes. The feature
is selectable at compile time via MICROPY_PY_BUILTINS_STR_UNICODE_CHECK and
is enabled if unicode is enabled. It costs about 110 bytes on Thumb-2, 150
bytes on Xtensa and 170 bytes on x86-64.
2017-09-06 16:43:09 +10:00
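
For readers unfamiliar with what the validity check in commit 68c28174d0 above involves, here is a minimal sketch of a structural UTF-8 check. The name `sketch_utf8_check` is hypothetical and the code is an assumption about the general technique, not the actual `utf8_check()` implementation. It only verifies that each lead byte is followed by the right number of continuation bytes; it does not reject overlong encodings or surrogate code points.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: returns true if the buffer is structurally valid UTF-8. */
bool sketch_utf8_check(const uint8_t *p, size_t len) {
    size_t need = 0; /* continuation bytes still expected */
    for (size_t i = 0; i < len; i++) {
        uint8_t c = p[i];
        if (need > 0) {
            /* Inside a multi-byte sequence: byte must look like 10xxxxxx. */
            if ((c & 0xC0) != 0x80) {
                return false;
            }
            need--;
        } else if (c < 0x80) {
            /* Plain ASCII, nothing more to expect. */
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; /* 110xxxxx: two-byte sequence */
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; /* 1110xxxx: three-byte sequence */
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; /* 11110xxx: four-byte sequence */
        } else {
            /* Stray continuation byte or invalid lead byte. */
            return false;
        }
    }
    /* A truncated trailing sequence is also invalid. */
    return need == 0;
}
```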
Alexander Steffen
55f33240f3
all: Use the name MicroPython consistently in comments
...
There were several different spellings of MicroPython present in comments,
when there should be only one.
2017-07-31 18:35:40 +10:00
Damien George
afc5063539
py/unicode: Comment-out unused function unichar_isprint.
2016-12-28 17:50:10 +11:00
Alex March
69d9e7d27d
py/repl: Check for an identifier char after the keyword.
...
- As described in #1850.
- Add cmdline tests.
2016-02-17 08:56:15 +00:00
Dave Hylands
afaa66b657
py: Minor improvement to unichar_isxdigit
...
This reduces the size of unichar_isxdigit from 0x1e bytes plus 0x02 bytes
of filler to 0x14 bytes (a net code reduction of 12 bytes) and makes
unichar_isxdigit slightly faster.
2015-05-20 09:31:22 +01:00
Dave Hylands
3ad94d6072
extmod: Add ubinascii.unhexlify
...
This also pulls out hex_digit from py/lexer.c and makes unichar_hex_digit
2015-05-20 09:29:22 +01:00
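
To illustrate why a shared hex-digit helper is useful for something like the unhexlify function added in commit 3ad94d6072 above, here is a hedged sketch. The names `sketch_hex_digit_value` and `sketch_unhexlify`, their signatures, and the error convention (returning -1) are all assumptions made for illustration; they are not the MicroPython API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: value of a hex digit character, or -1 if it is not one. */
int sketch_hex_digit_value(int c) {
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    }
    if (c >= 'A' && c <= 'F') {
        return c - 'A' + 10;
    }
    return -1;
}

/* Hypothetical sketch: decode hex_len hex characters into out (which must hold
 * hex_len / 2 bytes). Returns the number of bytes written, or -1 on bad input. */
long sketch_unhexlify(const char *hex, size_t hex_len, uint8_t *out) {
    if (hex_len % 2 != 0) {
        return -1; /* odd number of digits */
    }
    for (size_t i = 0; i < hex_len; i += 2) {
        int hi = sketch_hex_digit_value((unsigned char)hex[i]);
        int lo = sketch_hex_digit_value((unsigned char)hex[i + 1]);
        if (hi < 0 || lo < 0) {
            return -1; /* non-hex character */
        }
        out[i / 2] = (uint8_t)((hi << 4) | lo);
    }
    return (long)(hex_len / 2);
}
```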
Damien George
4dea922610
py: Adjust some spaces in code style/format, purely for consistency.
2015-04-09 15:29:54 +00:00
Damien George
51dfcb4bb7
py: Move to guarded includes, everywhere in py/ core.
...
Addresses issue #1022.
2015-01-01 20:32:09 +00:00
Damien George
5318cc028a
py: Tidy up a few function declarations.
2014-12-10 22:37:07 +00:00
Damien George
40f3c02682
Rename machine_(u)int_t to mp_(u)int_t.
...
See discussion in issue #50.
2014-07-03 13:25:24 +01:00
Paul Sokolovsky
9e215fa4c2
py: Make unichar_charlen() accept/return machine_uint_t.
2014-06-28 23:15:29 +03:00
Damien George
e04a44e2f6
py: Small comments, name changes, use of machine_int_t.
2014-06-28 10:27:23 +01:00
Paul Sokolovsky
1044c3dfe6
unicode: Make get_char()/next_char()/charlen() be 8-bit compatible.
...
Based on config define.
2014-06-27 00:04:19 +03:00
Paul Sokolovsky
46d31e9ca9
unicode: Add utf8_ptr_to_index().
...
Useful when we have a pointer to a char inside a string but need to return
the char index (e.g. for str.find()).
2014-06-27 00:04:19 +03:00
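
The pointer-to-index conversion described in commit 46d31e9ca9 above can be sketched as counting the bytes before the pointer that start a character. The function name `sketch_utf8_ptr_to_index` is hypothetical, and the sketch assumes `ptr` points into (or one past the end of) a valid UTF-8 string beginning at `s`; it is not the actual `utf8_ptr_to_index()` source.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: character index of ptr within the UTF-8 string at s.
 * Assumes s <= ptr and that ptr sits on a character boundary. */
size_t sketch_utf8_ptr_to_index(const uint8_t *s, const uint8_t *ptr) {
    size_t index = 0;
    while (s < ptr) {
        /* Every byte that is not a continuation byte (10xxxxxx) starts a character. */
        if ((*s & 0xC0) != 0x80) {
            index++;
        }
        s++;
    }
    return index;
}
```

A byte-level search can then report its hit as a character index by passing the match pointer through this kind of conversion.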
Chris Angelico
c88987c1af
py: Implement basic unicode functions.
2014-06-27 00:04:17 +03:00
Paul Sokolovsky
59c675a64c
py: Include mpconfig.h before all other includes.
...
It defines types used by all other headers.
Fixes #691.
2014-06-21 22:43:22 +03:00
Paul Sokolovsky
b0bb458810
unicode: String API is const byte*.
...
We still have the char vs byte dichotomy, but the majority of string
operations now use byte.
2014-06-14 06:22:11 +03:00
Damien George
c59af52e84
py: Rename some unichar functions for consistency.
2014-05-11 17:53:11 +01:00
Paul Sokolovsky
6913521911
objstr: Implement .lower() and .upper().
2014-05-10 19:49:07 +03:00
Damien George
04b9147e15
Add license header to (almost) all files.
...
Applied blanket-wide to all .c and .h files. Some files originating from ST
are difficult to deal with (license-wise), so the header was left out of those.
Also merged modpyb.h, modos.h, modstm.h and modtime.h in stmhal/.
2014-05-03 23:27:38 +01:00
Damien George
175cecfa87
py: Make form-feed character a space (following C isspace).
...
E.g., in the CPython stdlib, email/header.py contains a form-feed character.
2014-04-10 11:39:36 +01:00
Paul Sokolovsky
520e2f58a5
Replace global "static" -> "STATIC", to allow "analysis builds". Part 2.
2014-02-12 18:31:30 +02:00
Paul Sokolovsky
0b7184dcb8
Implement octal and hex escapes in strings.
2014-01-22 22:48:25 +02:00
Damien George
8cc96a35e5
Put unicode functions in unicode.c, and tidy their names.
2013-12-30 18:23:50 +00:00