py/objstr: Add check for valid UTF-8 when making a str from bytes.
This patch adds a function utf8_check() to check for a valid UTF-8 encoded string, and calls it when constructing a str from raw bytes. The feature is selectable at compile time via MICROPY_PY_BUILTINS_STR_UNICODE_CHECK and is enabled if unicode is enabled. It costs about 110 bytes on Thumb-2, 150 bytes on Xtensa and 170 bytes on x86-64.
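For illustration only, the kind of check described above can be sketched in Python. This is not the C code from the patch: the helper name utf8_is_well_formed is hypothetical, and the sketch only verifies byte structure (lead bytes and the number of continuation bytes). It makes no attempt to reject overlong encodings or surrogates.

```python
def utf8_is_well_formed(data):
    """Return True if data has valid UTF-8 byte structure.

    Hypothetical helper for illustration; not the patch's utf8_check().
    """
    need = 0  # number of continuation bytes still expected
    for c in bytes(data):
        if need:
            # Inside a multi-byte sequence: only 0x80..0xBF is allowed.
            if 0x80 <= c <= 0xBF:
                need -= 1
            else:
                return False
        elif c < 0x80:
            continue  # plain ASCII byte
        elif c < 0xC0:
            return False  # continuation byte without a lead byte
        elif c < 0xE0:
            need = 1  # lead byte of a 2-byte sequence
        elif c < 0xF0:
            need = 2  # lead byte of a 3-byte sequence
        elif c < 0xF8:
            need = 3  # lead byte of a 4-byte sequence
        else:
            return False  # 0xF8 and above never start a sequence
    # A sequence cut off at the end of the input is also invalid.
    return need == 0
```

Under this sketch, the three inputs used in the test below (b'ab\xa1', b'ab\xf8' and bytearray(b'ab\xc0a')) are all reported as malformed, while '\u0200'.encode('utf8') passes.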
@@ -33,3 +33,17 @@ try:
     int('\u0200')
 except ValueError:
     print('ValueError')
+
+# test invalid UTF-8 string
+try:
+    str(b'ab\xa1', 'utf8')
+except UnicodeError:
+    print('UnicodeError')
+try:
+    str(b'ab\xf8', 'utf8')
+except UnicodeError:
+    print('UnicodeError')
+try:
+    str(bytearray(b'ab\xc0a'), 'utf8')
+except UnicodeError:
+    print('UnicodeError')