py/objstr: Make mp_obj_new_str_of_type check for existing interned qstr.

The function mp_obj_new_str_of_type is a general str object constructor
used in many places in the code to create either a str or bytes object.
When creating a str it should first check if the string data already exists
as an interned qstr, and if so then return the qstr object.  This patch
makes the function have such behaviour, which helps to reduce heap usage by
reusing existing interned data where possible.

The old behaviour of mp_obj_new_str_of_type (which didn't check for
existing interned data) is made available through the function
mp_obj_new_str_copy, but should only be used in very special cases.

One consequence of this patch is that the following expression is now True:

    'abc' is ' abc '.split()[0]
This commit is contained in:
Damien George
2017-11-16 13:53:04 +11:00
parent 4601759bf5
commit 1f1d5194d7
3 changed files with 21 additions and 8 deletions

View File

@@ -417,7 +417,7 @@ STATIC void push_result_token(parser_t *parser, const rule_t *rule) {
pn = mp_parse_node_new_leaf(lex->tok_kind == MP_TOKEN_STRING ? MP_PARSE_NODE_STRING : MP_PARSE_NODE_BYTES, qst);
} else {
// not interned, make a node holding a pointer to the string/bytes object
mp_obj_t o = mp_obj_new_str_of_type(
mp_obj_t o = mp_obj_new_str_copy(
lex->tok_kind == MP_TOKEN_STRING ? &mp_type_str : &mp_type_bytes,
(const byte*)lex->vstr.buf, lex->vstr.len);
pn = make_node_const_object(parser, lex->tok_line, o);