Some characters apparently need long 8-byte encoding
Changed the encoding again; more chars now pass as encoded, and some
characters are encoded using the long 8-byte notation. The source of the
problem seems to be ujson, which converts high+low UTF-16 surrogate
sequences into single chars (json leaves them as pairs of chars, each of
which fits in the short 4-byte encoding).
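
For illustration, a minimal sketch of the difference, assuming the "short"
and "long" notations mean \uXXXX (4 hex digits) and \UXXXXXXXX (8 hex
digits). The helper names are hypothetical and are not part of this
change:

```python
# Hypothetical helpers; assumes "short" = \uXXXX and "long" = \UXXXXXXXX.

def escape_char(ch):
    """Escape one code point; anything above U+FFFF needs the long form."""
    cp = ord(ch)
    if cp > 0xFFFF:
        return "\\U%08x" % cp
    return "\\u%04x" % cp


def split_surrogates(ch):
    """Split a non-BMP char into its high and low UTF-16 surrogates."""
    cp = ord(ch) - 0x10000
    return chr(0xD800 + (cp >> 10)), chr(0xDC00 + (cp & 0x3FF))


single = "\U0001F600"                   # one char, as a merging decoder yields
high, low = split_surrogates(single)    # two chars, as a surrogate pair

long_form = escape_char(single)
short_form = escape_char(high) + escape_char(low)
```

The same character ends up as one long escape when it arrives as a single
char, and as two short escapes when it arrives as a surrogate pair.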