Python-3 And \x Vs \u Vs \U In String Encoding And Why
Solution 1:
Python gives you a representation of the string, and for non-printable characters it uses the shortest available escape sequence. \x80 is the same character as \u0080 or \U00000080, but \x80 is just shorter. For chr(57344) the shortest notation is \ue000; you can't express the same character with \xhh, because that notation can only be used for characters up to \xFF.
For some characters there are even single-letter escapes, like \n for a newline, or \t for a tab.
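A quick way to see this is to ask repr() for a few characters. The snippet below is only a sketch, with variable names chosen for illustration; it shows that the three spellings produce the same string and that repr() moves to a longer escape only when the code point does not fit in a shorter one.

```python
# The three spellings denote the same one-character string.
same = "\x80" == "\u0080" == "\U00000080"

# repr() chooses the shortest escape that can hold the code point.
short = repr("\x80")          # fits in two hex digits
medium = repr(chr(57344))     # U+E000 needs four hex digits
long_ = repr(chr(0x10FFFF))   # beyond U+FFFF needs eight hex digits
newline = repr("\n")          # single-letter escape

print(same)     # True
print(short)    # '\x80'
print(medium)   # '\ue000'
print(long_)    # '\U0010ffff'
print(newline)  # '\n'
```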
Python has multiple notation options for historical and practical reasons. In a byte string you can only create bytes in the range 0-255, so there \xhh is helpful and more concise than having to use \U000hhhhh everywhere when you can't even use the full range available to that notation. In addition, \xhh, \n and related codes are familiar to programmers from other languages.
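The byte-string case can be checked the same way. This is a minimal sketch with illustrative variable names: every value in a bytes object must be in the range 0-255, and its representation needs nothing longer than \xhh or a single-letter escape.

```python
# Build a bytes object from values at the edges of the allowed range.
data = bytes([0, 10, 128, 255])
shown = repr(data)

# A value above 255 is rejected outright.
try:
    bytes([256])
    out_of_range = False
except ValueError:
    out_of_range = True

print(shown)         # b'\x00\n\x80\xff'
print(out_of_range)  # True
```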