URLDecoding Requests
Solution 1:
UnicodeEncodeError: 'ascii' codec can't encode characters
You are trying to decode a string that is already Unicode. On Python 3 this raises AttributeError (a Unicode string has no .decode() method there). Python 2 first tries to encode the string into bytes using sys.getdefaultencoding() ('ascii') before passing it to .decode('utf8'), and that implicit encoding step is what raises UnicodeEncodeError.
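Both failure modes can be reproduced on Python 3 as a rough sketch. The implicit ASCII encode that Python 2 performs is emulated here with an explicit .encode('ascii'), and the sample URL is a made-up value standing in for res.url:

```python
# Hypothetical sample text standing in for res.url (not from the original post).
text = u'http://example.com/?q=\u00e4'

# Python 3: str has no .decode() method, so the call fails with AttributeError.
try:
    text.decode('utf8')
    py3_error = None
except AttributeError as exc:
    py3_error = type(exc).__name__

# Emulate Python 2's hidden step: encode with the 'ascii' codec before decoding.
try:
    text.encode('ascii').decode('utf8')
    py2_style_error = None
except UnicodeEncodeError as exc:
    py2_style_error = type(exc).__name__

print(py3_error)
print(py2_style_error)
```

The second error is the one in the traceback above: the non-ASCII character cannot be encoded with the 'ascii' codec, so .decode('utf8') is never reached.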
In short, do not call .decode() on Unicode strings; use this instead:
print urllib.unquote(res.url.encode('ascii')).decode('utf-8')
Without the .decode() call, the code prints bytes (assuming a bytestring is passed to unquote()), which may lead to mojibake if the character encoding used by your environment is not UTF-8. To avoid mojibake, always print Unicode rather than bytes, and do not hardcode your environment's character encoding inside your script. That is why .decode() is necessary here.
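As a small illustration of the mojibake risk, the sketch below decodes the same UTF-8 bytes two ways. Latin-1 is only an assumed example of a mismatched environment encoding, and the code is written for Python 3:

```python
# UTF-8 bytes for the character U+00E4, i.e. what '%C3%A4' unquotes to.
raw = b'\xc3\xa4'

# Decoded with the encoding the bytes really use: one character.
correct = raw.decode('utf-8')

# Decoded as if the environment were Latin-1 (assumed mismatch): two characters.
garbled = raw.decode('latin-1')

print(repr(correct))
print(repr(garbled))
```

Decoding explicitly as UTF-8 and printing the resulting Unicode text leaves the final byte encoding to the environment instead of baking it into the script.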
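There is a bug in urllib.unquote() on Python 2 if you pass it a Unicode string: each percent-escaped byte becomes its own code point instead of being decoded as UTF-8.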
>>> print urllib.unquote(u'%C3%A4')
Ã¤
>>> print urllib.unquote('%C3%A4') # utf-8 output
ä
Pass bytestrings to unquote() on Python 2.
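For comparison, here is a minimal sketch of the same operation on Python 3, where the function lives in urllib.parse and decodes percent-escapes as UTF-8 by default. The encoding='latin-1' call is included only to imitate the garbled Python 2 result shown above:

```python
from urllib.parse import unquote, unquote_to_bytes

# Default behaviour: percent-escaped bytes are decoded as UTF-8.
decoded = unquote('%C3%A4')

# Forcing Latin-1 imitates the Python 2 result for Unicode input.
as_latin1 = unquote('%C3%A4', encoding='latin-1')

# Raw bytes, comparable to passing a bytestring to unquote() on Python 2.
raw = unquote_to_bytes('%C3%A4')

print(repr(decoded))
print(repr(as_latin1))
print(repr(raw))
```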