Python: converting a code page character number to Unicode

2697 views python
-1

By default, print(chr(195)) displays the Unicode character at position 195 ("Ã"). How do I print the character that 195 maps to in code page 1251, i.e. "Г"? I tried print(chr(195).decode('cp1252')) and various .encode methods.


2 Answers

5

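You can use urllib to percent-encode the cp1251 bytes of a string (Python 2 syntax):

import urllib

s = u"whateverhere"
print urllib.quote_plus(s.encode('cp1251'))

Also remember, if you are using international strings in Python 2, make sure to include the u prefix on the string literal so that it is a unicode object rather than a byte string.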
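For Python 3, the same idea might look like the sketch below. It assumes the goal is a percent-encoded form of the cp1251 bytes; the variable names are illustrative only.

```python
from urllib.parse import quote_plus

text = "Г"  # Cyrillic capital GHE, byte 0xC3 in cp1251

# quote_plus accepts bytes, so encode to cp1251 first and then percent-encode.
encoded = quote_plus(text.encode("cp1251"))  # '%C3'
```

Note that this yields a URL-safe representation of the byte, not the decoded character itself.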

7

You cannot store a 'raw' byte value 0xC3 in a string (and if you did, you should not have: raw, "unparsed" binary data belongs in a bytes object). The proper way to convert from raw bytes is indeed .decode('cp1251'):

>>> print(b'\xc3'.decode('cp1251'))
Г
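To go from a character number such as 195 straight to the code page character, one option is to build a one-byte bytes object and decode it. This is a minimal sketch; the helper name codepage_char is hypothetical, not a standard library function.

```python
def codepage_char(code, codepage="cp1251"):
    """Return the character that byte value `code` (0-255) maps to in `codepage`."""
    # bytes([code]) builds a single raw byte; decode() interprets it in the code page.
    return bytes([code]).decode(codepage)

cyrillic = codepage_char(195)           # 'Г' (U+0413)
latin = codepage_char(195, "latin1")    # 'Ã' (U+00C3), the same as chr(195)
```

Values outside 0-255 make bytes([code]) raise ValueError, which seems a reasonable failure mode for a single-byte code page.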

However, if you already have it in a string, the easiest fix is to first convert the string back to a bytes object using the one-to-one "encoding" Latin-1, which maps code points 0-255 to the same byte values:

str = 'Ãamma'
print (bytes(str.encode('latin1')).decode('cp1251'))
>>> ?amma
