I have some JSON files that contain escaped Japanese text such as \u672c\u30fb\u96d1\u8a8c\u30fb\u66f8\u7c4d\u60c5\u5831
and I want to decode it into Japanese.
When I write the string as a literal in the code, it works:
text = '\u672c\u30fb\u96d1\u8a8c\u30fb\u66f8\u7c4d\u60c5\u5831'
print(text)
This prints:
本・雑誌・書籍情報
But when I read the text directly from a file, it does not work. For example, I prepared a file index.json whose content is just:
\u672c\u30fb\u96d1\u8a8c\u30fb\u66f8\u7c4d\u60c5\u5831
and I read it like this:
file = open('index.json','r')
text = file.read()
print(text)
but it just prints the escaped text unchanged:
\u672c\u30fb\u96d1\u8a8c\u30fb\u66f8\u7c4d\u60c5\u5831
One thing I found kind of weird is that when I then tried to print:
print(file.read())
print(text)
the file.read() call returns nothing, even with file.read(1).
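A likely explanation for the empty result: the earlier file.read() already consumed the whole file, so the file position sits at the end and any further read returns an empty string. The sketch below reproduces this with a temporary stand-in for index.json (the temporary path and the variable names first, second and third are assumptions for illustration) and shows that seek(0) rewinds the file:

```python
import os
import tempfile

# Hypothetical stand-in for index.json so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "index.json")
with open(path, "w", encoding="utf-8") as f:
    f.write(r"\u672c\u30fb")  # raw string: literal backslashes, 12 characters

with open(path, "r", encoding="utf-8") as f:
    first = f.read()    # reads everything and leaves the position at end of file
    second = f.read()   # nothing left to read, so this is ''
    f.seek(0)           # rewind to the beginning
    third = f.read()    # the full content again

print(repr(second))
print(first == third)
```

So the empty read is unrelated to the escaping issue; it only means the file had already been read once.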
Edit: I found out that the main problem is that when you write text = '\u672c', Python treats \u672c as an escape sequence for a single character. But when the same text is read from a file, it is a string of 6 separate characters (a backslash, u, 6, 7, 2, c). Is there any way to convert it?
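To make the difference concrete, here is a minimal sketch of two possible conversions, assuming the text read from the file contains only ASCII characters plus \uXXXX escapes and no stray quotes or newlines (call .strip() first if the file ends with a newline). The names raw, decoded_json, decoded_codec and doc are illustrative only:

```python
import json

# A raw string mimics what file.read() returns: literal backslashes.
raw = r"\u672c\u30fb\u96d1\u8a8c\u30fb\u66f8\u7c4d\u60c5\u5831"

# In source code the escape is one character; read from a file it is six.
literal_len = len("\u672c")
raw_len = len(r"\u672c")

# Option 1: wrap the text in double quotes so it becomes a JSON string
# literal, then let the JSON parser resolve the \uXXXX escapes.
decoded_json = json.loads('"' + raw + '"')

# Option 2: the unicode_escape codec. This is only safe when the text is
# pure ASCII; real non-ASCII characters would be mangled.
decoded_codec = raw.encode("ascii").decode("unicode_escape")

# For a complete, valid JSON document, json.loads / json.load already
# decode the escapes, so no extra step is needed.
doc = json.loads('{"title": "\\u672c\\u30fb\\u96d1\\u8a8c"}')

print(decoded_json)
print(decoded_codec)
print(doc["title"])
```

If the real files are valid JSON, reading them with json.load(file) instead of file.read() is probably the simplest route, since the parser handles the escapes itself.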