18

I am using Python 2.7 and I am trying to print Arabic strings like this:

print "ذهب الطالب الى المدرسة"

It gives the following output:

ط°ظ‡ط¨ ط§ظ„ط·ط§ظ„ط¨ ط§ظ„ظ‰ ط§ظ„ظ…ط¯ط±ط³ط©

The goal is to print the text correctly, not just to print each line. How can I print a string, or the content of a text file, in its original form? Like this:

ذهب الطالب الى المدرسة
Adriaan
Mohammed Sy
  • 3
    Better switch to Python 3.5 – ForceBru Dec 20 '16 at 13:15
  • 1
    @ForceBru "Python3.6 is the one which is worthy of being called Python3" --Raymond Hettinger. So switch to Python3.6 – Mohammad Yusuf Dec 20 '16 at 13:26
  • 1
    The issue might not be with Python but with the terminal emulator that you are using. If you type `echo ذهب` in the terminal and press enter, does it print the Arabic word as you expect? – Flimm Feb 23 '22 at 14:41

8 Answers

20

With these modules you can correct both the shape and the direction of your text. Just install the packages with pip and use them:

# install: pip install --upgrade arabic-reshaper
import arabic_reshaper

# install: pip install python-bidi
from bidi.algorithm import get_display

text = "ذهب الطالب الى المدرسة"
reshaped_text = arabic_reshaper.reshape(text)    # correct its shape
bidi_text = get_display(reshaped_text)           # correct its direction
print(bidi_text)
Jalal Razavi
  • This works for a single line only. With a multi-line paragraph, the bidi approach goes on to reverse the order of the RTL Arabic lines. For an approach that uses bidi and arabic-reshaper on multi-line text, see https://stackoverflow.com/questions/67661330/how-to-fix-the-reversed-lines-when-using-arabic-reshaper-and-python-bidi-in-mul – Muneeb Ahmad Khurram Feb 10 '22 at 09:14
7

The following code works:

import arabic_reshaper

text_to_be_reshaped =  'اللغة العربية رائعة'

reshaped_text = arabic_reshaper.reshape(text_to_be_reshaped)

rev_text = reshaped_text[::-1]  # slice backwards 

print(rev_text)
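
Note that slicing the whole string backwards also reverses the order of the lines when the text spans more than one line. A minimal sketch of reversing each line on its own, written for Python 3; `reverse_each_line` is a hypothetical helper name, not part of arabic_reshaper. It is still a naive approach: digits and Latin words embedded in the text get reversed too, which is what a real bidi algorithm is meant to avoid.

```python
def reverse_each_line(text):
    """Reverse the characters of every line while keeping the line order."""
    return "\n".join(line[::-1] for line in text.split("\n"))


sample = "abc\ndef"  # ASCII stand-in so the effect is easy to see

whole = sample[::-1]                  # reverses characters AND line order
per_line = reverse_each_line(sample)  # reverses characters only

print(repr(whole))
print(repr(per_line))
```

The same helper can be applied to the reshaped text in place of the plain `[::-1]` slice.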
Adriaan
  • Please read [answer] and [edit] your answer to contain an explanation as to why this code would actually solve the problem at hand. Always remember that you're not only solving the problem, but are also educating the OP and any future readers of this post. – Adriaan Sep 27 '22 at 13:24
5

Try this:

print u"ذهب الطالب الى المدرسة"

Output:

ذهب الطالب الى المدرسة

Demo: https://repl.it/EuHM/0

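In Python 2.7, a plain string literal is a byte string: it holds whatever bytes the source file contains, and print sends them to the terminal unchanged. If the terminal expects a different encoding than the one the file uses, the text comes out garbled. Prefixing the literal with u makes it a unicode string, which print then encodes for the terminal.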
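
The garbled output in the question looks like UTF-8 bytes being shown through the Windows Arabic code page (cp1256). That the asker's console uses cp1256 is an assumption on my part. A minimal sketch, in Python 3, that reproduces the effect for the first word and then undoes it:

```python
text = "ذهب"

# Encode the text as UTF-8, as it would be stored in the source file.
utf8_bytes = text.encode("utf-8")

# Misread those bytes as cp1256 (assumed console code page).
garbled = utf8_bytes.decode("cp1256")

# Reversing the two steps recovers the original text.
restored = garbled.encode("cp1256").decode("utf-8")

print(garbled == "ط°ظ‡ط¨")  # same characters as the start of the question's output
print(restored == text)
```

If this matches what you see, the bytes Python emits are fine and the mismatch is between the file encoding and the console encoding.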

Mohammad Yusuf
  • I got this: UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128) – Mohammed Sy Dec 20 '16 at 13:16
  • @MohammedSy What is the source of that string? I'm using Python2.7 as well but I'm not getting that error. – Mohammad Yusuf Dec 20 '16 at 13:22
  • I typed the string on my keyboard. I don't think this can be solved in code; maybe the problem comes from the Windows encoding. – Mohammed Sy Dec 20 '16 at 13:26
  • Try something like this: `print "ذهب الطالب الى المدرسة".encode('utf-8','ignore')` – Mohammad Yusuf Dec 20 '16 at 13:44
  • @MohammedSy maybe you need `# coding=utf-8` as first line, I had that problem trying to recreate this answer on repl.it: https://repl.it/@dralletje/Arabic-String#main.py – Michiel Dral Sep 13 '20 at 18:19
2
import sys
text = "اطبع هذا النص".encode("utf-8")

or

text = "اطبع هذا النص".encode()

then

sys.stdout.buffer.write(text)

Output:

اطبع هذا النص
0

In Python 2.7, you can declare the source encoding at the very top of your file:

# -*- coding: utf-8 -*-
print "ذهب الطالب الى المدرسة"

Updated:

If you can run this:

# -*- coding: utf-8 -*-
import io  # the built-in open() in Python 2.7 has no encoding argument

s = u"ذهب الطالب الى المدرسة"
with io.open("file.txt", "w", encoding="utf-8") as myfile:
    myfile.write(s)

If the generated file.txt contains the correct string, then the problem lies with whatever you are using to display the text, not with Python itself. You could try displaying it in something else, maybe even PyQt.
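
To check the file without relying on any display at all, you can read it back and compare. A small sketch in Python 3; the temporary directory and the variable names are only for illustration:

```python
import io
import os
import tempfile

s = "ذهب الطالب الى المدرسة"

# Write into a throwaway directory so the sketch leaves the working dir alone.
path = os.path.join(tempfile.mkdtemp(), "file.txt")

with io.open(path, "w", encoding="utf-8") as f:
    f.write(s)

# Read the text back with the same encoding.
with io.open(path, "r", encoding="utf-8") as f:
    read_back = f.read()

# Read the raw bytes to confirm what is actually stored on disk.
with io.open(path, "rb") as f:
    raw = f.read()

print(read_back == s)
print(raw == s.encode("utf-8"))
```

If both comparisons hold, Python wrote the text correctly and the garbling happens only at display time.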

Dan-Dev
  • The same problem: ط°ظ‡ط¨ ط§ظ„ط·ط§ظ„ط¨ ط§ظ„ظ‰ ط§ظ„ظ…ط¯ط±ط³ط© – Mohammed Sy Dec 20 '16 at 13:30
  • By what is Python's print output being rendered? Python may be correctly emitting the unicode, but that doesn't mean that whatever translates it into pixels and symbols on a screen gets it right. – nigel222 Dec 20 '16 at 14:48
0

You need to add a few lines before your code:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')  
print "ذهب الطالب الى المدرسة"
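
For readers on Python 3, where `sys.setdefaultencoding` no longer exists, a rough counterpart is to give the output stream an explicit encoding. The sketch below wraps an in-memory `io.BytesIO` (a stand-in for `sys.stdout.buffer`, used here only so the result can be inspected) in an `io.TextIOWrapper`. On a real console with Python 3.7 or later, `sys.stdout.reconfigure(encoding="utf-8")` should serve the same purpose.

```python
import io

text = "ذهب الطالب الى المدرسة"

# Stand-in for sys.stdout.buffer so the written bytes can be checked.
fake_stdout_buffer = io.BytesIO()

# Text layer that encodes everything written to it as UTF-8.
writer = io.TextIOWrapper(fake_stdout_buffer, encoding="utf-8")
writer.write(text)
writer.flush()

written = fake_stdout_buffer.getvalue()
print(written == text.encode("utf-8"))
```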
khelili miliana
0

You can either prefix your string with u like this

print u"ذهب الطالب الى المدرسة"

or make your code compatible with Python 3 by putting this at the top of your file:

from __future__ import unicode_literals

Python 2.7 strings (known as bytestrings in Python 3) do not handle Unicode characters. Both the u prefix and the import statement turn your string literal into a unicode string.
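
To make the text/bytes distinction concrete, here is a short sketch in Python 3, where the two types are always separate. A Python 2.7 plain string behaves like the `bytes` value here, and a u"" string behaves like the `str` value.

```python
text = "ذهب"                 # text: a sequence of characters
data = text.encode("utf-8")  # bytes: how that text is stored or transmitted

print(type(text).__name__, len(text))  # str 3
print(type(data).__name__, len(data))  # bytes 6

# Decoding with the matching encoding gives the text back.
print(data.decode("utf-8") == text)
```

Each of the three Arabic letters takes two bytes in UTF-8, which is why the lengths differ.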

yorodm
0

You have two problems. First, you are using a non-Arabic font or non-Unicode text. Second, you need a function like this one to turn standalone Arabic letter forms into ordinary Arabic letters that join correctly:

def mixARABIC(string2):
    import unicodedata
    string2 = string2.decode('utf8')
    new_string = ''
    for letter in string2:
        if ord(letter) < 256: unicode_letter = '\\u00'+hex(ord(letter)).replace('0x','')
        elif ord(letter) < 4096: unicode_letter = '\\u0'+hex(ord(letter)).replace('0x','')
        else: unicode_letter = '\\u'+unicodedata.decomposition(letter).split(' ')[1]
        new_string += unicode_letter
    new_string = new_string.replace('\u06CC','\u0649')
    new_string = new_string.decode('unicode_escape')
    new_string = new_string.encode('utf-8')
    return new_string
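
As far as I can tell, the function above maps presentation-form characters back to their base letters via `unicodedata.decomposition` and then swaps Farsi yeh (U+06CC) for alef maksura (U+0649). A shorter sketch of the same idea for Python 3, using NFKC normalization; `to_base_letters` is a hypothetical name, and NFKC also folds other compatibility characters, so this is not an exact equivalent.

```python
import unicodedata


def to_base_letters(text):
    """Fold Arabic presentation forms to base letters, then map U+06CC to U+0649."""
    folded = unicodedata.normalize("NFKC", text)
    return folded.replace("\u06cc", "\u0649")


# BEH initial form, ALEF final form, BEH isolated form
shaped = "\ufe91\ufe8e\ufe8f"
plain = to_base_letters(shaped)

# Print the character names (ASCII only) to show what each code point became.
print([unicodedata.name(ch) for ch in plain])
```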