set arabic_rendered_sentence [render_arabic "مرحبا بكم"]Then get in your Tk something very near to the following result.Please just understand I am using the text to explain and imagine it with the spaces are just only one millimeter wide.
مـ ر حـ بـا بـ كمDownload the script here
OR here
The video, Render Arabic Language Words in TCL and Tk (Video Take 2)
, This video on YouTube explains how it works from within the code.If you are planning to do inputs, then do the following steps:- Use a text widget for your input.
- Include to the start of your code the two the procedures mentioned below: render_arabic $arabic_string and text_binding_for_rtl $text_widget_path $k
- Of course, you would be using a number of text widgets. Therefore, make a list of your text widget windows paths like the following:
set list_of_widget_window_paths [list .mytext1 .anotherText2 .toplevel.toplevel_text]
foreach text_widget_window_path $list_of_widget_window_paths {
bind $text_widget_window_path <Key> [list text_binding_for_rtl %W %k]
}And you're done!Render Arabic Language Words in TCL and Tk (Video Take 3)
shows it in action.The Required Procedures edit
render_arabic $arabic_string
makes Arabic readable when displayed in a Tk widget.#! /usr/bin/wish
#Code written by Rani Fayez Ahmad (Superlinux)
#Website: http://www.superlinux.net
#The following procedure is used to extract all ASCII string parts from the Unicode string. in tk_messageBox after rendering to Arabic they come reversed
#So this fix has been added
# It will give the pairs : (ASCII string part | ASCII string starting index in the Unicode string) .
proc list_of_all_ascii_parts_a_unicode_string { arabic_string} {
set ascii_parts_list [list]
set length [string length $arabic_string]
for {set i 0} {$i< $length} {incr i} {
set start_of_ascii $i
set end_of_ascii $start_of_ascii
while {[string is ascii [
string range $arabic_string $start_of_ascii $end_of_ascii]] == 1
&& $i<$length} {
puts [
string range $arabic_string $start_of_ascii $end_of_ascii]
incr i
incr end_of_ascii
}
incr end_of_ascii -1
set ascii_part [
string range $arabic_string $start_of_ascii $end_of_ascii]
if {[string trim $ascii_part] ne {}} {
set ascii_parts_list [
linsert $ascii_parts_list end [list $ascii_part $start_of_ascii]]
}
}
return $ascii_parts_list
}
#a procedure to make Arabic readable when displayed in a Tk widget.
proc render_arabic args {
set arabic_string [lindex $args 0]
set is_messageBox [lindex $args 1]
#The given of the problem is an Arabic sentence
#Break the sentence into words
set words [split [string trim $arabic_string]]
#Display the sentence the way TCL receives it
#The problem is:
#Tcl receives the Arabic letters: (i) in the reverse order (ii)
#disconnected. We want to re-render the Arabic to be displayed correctly
#tk_messageBox -message $words
#$count is the word index in the arabic sentence
set count 0
#the following is just an example of how to get an arabic character index
#number in the unicode character charts
#set z {} ; foreach el [split ل {}] {puts [scan $el %c]}
#foreach word in the arabic sentence
foreach word $words {
if {[string is ascii $word]} {
incr count
continue
}
#else {
# set splits [split $word "!@#$%^&*()_+-=~`123456790/\\"]
# if {[llength $splits] > 1} {
# set split_counter 0
# foreach splitting $splits {
# set splitting [render_arabic $splitting]
# lset splits $split_counter $splitting
# incr split_counter
# }
# set word [join splits]
# incr count
# continue
# }
#}
#1-get the substring in the word without the last letter
#we will deal with the connection of the last letter later
set original_word $word
set sub_word [string range $word 0 end-1]
#All the letters from baa2 to yaa2 when they are NOT the last letter;
#TCL initially has and reads them in their isolated form as in ل م س;
#they must be converted into their initial form e.g ل م س
#so replace and convert every occurrence of each of such letters
#Also other Arabic-like characters like Urdu, Persian, Kurdish... etc,
#You may add them similarly over here
set sub_word [ string map {\u0628 \ufe91} $sub_word] ;#ba2
set sub_word [ string map {\u062A \ufe97} $sub_word] ;#Ta2
set sub_word [ string map {\u062B \ufe9b} $sub_word] ;#thaa2
set sub_word [ string map {\u062C \ufe9f} $sub_word] ;#Jeem
set sub_word [ string map {\u062d \ufea3} $sub_word] ;#7aa2
set sub_word [ string map {\u062e \ufeA7} $sub_word] ;#5aa2
set sub_word [ string map {\u0633 \ufeb3} $sub_word] ;#seen
set sub_word [ string map {\u0634 \ufeb7} $sub_word] ;#sheen
set sub_word [ string map {\u0635 \ufebb} $sub_word] ;#SSaad
set sub_word [ string map {\u0636 \ufebf} $sub_word] ;#DDhahd
set sub_word [ string map {\u0637 \ufec3} $sub_word] ;#TTaa2
set sub_word [ string map {\u0638 \ufec7} $sub_word] ;#tthaa2 Zah
set sub_word [ string map {\u0639 \ufeCb} $sub_word] ;#3eyn
set sub_word [ string map {\u063A \ufeCF} $sub_word] ;#ghyn
set sub_word [ string map {\u0641 \ufeD3} $sub_word] ;#faa2
set sub_word [ string map {\u0642 \ufeD7} $sub_word] ;#quaaf
set sub_word [ string map {\u0643 \ufeDb} $sub_word] ;#kaaf
set sub_word [ string map {\u0644 \ufedf} $sub_word] ;#lam
set sub_word [ string map {\u0645 \ufee3} $sub_word] ;#meem
set sub_word [ string map {\u0646 \ufee7} $sub_word] ;#noon
set sub_word [ string map {\u0647 \ufeeb} $sub_word] ;#haa2
set sub_word [ string map {\u064A \ufef3} $sub_word] ;#yaa2
set sub_word [ string map {\u0626 \ufe8b} $sub_word] ;#hamza 3ala nabera (initial form of yaa2)
#now replace the whole part of the word that excludes the last letter
#with the conversion done above
set word [string replace $word 0 end-1 $sub_word]
#The following list of characters are the characters initial form
#mentioned above + the tatweel chacracter
set initials [list \u0640 \ufe90 \ufe97 \ufe9b \ufe9f \ufea3 \ufeA7 \
\ufb3 \ufeb7 \ufebb \ufebf \ufec3 \ufec7 \ufeCb \ufeCF \ufeD3 \
\ufeD7 \ufeDb \ufedf \ufee3 \ufee7 \ufeeb \ufef3]
#find the character before the last.
set before_last_char [string index $word end-1]
#for debugging purposes just print the character before the last.
## puts $before_last_char
#and try to see if the character before the last is a word in the list
#$initials defined in the previous line.
#and if its true, then convert the last character to it's final linked
#form
#this way they will be joined
if {[lsearch -ascii -inline $initials $before_last_char]
eq $before_last_char} {
#now get also last chacracter
set last_character [string index $word end]
#print it for debugging purposes
##puts $last_character
#just to make sure that we we are matching correctly print the unicode
#index number of the character
##puts [scan $last_character %c]
if {[string is ascii $last_character]} {
set before_last_char [render_arabic $before_last_char]
}
#\u0627 {
# #aleph
# set word [ string replace $word end end \ufe8e ]
#}
#now convert the last character into its final linked form
switch -- $last_character {
\u0628 {
#baa2
set word [string replace $word end end \ufe90]
}
\u0629 {
#taa2 marbootta
set word [string replace $word end end \ufe94]
}
\u062A {
#ta2 maftoo7a
set word [string replace $word end end \ufe96]
}
\u062B {
#thaa2
set word [string replace $word end end \ufe9A]
}
\u062c {
#jeem
set word [string replace $word end end \ufe9e]
puts $word
}
\u062d {
#7aa2
set word [string replace $word end end \ufeA2]
}
\u062e {
#5aa2
set word [string replace $word end end \ufea6]
}
\u062f {
#dal
set word [string replace $word end end \ufeaa]
}
\u0630 {
#tthal
set word [string replace $word end end \ufeac]
}
\u0631 {
#raa2
set word [string replace $word end end \ufeae]
}
\u0632 {
#zyn
set word [string replace $word end end \ufeaf]
}
\u0633 {
#seen
set word [string replace $word end end \ufeb2]
}
\u0634 {
#sheen
set word [string replace $word end end \ufeb6]
}
\u0635 {
#ssaad
set word [string replace $word end end \ufeba]
}
\u0636 {
#ddaad
set word [string replace $word end end \ufebe]
}
\u0637 {
#ttaa2
set word [string replace $word end end \ufec2]
}
\u0638 {
#tthaa2
set word [string replace $word end end \ufec8]
}
\u0639 {
#3ayn
set word [string replace $word end end \ufeca]
}
\u063a {
#ghyn
set word [string replace $word end end \ufece]
}
\u0641 {
#faa2
set word [string replace $word end end \ufed2]
}
\u0642 {
#quaaf
set word [string replace $word end end \ufed6]
}
\u0643 {
#kaaf
set word [string replace $word end end \ufeda]
}
\u0644 {
#laam
set word [ string replace $word end end \ufede ]
}
\u0645 {
#meem
set word [string replace $word end end \ufee2]
}
\u0646 {
#noon
set word [string replace $word end end \ufee6]
}
\u0647 {
#haa2
set word [string replace $word end end \ufeea]
}
\u0648 {
#waaw
set word [string replace $word end end \ufeee]
}
\u0624 {
#waaw with hamza above
set word [ string replace $word end end \ufe86]
}
\u0649 {
#alef maqsura
set word [string replace $word end end \ufef0]
}
\u064a {
#yaa2
set word [string replace $word end end \ufef1]
}
default {
#default is nothing to do
}
}
}
# end of if the character before the last is a member of the list
# $initials
#now reverse every occurrence of the word for correct displaying on the
#screen
set arabic_string [
regsub -all "\\m$original_word\\M" $arabic_string $word]
#add and replace the corrected/conversion-of word with malformed one. in
#the arabic sentence
#the whole words in the sentence yet are still in the reverse order
#lset words $count $word
#move to the next word
incr count
}
#The following 2 line is left for you to see the final result. just remove
#the comment sign (#)
#tk_messageBox -message $words
#puts "before return: $arabic_string \n is_messageBox=$is_messageBox"
#reverse the whole string
set arabic_string [string reverse $arabic_string]
#If you see that the ASCII string parts of the whole Arabic/Unicode are
#reversed, then add another one and only one additional parameter to the
#Arabic/Unicode string and set it only to
#"1" (the number ONE).
if { $is_messageBox ==1 } {
foreach part [list_of_all_ascii_parts_a_unicode_string $arabic_string] {
set part_string [string reverse [ lindex $part 0 ]]
set start_of_ascii [ lindex $part 1 ]
set length_part_string [string length $part_string]
set arabic_string [string replace $arabic_string $start_of_ascii [expr $start_of_ascii + $length_part_string -1] $part_string]
}
}
return $arabic_string
}Usage edit
An example of how to call render_arabic $arabic_string $is_messageBox:#The Arabic sentence in the variable $msg below means in English : The program
#is not available or changeable (it should be here the word "NOT" (غير) instead
#of "CHANGEABLE" (متغير) but for debugging purposes there's an additional
#character) exists/existed
set msg [
render_arabic "الـبـرنـامـج غـيـر مـتـوفـر او مـتـغـيـر مـوجـود حـالئـيـا"]
puts $msg
#If you see that the ASCII string parts of the whole Arabic/Unicode are
#reversed, then add another one and only one additional parameter to the
#Arabic/Unicode string and set it only to #"1" (the number ONE).
#It might not appear to be a second parameter here, but just as you type you'll know it's a second parameter. Just test it and see for yourself
set msg [
render_arabic "الـبـرنـامـج غـيـر مـتـوفـر او مـتـغـيـر مـوجـود حـالئـيـا" 1]
tk_messageBox -detail $msgtext_binding_for_rtl $text_widget_path $k
This procedure will eventually call render_arabic $arabic_string .proc text_binding_for_rtl {text_widget_path k} {
#Caution!!! *DO NOT* delete the next line of [set event_counter 0] .
set event_counter 0
proc local_text_binding_for_rtl {text_widget_path k} {
#Algorithm:
#%k is the keycode number (%k is an integer not a hexadecimal) of the
#pressed key.
#we print it for correct matching and selection of the keys being
#pressed.
#If the %k is neither the spacebar or the Enter don't process anthing.
#We only want to process the very last word.
#And the Enter or the Spacebar are the triggers and the signals just to
#say we have a new word being written in the entry.
#Therefore, exctract the last word. The last word must be checked to
#see whether it's completely an ASCII(Latin) string.
#If it's, then don't touch it and skip to the next word
#Always keep the latin/ASCII words as they are.
#Else, process the word as an arabic word using the procedure
#[render_arabic] defined above
#Also map the last word to the rest of the text widget string just to save time
puts "k= $k"
global event_counter
#Only Key-Enter and Key-Space ban allow the processing of the last word
if {$k != 65 && $k != 36} {
set event_counter 0
return
} else {
incr event_counter
}
#get the whole text as one string
#check whether it's an [entry] widget or if it's an [text] widget
set text_widget_type [winfo class $text_widget_path]
set all_text ""
switch -- $text_widget_type {
Text {
set all_text [$text_widget_path get 1.0 end]
}
Entry {
set all_text [$text_widget_path get ]
}
}
#if the text is empty or full of so many spaces, then return doing
#nothing
set trimmed_text [string trim $all_text]
if {$trimmed_text == {}} {
return
}
if {$event_counter == 1} {
#after trimming white spaces from both ends of the whole text,
#convert the text into a list of words to find the last word in the
#text set words [split [string trim $all_text] ]
#for the sake of debugging and monitoring, print the list of words.
puts $words
#get the last word
set last_word [lindex $words end]
#print the last word for debugging.
puts $last_word
#the word is completely ASCII, skip to the next word by quitting this
#event
if {[string is ascii $last_word]} {
return
}
#we render the last word, which is supposed to be an arabic word
set last_word_after_rendering [render_arabic $last_word]
#print the last word after rendering for debugging
puts $last_word_after_rendering
#replace every occurence of the original last word with the last word
#after rendering in all the text
#but because it everything is already rendered but the last word, it
#will be only a replacement of the last word with
#the last_word_after_rendering
set all_text [
string map [
split "$last_word $last_word_after_rendering"] $all_text]
set all_text "[string trimright $all_text] "
#and finally re-assign the text again to the entry or text widget
switch -- $text_widget_type {
Text {
$text_widget_path replace 1.0 end $all_text
}
Entry {
$text_widget_path delete 0 end
$text_widget_path insert end $all_text
}
}
}
}
local_text_binding_for_rtl $text_widget_path $k
}Usage
An example of how to call text_binding_for_rtl $text_widget_path $k :entry .myent
entry .input
text .mytxt
set list_of_entries [list .myent .mytxt .input]
foreach an_entry $list_of_entries {
bind $an_entry <Key> [list text_binding_for_rtl %W %k]
}[linuxpeter] - 2015-08-22 10:21:51Thank you, Ahmad, for your great program! But it only works with unvowelled text, which is used in non-religious texts. While this text is displayed correctly in a Tk widget: مجانا لك This Bible quotation is not: كِتَابُ مِيلاَدِ يَسُوعَ الْمَسِيحِ ابْنِ دَاوُدَ ابْنِ إِبْراهِيمَ: Can you think of a way to make the prog skip the vowel signs when computing?



