#1
#2
The application gets some text (from XML) and is supposed to display it. However, this XML contains data from several cultures, and some of it comes from RTL ones (e.g. the text is Hebrew). Now I need to find out whether I should align the text to the left or to the right. Is there any function, either in .NET or in Win32, that would determine this for me? I could get the first character and test whether it is Arabic, Hebrew and so on, but I'd likely miss some case (or a future one), so I'm looking for a more general way of doing that.
#3
> The application gets some text (from XML) and is supposed to display it. However, this XML contains data from several cultures, and some of it comes from RTL ones (e.g. the text is Hebrew). Now I need to find out whether I should align the text to the left or to the right. Is there any function, either in .NET or in Win32, that would determine this for me? I could get the first character and test whether it is Arabic, Hebrew and so on, but I'd likely miss some case (or a future one), so I'm looking for a more general way of doing that.

This is how you determine if some culture needs RTL rendering:
http://blogs.msdn.com/michkap/archiv...12/663013.aspx

But you need to have a way in the XML itself to tag data with a culture. There is no 100% safe way to determine whether text is RTL based on the text content only. Imagine you have a mixture like this: "XXXXX YYYYY", with XXXXX some English text and YYYYY some Arabic text. Is that English with an Arabic inset, or Arabic with an English inset?

--
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
#4
Thank you for the answer. However, Michael's post expects a CultureInfo. With one, because I'm targeting a newer .NET Framework, I could use CultureInfo.TextInfo.IsRightToLeft.

Okay, I know your sample would be a problem. So, how do I check it for a single character? Is there any way to test for all RTL cases?

Actually, I think I do have an ISO-639-2 tag for the text, but I'm not sure whether it is worth creating separate text-flow information from those tags.
#5
Without a CultureInfo you can try calling (the native) GetStringTypeEx. It takes a locale ID, but you can use whatever you want: the strong attributes in CT_CTYPE2 (C2_RIGHTTOLEFT/C2_LEFTTORIGHT) are not affected by locale. But there is still no reliable way to test for all RTL cases. Sometimes not even a human can do it.

I think most of the time text content is in a single language. A document is mostly in language A, with small chunks of other languages. But those areas have to be tagged. Designing a document where all the languages are mixed, without properly tagging them, is not very useful. Think of MS Word, where you can mark text sections with a different language for spell-checking.

If possible, it would be a good idea to tag the documents (if not paragraphs, or records, or whatever) with a full locale ID, RFC 4646 style. There are quite a few things that cannot be done properly without locale info. For example, sorting and case conversion are culture-sensitive. So is font selection (you cannot use a Chinese Traditional font for Chinese Simplified text, even when the text is identical). In fact, unless all you do is move text around (no processing, no display), it is best to know the locale of that text.
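The "strong" direction attributes mentioned above correspond roughly to the Unicode bidirectional classes, which many runtimes expose. As a rough, language-neutral sketch (written in Python rather than with the Win32 or .NET calls discussed in this thread, and not a replacement for them), the first strongly directional character of a string could be found like this. The name `first_strong_direction` and the default of "ltr" are assumptions made for illustration only.

```python
import unicodedata

# Bidi classes treated as "strong" in this sketch (an assumption):
# "L" is left-to-right; "R" and "AL" are right-to-left letters.
RTL_CLASSES = {"R", "AL"}
LTR_CLASSES = {"L"}

def first_strong_direction(text, default="ltr"):
    """Return "rtl" or "ltr" from the first strongly directional character.

    Digits, spaces and punctuation are skipped because they carry no
    strong direction; `default` is returned if nothing strong is found.
    """
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            return "rtl"
        if bidi in LTR_CLASSES:
            return "ltr"
    return default
```

As the post says, this only works when the text is essentially in one language; for a mixed string it simply reports whichever script happens to come first.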
#6
> ... will always be whole (or rarely except a word or two) within the same language. So I can afford to just check the first character in a title for example.

If you don't notice any performance hit, try going beyond the first

> The problem here is that I have data in languages which do not match any existing culture, like Latin, Old or Middle English and so on; artificial languages are not excluded either.
#7
> Without a CultureInfo you can try calling (the native) GetStringTypeEx. It takes a locale ID, but you can use whatever you want: the strong attributes in CT_CTYPE2 (C2_RIGHTTOLEFT/C2_LEFTTORIGHT) are not affected by locale. But there is still no reliable way to test for all RTL cases. Sometimes not even a human can do it.

I will give it a try. I just want to avoid writing (not to mention that I did not find any way of doing such a check in .NET)

if (char is Arabic || char is Hebrew || char is Urdu || char is Persian || char is Syriac)

and then forgetting the Divehi case, or any new culture that comes along. I thought that an approach of "if the version of Windows (or .NET) I am running thinks it is RTL, I should think so as well" would do the trick.

> I think most of the time text content is in a single language. A document is mostly in language A, with small chunks of other languages. But those areas have to be tagged. Designing a document where all the languages are mixed, without properly tagging them, is not very useful. Think of MS Word, where you can mark text sections with a different language for spell-checking.

Yes, I agree; I wanted to mention it with your example too. I know the text I'm displaying will always be whole (or rarely except a word or two) within the same language. So I can afford to just check the first character in a title, for example.

> If possible, it would be a good idea to tag the documents (if not paragraphs, or records, or whatever) with a full locale ID, RFC 4646 style. There are quite a few things that cannot be done properly without locale info. For example, sorting and case conversion are culture-sensitive. So is font selection (you cannot use a Chinese Traditional font for Chinese Simplified text, even when the text is identical). In fact, unless all you do is move text around (no processing, no display), it is best to know the locale of that text.

Well, fortunately enough, I define the schema here, so I could make some changes or improvements.

I have a set of data coming from different cultures and, as Michael has written in the blog and also suggested to me, the user most likely expects behaviour based on his own culture. So I do sorting of this data and case-insensitive searching in the context of the user's culture. All I do with the data itself is display it. For that reason, and because of WPF, I need to have an idea whether I should mark the document as RTL. The only other reason for knowing the CultureInfo that I could come up with is the ToTitleCase method, but I expect the titles of documents are already properly cased.

The problem here is that I have data in languages which do not match any existing culture, like Latin, Old or Middle English and so on; artificial languages are not excluded either. Filtering data to show only the items in Middle English (enm) is far more important to my application than having a CultureInfo for the language, since I only need to display it. This is the reason I chose the ISO-639-2 table instead of the .NET supported cultures.

If there were a table mapping ISO-639-2 or -3 languages to appropriate CultureInfo classes, even if not accurate, my problems would be solved. The document could be kept with the ISO marks and the application would get the corresponding CultureInfo for displaying it properly. Until then, GetStringTypeEx will do the work, I think.

Thank you for your hints and thoughts.

Jan
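Since no ready-made table from ISO-639-2 codes to cultures is mentioned in the thread, one possible stopgap is a small hand-maintained set of codes for languages that are normally written right-to-left. The sketch below is in Python for portability, not .NET; the set `RTL_LANGUAGE_TAGS`, the function `direction_for_tag`, and the choice to treat every unlisted code as left-to-right are all hypothetical, and the set only covers the languages named in this thread, so it is deliberately incomplete.

```python
# Hypothetical, deliberately incomplete set of ISO 639-2 codes for the
# RTL languages named in this thread: Arabic, Hebrew, Urdu, Persian
# (both "per" and "fas" are in use), Syriac and Divehi.
RTL_LANGUAGE_TAGS = {"ara", "heb", "urd", "per", "fas", "syr", "div"}

def direction_for_tag(tag):
    """Return "rtl" or "ltr" for an ISO 639-2 code, or None if no tag is given.

    Assumption of this sketch: any code not listed above is treated as
    left-to-right, which suits codes such as "enm" or "lat".
    """
    if not tag:
        return None
    return "rtl" if tag.strip().lower() in RTL_LANGUAGE_TAGS else "ltr"
```

A caller could use the tag when one is present and fall back to inspecting the characters only when `direction_for_tag` returns None.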
#8
Maybe calculate a percentage (72% RTL, 12% LTR, 6% others), establish a threshold, and go from there.

If you can control the environment (and it is Vista) you can create your own custom locales.
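The percentage idea could look roughly like the following. This is a minimal sketch in Python under stated assumptions, not the Win32 or .NET API: it classifies characters by Unicode bidirectional class, ignores characters with no strong direction, and uses hypothetical names (`rtl_fraction`, `is_mostly_rtl`) and an arbitrary default threshold of 0.5.

```python
import unicodedata

def rtl_fraction(text):
    """Fraction of strongly directional characters that are right-to-left."""
    rtl = ltr = 0
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):      # strong right-to-left letters
            rtl += 1
        elif bidi == "L":            # strong left-to-right letters
            ltr += 1
        # everything else (digits, spaces, punctuation) is ignored
    strong = rtl + ltr
    return rtl / strong if strong else 0.0

def is_mostly_rtl(text, threshold=0.5):
    """True when the RTL share of strong characters exceeds the threshold."""
    return rtl_fraction(text) > threshold
```

The threshold is a policy choice: a higher value makes the check more conservative about flipping a mostly LTR document that contains a few RTL words.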
#9
Jan,

You can use code like in this post:
http://blogs.msdn.com/michkap/archiv...6/1421178.aspx

or use GetStringTypeW to get the info back.
#10
"Michael S. Kaplan [MSFT]" <michka (AT) online (DOT) microsoft.com> wrote in message news:eGaaJ7p9HHA.5456 (AT) TK2MSFTNGP05 (DOT) phx.gbl...
> Jan, You can use code like in this post: http://blogs.msdn.com/michkap/archiv...6/1421178.aspx or use GetStringTypeW to get the info back.

Hmmm... thanks for the managed way, Michael! Although I'd have to find a very good reason to leave PInvoke and move to Reflection... ;-)

Any improvements in .NET 3.0 or 3.5?

Jan