HighTechTalks DotNet Forums  

Is there a way to get the character that cannot be converted to an encoding?

Dotnet Internationalization microsoft.public.dotnet.internationalization





#1  Benny, 02-28-2006, 01:11 AM

When using StreamReader to read a file, any character that cannot be
converted in the given encoding is changed to a replacement character, such as '?'.
Is there a way to know what the original char (bytes) was?


For example, my file has the following lines:
・ 名詞相当語句:
・ think it’s ~ t :
・名詞

The hex values:
EC 7D F6 BE 81 40 ......
EC 59 81 40......
81 45 .....

The original byte values of the three '・' characters are different:
EC7D, EC59 and 8145.

But when I read the file with the following code,
they are all changed to '・' (Unicode: 0x30FB, Shift-JIS: 8145).

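// Decode the file as Shift-JIS; bytes with no mapping are silently replaced.
System.IO.StreamReader sr = new System.IO.StreamReader(
    @"C:\test.txt", System.Text.Encoding.GetEncoding("shift-jis"));
string line = sr.ReadLine();  // first line
line = sr.ReadLine();         // second line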

Is there a way to know what the original value of each '・' was?

#2  Mihai N., 02-28-2006, 11:27 PM

Quote:
When using StreamReader to read a file, any character that cannot be
converted in the given encoding is changed to a replacement character, such as '?'.
Is there a way to know what the original char (bytes) was?

Not easy.
One idea is to read the file using the encoding (which gives you a Unicode
string), then convert it back to the original encoding and compare the result
with the original at the byte level.
But once you do such low-level work, the ease of use of something high-level
like StreamReader is gone.
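
The round-trip idea above could be sketched roughly as follows. This is only an illustration, not a tested solution for the file in the question: the class name RoundTripCheck and the method FindLossyBytes are made up for this example. It feeds the raw bytes to a Decoder one at a time, re-encodes each decoded character, and reports the offset and original bytes wherever the round trip does not reproduce the input. It assumes the decoder emits a character as soon as its last byte arrives, which may not hold exactly for every encoding or for malformed input.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RoundTripCheck
{
    // Hypothetical helper: returns (offset, original bytes) for every decoded
    // unit whose bytes change after decode + re-encode.
    public static List<KeyValuePair<int, byte[]>> FindLossyBytes(byte[] data, Encoding encoding)
    {
        var result = new List<KeyValuePair<int, byte[]>>();
        Decoder decoder = encoding.GetDecoder();
        char[] chars = new char[16];   // room for one decoded unit
        int start = 0;                 // offset where the current character began

        for (int i = 0; i < data.Length; i++)
        {
            bool flush = (i == data.Length - 1);
            int charCount = decoder.GetChars(data, i, 1, chars, 0, flush);
            if (charCount == 0)
                continue;              // lead byte buffered; wait for the next byte

            // Re-encode what was decoded and compare with the original bytes.
            int length = i - start + 1;
            byte[] reencoded = encoding.GetBytes(chars, 0, charCount);
            bool same = reencoded.Length == length;
            for (int k = 0; same && k < length; k++)
                same = reencoded[k] == data[start + k];

            if (!same)
            {
                byte[] original = new byte[length];
                Array.Copy(data, start, original, 0, length);
                result.Add(new KeyValuePair<int, byte[]>(start, original));
            }
            start = i + 1;
        }
        return result;
    }
}
```

For the file in the question, one would pass System.IO.File.ReadAllBytes(@"C:\test.txt") and Encoding.GetEncoding("shift-jis"). On .NET Core and later, the code page encodings must first be registered with Encoding.RegisterProvider(CodePagesEncodingProvider.Instance). How the Shift-JIS decoder splits an unmappable pair such as EC 7D can vary between runtimes, so the reported byte ranges should be checked against a hex dump.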


--
Mihai Nita [Microsoft MVP, Windows - SDK]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email

