HighTechTalks DotNet Forums  

MultiByteToWideChar in .NET - Multibyte to Unicode conversion

Dotnet Internationalization microsoft.public.dotnet.internationalization


AT
 
MultiByteToWideChar in .NET - Multibyte to Unicode conversion - 11-15-2005, 10:19 PM

I have a C# application that needs to convert multibyte strings to
Unicode.
However, I cannot get MultiByteToWideChar to behave as expected within
.NET.
I have declared it as follows:

[DllImport("Kernel32", CharSet = CharSet.Auto)]
static extern Int32 MultiByteToWideChar(
    UInt32 codePage,
    UInt32 dwFlags,
    [In, MarshalAs(UnmanagedType.LPStr)] String lpMultiByteStr,
    Int32 cbMultiByte,
    [Out, MarshalAs(UnmanagedType.LPWStr)] StringBuilder lpWideCharStr,
    Int32 cchWideChar);

I am using it as follows:

private string ConvertToUnicode(string str, uint codepage)
{
    int l = str.Length;
    int i = 0;
    i = MultiByteToWideChar(codepage, 0, str, -1, null, 0);
    StringBuilder wideStr = new StringBuilder(i);
    i = MultiByteToWideChar(codepage, 0, str, -1, wideStr,
        wideStr.Capacity);
    string s = wideStr.ToString();
    return s;
}

If I initialize a C# string with the following bytes (hex): 43, 3A, 5C, 83,
88, 83, 45, 83, 52, 83, 5C, 00 and pass it to the ConvertToUnicode function
above with code page 932 (Japanese), I get garbage (C:\???E?R?\).
However, using a pure .NET solution (below), I get the correct string
(C:\ヨウコソ):

private string MultibyteToUnicodeNETOnly(string str, int codepage)
{
    byte[] source = MCBSToByte(str);
    Encoding e1 = Encoding.GetEncoding(codepage);
    Encoding e2 = Encoding.Unicode;
    byte[] target = Encoding.Convert(e1, e2, source);
    return e2.GetString(target);
}

private byte[] MCBSToByte(string s)
{
    byte[] b = new byte[s.Length];
    int i = 0;
    foreach (char c in s)
        b[i++] = (byte)c;
    return b;
}

Any insights on a way to get MultiByteToWideChar to work, or a better
solution? Thanks in advance.


Michael (michka) Kaplan [MS]
 
Re: MultiByteToWideChar in .NET - Multibyte to Unicode conversion - 11-16-2005, 12:06 AM

What is wrong with the pure managed solution? It is even better and faster
in .NET 2.0.


--
MichKa [Microsoft]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Blog: http://blogs.msdn.com/michkap

This posting is provided "AS IS" with
no warranties, and confers no rights.
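
For readers following the thread, the managed route can be sketched as below, assuming the multibyte data is kept as a byte[] rather than being packed into a string first. The names ShiftJisDemo and DecodeMultiByte are hypothetical and do not come from the thread. The RegisterProvider call is an assumption for .NET Core / .NET 5 and later, where legacy code pages such as 932 are not available by default; on .NET Framework that line should not be needed. A likely reason the P/Invoke version returns garbage is that the LPStr marshaling converts the managed string to the system ANSI code page before MultiByteToWideChar ever sees it, so characters such as U+0083 that have no ANSI mapping arrive as '?'.

```csharp
using System;
using System.Text;

static class ShiftJisDemo
{
    // Decodes raw multibyte data using the given Windows code page.
    public static string DecodeMultiByte(byte[] data, int codePage)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before Encoding.GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(data);
    }

    static void Main()
    {
        // The bytes from the original post, without the trailing 00 terminator.
        byte[] raw = { 0x43, 0x3A, 0x5C, 0x83, 0x88, 0x83, 0x45, 0x83, 0x52, 0x83, 0x5C };
        string path = DecodeMultiByte(raw, 932);

        // Print code points instead of the text so the console encoding does not matter.
        Console.WriteLine(path.Length); // 7
        foreach (char c in path)
            Console.Write("U+{0:X4} ", (int)c);
        Console.WriteLine();
    }
}
```

Decoding straight from the byte[] also avoids the intermediate Encoding.Convert step and the char-to-byte cast in MCBSToByte, which only works while every char in the string is below 0x100.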
