[unixODBC-support] Unicode Support for UCS-2 databases

Nick Gorham nick at lurcher.org
Sat Jul 25 09:03:39 BST 2009


Ingmar Koecher [ NETIKUS.NET ltd ] wrote:
>> -----Original Message-----
>> From: unixodbc-support-bounces at mailman.unixodbc.org 
>> [mailto:unixodbc-support-bounces at mailman.unixodbc.org] On 
>> Behalf Of Nick Gorham
>> Sent: Friday, July 24, 2009 5:55 PM
>> To: Support for the unixODBC project
>> Subject: Re: [unixODBC-support] Unicode Support for UCS-2 databases
>>
>> Ingmar Koecher [ NETIKUS.NET ltd ] wrote:
>>     
>>> Hello,
>>>
>>> I am having some difficulties adding UTF-16 encoded data to UCS-2
>>> databases (e.g. SQL Server) using unixODBC. Most of the problems seem
>>> to appear as soon as I attempt to use any of the W() functions (e.g.
>>> SQLDriverConnectW()) as opposed to the ASCII counterparts.
>>>
>>> If I read a UTF-8 encoded file on Linux, for example, and add it to a
>>> MySQL UTF-8 database, then it works and I don't even have to do
>>> anything (other than enclosing the field with N''). So this works well.
>>>
>>> If I try to write to a UTF-16/UCS-2 database, however, I start having
>>> all sorts of problems. If I store a SQL statement:
>>>
>>> wchar_t sqlStmt[] = L"INSERT INTO MyTable (field) values (?)";
>>>
>>> then SQLExecDirectW will complain (or rather the database will), as it
>>> only sees the first character, the "I". It is almost as if it's
>>> expecting an ASCII string.
>>>
>>> However, at this point I can't even connect using SQLDriverConnectW()
>>> when passing a wchar_t string:
>>>
>>> SQLDriverConnectW (hdbc, 0, (SQLWCHAR *) L"MyDsnName", ....);
>>>
>>> as it complains that it's not a valid DSN. My guess is that it's only
>>> looking at the first character here as well - or does the odbc.ini file
>>> actually need to be UTF-16 encoded? Is there something else that I
>>> need to do to get this to work?
>>>
>>> Is there any sample code that shows how to deal with UTF-16/UCS-2 
>>> data, or is this not very common?
>>>
>>> I'm pretty much at a loss here. The biggest problem I cannot seem to
>>> resolve is how to get UTF-8 data on non-Windows platforms into a
>>> UCS-2 database. I can convert the UTF-8 string into a UTF-16 string,
>>> but that's about it.
>>>
>>> I've tried to find information about wchar_t handling in unixODBC and
>>> the ....W() functions, with little success though.
>>>
>>> Any insight that can be provided would be greatly appreciated.
>>>
>>>
>>> Thank you,
>>> Ingmar.
>>>
>>>   
>>>       
>> I would check, but wchar_t is often 32 bits, not the 16 bits
>> ODBC expects.
>>
>> -- 
>> Nick
>>     
>
> Thanks Nick. Yes, wchar_t is 4 bytes on OS X (and on Linux as well, it
> appears), so I suppose that could be an issue. I thought, however, that
> the ODBC implementation would take that into consideration. I figured
> that the ....W() functions on OS X / Linux would work correctly with a
> wchar_t, regardless of its storage size.
>
> Am I mistaken?
>   
Yep, sorry, it's 16-bit, as on Windows. At least for unixODBC; other driver
managers seem to vary from 8 to 32 bits. You can build unixODBC to use
4-byte Unicode, but then you need to find drivers that do the same.

-- 
Nick
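
This would explain the symptom in the original report: with a 4-byte
wchar_t on a little-endian machine, L"INSERT ..." is stored as 'I' followed
by three zero bytes, so code reading 16-bit units sees "I" and then a
terminator. Below is a minimal sketch of one way to convert a wchar_t
string into 16-bit UTF-16 units before handing it to the W() functions. It
assumes wchar_t holds Unicode code points and that unixODBC is built in its
default mode with a 16-bit SQLWCHAR. The names odbc_wchar16 and
wchar_to_utf16 are hypothetical helpers for illustration, not part of
unixODBC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical stand-in for SQLWCHAR in a default (16-bit) unixODBC build.
   Real code would include <sqlucode.h> and use SQLWCHAR directly. */
typedef uint16_t odbc_wchar16;

/* Convert a NUL-terminated wchar_t string to NUL-terminated UTF-16.
   Assumes each wchar_t holds one Unicode code point (typical on Linux
   and OS X, where wchar_t is 32 bits). If wchar_t is already 16 bits,
   the units are copied through unchanged.
   Returns the number of 16-bit units written, not counting the
   terminator, or (size_t)-1 if the buffer is too small or a value is
   outside the Unicode range. */
size_t wchar_to_utf16(const wchar_t *in, odbc_wchar16 *out, size_t out_cap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    for (; *in != L'\0'; ++in) {
        uint32_t cp = (uint32_t)*in;

        if (cp < 0x10000u) {
            /* Basic Multilingual Plane: one 16-bit unit. */
            if (n + 1 >= out_cap)
                return (size_t)-1;
            out[n++] = (odbc_wchar16)cp;
        } else if (cp <= 0x10FFFFu) {
            /* Supplementary plane: encode as a surrogate pair. */
            if (n + 2 >= out_cap)
                return (size_t)-1;
            cp -= 0x10000u;
            out[n++] = (odbc_wchar16)(0xD800u + (cp >> 10));
            out[n++] = (odbc_wchar16)(0xDC00u + (cp & 0x3FFu));
        } else {
            return (size_t)-1; /* not a valid code point */
        }
    }

    if (n >= out_cap)
        return (size_t)-1; /* no room for the terminator */
    out[n] = 0;
    return n;
}
```

Under the same assumptions, the converted buffer could then be passed as
(SQLWCHAR *) to a call such as SQLExecDirectW() with SQL_NTS as the length.
A strictly UCS-2 database may not accept the surrogate pairs produced for
code points above U+FFFF, so that case is worth checking separately.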

