[unixODBC-dev] ANSI to Unicode mapping issues (resend)

David Brown unixodbc at starquest.com
Fri May 2 02:31:27 BST 2014


(Pardon me if this is a duplicate - I tried sending it a few days ago from a 
different address, but it didn't appear to go through)

We have been building and shipping an older ANSI version of our ODBC driver
(StarSQL) in Unix/Linux environments. We recently ported our current Unicode
ODBC driver (which has been running on Windows for several years) to Linux,
and ran into some issues that appear to be related to the unixODBC Driver
Manager mappings from ANSI entry points to the driver's Unicode entry points
when an ANSI application invokes ODBC calls to a Unicode driver.

Has anyone else encountered any of these issues?  Thoughts on a solution?

We are using the 2.3.2 release.

Here is a list of the issues encountered by the developer of our driver:

1)      The Driver Manager does not map an ANSI application's calls to
SQLGet/SetStmtOption onto a Unicode driver's SQLGet/SetStmtAttrW entry
points. It only maps them to SQLGet/SetStmtAttr for ANSI drivers. We
were able to work around this by adding SQLGet/SetStmtOption function entry
points in our driver, but we shouldn't have to do that.

2)      SQLSetDescField does not adjust the length supplied by the
application ("buffer_length") when the field is a string whose value is
converted to Unicode before being passed to the Unicode driver. In this
particular Unicode ODBC API, buffer_length should be a byte count, not a
character count. The implementation of SQLGetDescField in the unixODBC
Driver Manager handles this better: it divides string_length by
sizeof(SQLWCHAR) before returning to the application. That works better,
but is too simplistic for multi-byte ANSI data (e.g. UTF-8). See #3.

3)      Conversions between Unicode and ANSI almost universally assume
that one byte of ANSI data produces two bytes of Unicode data (when
sizeof(SQLWCHAR) is 2). The code needs to check the length of the resulting
string (ANSI or Unicode) whenever such a conversion occurs and then use that
length when passing it on to the driver or calling application. Functions
like the ANSI versions of SQLPrepare and SQLExecDirect can't just perform an
ANSI-to-Unicode translation and then pass the application-supplied length to
the Unicode driver.


Looking at the unixODBC code, it seems clear that we were exposed to similar
issues with our old ANSI driver when called from a Unicode application.
Applications using parameter markers rather than string literals would be
less sensitive to the limitations of the current unixODBC Driver Manager
implementation, since keywords and identifiers are less likely to contain
"problematic" characters, but it would seem important to address this
nonetheless.

Any suggestions would be appreciated.

Thanks
David Brown
StarQuest





