[unixODBC-dev] ANSI to Unicode mapping issues (resend)

Nick Gorham nick at lurcher.org
Fri May 2 08:19:56 BST 2014


On 02/05/14 02:31, David Brown wrote:
> (Pardon me if this is a duplicate - I tried sending it a few days ago 
> from a different address, but it didn't appear to go through)
>
> We have been building and shipping an older ANSI version of our ODBC
> driver (StarSQL) in Unix/Linux environments. We recently ported our
> current Unicode ODBC driver (which has been running on Windows for
> several years) to Linux, and ran into some issues that appear to be
> related to the unixODBC Driver Manager mappings from ANSI entry point
> to the driver's Unicode entry points when an ANSI application invokes
> ODBC calls to a Unicode driver.
>
> Has anyone else encountered any of these issues?  Thoughts on a solution?
>
> We are using the 2.3.2 release.
>
> Here is a list of the issues encountered by the developer of our driver:
>
> 1)      The Driver Manager does not map an ANSI application's calls to
> SQLGet/SetStmtOption to a Unicode driver's SQLGet/SetStmtAttrW entry
> points. It only does the mapping to SQLGet/SetStmtAttr for ANSI
> drivers. We were able to work around this by adding
> SQLGet/SetStmtOption function entry points in our driver, but we
> shouldn't have to do that.

Will look at it when I get a chance.
>
> 2)      SQLSetDescField does not alter the length supplied by the
> application ("buffer_length") when the field supplied is a string
> whose value gets converted to Unicode before being passed to the
> Unicode Driver. In this particular Unicode ODBC API, the buffer_length
> should be a byte-count, not a character-count. The implementation of
> SQLGetDescField in the unixODBC driver manager does deal with this
> better and divides string_length by sizeof(SQLWCHAR) before returning
> to the application. That works better, but is too simplistic for
> multi-byte ANSI data (e.g. UTF-8). See #3.

Try 2.3.3pre; that may already have been done there.
>
> 3)      Conversions between Unicode and ANSI are almost universally
> assuming that one byte of ANSI data will produce two bytes of Unicode
> data (when sizeof(SQLWCHAR) is 2). The code needs to check the length
> of the resulting string (ANSI or Unicode) whenever such a conversion
> occurs and then use the resulting length when passing it on to the
> driver or calling application. Functions like the ANSI versions of
> SQLPrepare and SQLExecDirect can't just perform an ansi-to-unicode
> translation and then pass the application supplied length to the
> Unicode driver.

The A-W and W-A conversions in the driver manager were meant to be 
simplistic, assuming that unicode drivers would also export the ansi 
entry points and handle their own conversions. The iconv option should 
allow this; I will check, see (1). Remember that when this was 
initially written, there was no case where one byte of ANSI would not 
produce two bytes of unicode. In general I think it's still true, but 
the use of UTF-8 starts to break this.
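
As a rough illustration of the length problem in (3), here is a minimal C
sketch of a UTF-8 to UTF-16 conversion that reports how many 16-bit units it
actually produced, so the caller can pass on the converted length rather than
the one the application supplied. The name utf8_to_utf16 and its signature
are hypothetical and not taken from the unixODBC sources. It assumes
sizeof(SQLWCHAR) is 2 and, for brevity, does not reject overlong sequences or
encoded surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a unixODBC function).
 * Converts in_bytes bytes of UTF-8 into UTF-16 code units.
 * If out is NULL, nothing is written and only the count is computed.
 * Returns the number of 16-bit units, or (size_t)-1 on malformed or
 * truncated input, or when out_cap units are not enough.
 */
size_t utf8_to_utf16(const unsigned char *in, size_t in_bytes,
                     uint16_t *out, size_t out_cap)
{
    size_t i = 0, n = 0;

    assert(in != NULL || in_bytes == 0);

    while (i < in_bytes) {
        unsigned char b = in[i++];
        uint32_t cp;
        size_t extra;

        /* The lead byte tells us how many continuation bytes follow. */
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1;

        if (in_bytes - i < extra)
            return (size_t)-1;              /* truncated sequence */

        while (extra--) {
            unsigned char c = in[i++];
            if ((c & 0xC0) != 0x80)
                return (size_t)-1;          /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp >= 0x10000) {
            /* Outside the BMP: needs a surrogate pair, i.e. two units. */
            if (cp > 0x10FFFF)
                return (size_t)-1;
            if (out) {
                if (out_cap - n < 2)
                    return (size_t)-1;
                out[n]     = (uint16_t)(0xD800 | ((cp - 0x10000) >> 10));
                out[n + 1] = (uint16_t)(0xDC00 | ((cp - 0x10000) & 0x3FF));
            }
            n += 2;
        } else {
            if (out) {
                if (out_cap - n < 1)
                    return (size_t)-1;
                out[n] = (uint16_t)cp;
            }
            n += 1;
        }
    }
    return n;
}
```

With this, the 5-byte UTF-8 string "café" yields 4 units (8 bytes when
sizeof(SQLWCHAR) is 2), not the 5 units or 10 bytes that a
one-byte-to-one-character assumption would give. A byte-count argument such
as buffer_length in (2) would then be the returned count multiplied by
sizeof(SQLWCHAR).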
>
>
>
> Looking at the unixODBC code, it seems clear that we were exposed to
> similar issues with our old ANSI driver when called from a Unicode
> application. Applications using parameter markers rather than string
> literals would be less sensitive to the limitations of the current
> unixODBC driver manager implementation since keywords and identifiers
> are less likely to contain "problematic" characters, but it would seem
> important to address this nonetheless.
>
> Any suggestions would be appreciated.
>
See (1).

-- 
Nick

