Among the problems that face a systems developer, and to which much of this conference is addressed, is the conversion of text processing internals from a model that assumes 8 bits per character and single-byte, null-terminated strings to one that handles Unicode, usually assumed to be the 16-bit UCS-2 encoding. The obvious solution is to replace the character processing internals with Unicode-enabled routines, but this may not be as simple as it sounds, depending on the developer's other technical requirements or the architectural limitations of the product. Some may use UTF-8 for a time; others may move directly to UCS-2.
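To make the difficulty concrete, the following is a minimal sketch, not taken from any particular product, of why routines written for single-byte, null-terminated strings cannot simply be pointed at UCS-2 data. The sample text and the helper names `c_strlen` and `ucs2_strlen` are assumptions made for illustration; `c_strlen` only mimics the behaviour of a byte-oriented length routine.

```python
# Illustrative sketch: sample text and helper names are hypothetical.
text = "Abé"

legacy = text.encode("latin-1") + b"\x00"      # 1 byte per character, 1-byte terminator
utf8 = text.encode("utf-8") + b"\x00"          # 1 to N bytes per character, 1-byte terminator
ucs2 = text.encode("utf-16-le") + b"\x00\x00"  # 16-bit code units, 16-bit terminator


def c_strlen(buf: bytes) -> int:
    """Mimic a byte-oriented strlen: count bytes before the first zero byte."""
    return buf.index(0)


def ucs2_strlen(buf: bytes) -> int:
    """Count 16-bit code units before the first 16-bit zero terminator."""
    for i in range(0, len(buf) - 1, 2):
        if buf[i] == 0 and buf[i + 1] == 0:
            return i // 2
    raise ValueError("no 16-bit terminator found")


print(c_strlen(legacy))   # 3: bytes and characters coincide
print(c_strlen(utf8))     # 4: still terminates correctly, but counts bytes, not characters
print(c_strlen(ucs2))     # 1: stops at the zero high byte of 'A'
print(ucs2_strlen(ucs2))  # 3: a routine aware of 16-bit units is required
```

The sketch also suggests why some developers adopt UTF-8 as an interim step: a byte-oriented routine still finds the terminator in UTF-8 data, although any logic that equates bytes with characters has to be revisited.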
The physical data structures on disk are to a database's anatomy what bones are to a human's anatomy. It is easier to get a heart transplant than it is to get a new rib cage or skull. Hence two requirements appear for existing installations that are to be upgraded: preserving legacy data and minimising upgrade time. Three scenarios are appearing in the industry to accomplish this: create a separate Unicode-only installation, create a parallel Unicode-only database, or allow Unicode as a separate datatype distinct from the existing character datatypes. Enabling customers to convert existing data in place must be considered as well. Some vendors have particularly stringent space limitations, so compression of Unicode becomes an issue.
Database providers must decide whether and how to bind Unicode to existing datatypes (int, char, date, etc.) and how to bind existing datatypes back to Unicode. Implicit conversions (what the SQL standard calls "coercibility") as well as explicit conversions must be considered. Since Unicode may be used in a native word format on the CPU, byte-swapping must be considered. Once Unicode is implemented, third-party legacy software must still be supported, with data sent from a server appropriately converted to datatypes and encodings the older clients can understand. The client software must be able to communicate the session's cultural environment, or locale, to the server. Designing to ODBC and JDBC, or to client interfaces such as Open Client or SQL*Net, addresses many of these issues. Vendors must also support their own proprietary data and network protocols.
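Two of the concerns above, byte-swapping and conversion for older clients, can be sketched briefly. This is an illustrative example under stated assumptions, not a description of any vendor's wire protocol: the function names `swap_ucs2_bytes` and `to_legacy_client`, the sample text, and the choice of Latin-1 as the legacy client encoding are all hypothetical.

```python
# Illustrative sketch: function names and the Latin-1 client are hypothetical.

def swap_ucs2_bytes(buf: bytes) -> bytes:
    """Swap the two bytes of every 16-bit code unit (big-endian <-> little-endian)."""
    if len(buf) % 2:
        raise ValueError("UCS-2 data must contain an even number of bytes")
    swapped = bytearray(len(buf))
    swapped[0::2] = buf[1::2]  # old low bytes move to the high positions
    swapped[1::2] = buf[0::2]  # old high bytes move to the low positions
    return bytes(swapped)


def to_legacy_client(buf_le: bytes, client_encoding: str) -> bytes:
    """Convert little-endian 16-bit data to a legacy client's encoding.

    Characters the client cannot represent are replaced with '?'.
    """
    return buf_le.decode("utf-16-le").encode(client_encoding, errors="replace")


# Data as a big-endian server might store it: GREEK CAPITAL OMEGA, then 'a'.
big_endian = "Ωa".encode("utf-16-be")          # bytes 03 A9 00 61
little_endian = swap_ucs2_bytes(big_endian)    # bytes A9 03 61 00

print(little_endian == "Ωa".encode("utf-16-le"))   # True
print(to_legacy_client(little_endian, "latin-1"))  # b'?a'
```

Substituting `?` for unrepresentable characters is only one possible policy. A server could equally reject the request or negotiate a different encoding with the client, which is one reason the client must be able to tell the server its locale.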