
Unicode Implementations Overview

This overview groups the database vendors into three categories by chronological availability, and then by their physical data storage characteristics. The physical storage type was chosen because it has a direct bearing on how the data is processed (double-byte or multi-byte) and on the default behavior of the Data Definition Language (DDL). With UCS-2, a "CHAR(30) Unicode" declaration stores 30 Unicode characters at a fixed two bytes each. Fixed-width characters are more efficient to process, but Latin-1 data takes twice the space.

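With UTF-8, a "CHAR(30)" declaration reserves 30 bytes, so it stores between 10 and 30 Unicode characters depending on whether each character needs one, two, or three bytes. This carries a minor performance penalty due to "byte counting", but it does not significantly increase the space needed for Latin-1 data and can be implemented using existing multibyte schemes. Asian data, which typically takes two bytes per character in a double-byte encoding, takes three bytes per character in UTF-8, so its footprint grows by 50%.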
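The storage arithmetic above can be checked with a short Python sketch. It is only an illustration and does not reflect any vendor's implementation: the helper names encoded_size and chars_that_fit are hypothetical, the 30-byte limit assumes that "CHAR(30)" is measured in bytes under UTF-8, and Python's "utf-16-le" codec stands in for UCS-2 (the two are identical for characters in the Basic Multilingual Plane).

```python
def encoded_size(text: str, encoding: str) -> int:
    """Return the number of bytes `text` occupies in `encoding`."""
    return len(text.encode(encoding))


def chars_that_fit(text: str, encoding: str, byte_limit: int) -> int:
    """Count how many leading characters of `text` fit in `byte_limit` bytes."""
    used = 0
    count = 0
    for ch in text:
        size = len(ch.encode(encoding))
        if used + size > byte_limit:
            break
        used += size
        count += 1
    return count


# Sample data (assumed for illustration): 30 ASCII letters and 30 CJK characters.
latin = "a" * 30
cjk = "\u65e5" * 30  # U+65E5, three bytes in UTF-8, two bytes in UCS-2

# UCS-2 (via utf-16-le): always two bytes per character.
print(encoded_size(latin, "utf-16-le"))  # 60, double the 30 bytes of Latin-1
print(encoded_size(cjk, "utf-16-le"))    # 60

# UTF-8: one byte per ASCII character, three per CJK character.
print(encoded_size(latin, "utf-8"))      # 30
print(encoded_size(cjk, "utf-8"))        # 90, 50% more than 60

# A 30-byte UTF-8 column holds between 10 and 30 characters.
print(chars_that_fit(latin, "utf-8", 30))  # 30
print(chars_that_fit(cjk, "utf-8", 30))    # 10
```

Accented Latin-1 characters such as "é" take two bytes in UTF-8, which is why the increase for Latin-1 data is small rather than zero.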

Note that you may not know how the data is processed internally, and that a UCS-2 datatype does not guarantee true Unicode internals.

Unicode available now
