Unicode is usually used as a generic term referring to a two-byte character-encoding scheme.
Unicode通常用作涉及双字节字符编码方案的通用术语。
Communications are supported in any language, including double-byte character sets, such as Chinese.
通信支持任何语言,包括双字节字符集,如中国。
For single-byte character set (SBCS) testing: use German, because German text tends to be longer than the equivalent English.
对于单字节字符集(single - byte character set,SBCS)测试:使用德语,因为该语言比英语要长。
Support for a form of multibyte character set (MBCS) called double-byte character set (DBCS) on all platforms.
在所有平台上,支持称为双字节字符集(DBCS)的多字节字符集(MBCS)形式。
Also, CODEUNITS32 specifies that Unicode UTF-32 is used to understand the character boundaries of multi-byte characters.
同样,CODEUNITS32指定使用UnicodeUTF - 32来理解多字节字符的字符边界。
This leads to non-support for the double-byte character input/output used by the Japanese, Korean, and Chinese character sets.
这并不支持由日本,韩国,以及中国字符集使用的双字节字符的输入/输出。
However, these numeric values do not adhere to character semantics in the case of multi-byte character encodings, like UTF-8.
但是对于多字节字符编码(如utf - 8),这些数值并不符合字符语义。
Standard ASCII assigns each single-byte character used in English a numeric value in the range 0 to 127 (decimal).
标准ASCII 为英语语言中的一个单字节字符分配一个数值,比如十进制格式的0到 127。
Therefore, it is important to understand what constitutes a character when writing applications that involve multi-byte character data.
因此,理解字符组成对编写应用程序处理多字节字符数据非常重要。
But since you have a multi-byte character as the first character, you get the result as 3, which is the first occurrence of the search string.
但是由于第一个字符是多字节字符,因此得到结果3,它是搜索字符串的第一次出现的位置。
Figure 4 shows a search for "a": the actual character position of "a" is 2, but the output is 3 because there is a multi-byte character in the string.
图4展示了对“a”的搜索,“a”的实际字符位置是2,而输出的位置是3,原因在于字符串中有多字节字符。
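The figure itself is not reproduced here, so the following is only a minimal sketch of the same effect, assuming a hypothetical two-character string whose first character takes two bytes in UTF-8. Positions are 1-based to match the sentence above.

```python
# Hypothetical sample string: U+00E9 (two bytes in UTF-8) followed by "a".
text = "\u00e9a"
data = text.encode("utf-8")

# Character semantics: 1-based position counted in characters.
char_pos = text.index("a") + 1

# Byte semantics: 1-based position counted in bytes.
byte_pos = data.index(b"a") + 1

print(char_pos, byte_pos)
```

The two positions differ by exactly the number of extra bytes used by the characters that precede the match.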
Figure 2 depicts a scenario where data transformation occurs between disparate Single Byte Character Set (SBCS) code pages labeled by the encoding scheme.
图2描述了一个场景,数据转换在又编码计划标记的不同的SingleByte CharacterSet (SBCS)编码页之间发生。
As mentioned above, ratified standards provide for multi-byte character storage and portability; as yet, though, there are no standards for input or rendering.
正如上面介绍的一样,有一些广为认可的标准为多字节存储和可移植性提供了一些便利;然而,现在还没有为输入和显示制定标准。
Recognizing the character as a single unit, as opposed to a sequence of bytes, is a requirement in the case of string manipulations involving multi-byte characters.
将字符看作一个单元而不是一个字节序列,这是进行多字节字符的字符串操作的必要条件。
Figure 6 shows a search for the character after the third byte, which would have been the second occurrence of the character "a" if all were single-byte characters.
在图6中,搜索第三个字节后的字符,如果所有的字符都是单字节字符的话应该搜索到第二次出现的“a”字符。
In the case of a single-byte character encoding scheme, a single byte constitutes a character and the length of a single byte string is the same as the byte length of the string.
对于单字节字符编码模式,一个字节组成一个字符,单字节字符串的长度与字符串的字节长度相同。
First, it can encode a single byte character from any byte-oriented code set, and second, when used in an array, it can encode any multi-byte character from a multi-byte character set such as Unicode.
首先,它可以从面向字节的代码集编码单字节字符,其次,当在数组中使用时,它可以从多字节字符集(如unicode),编码任何多字节字符。
It first explained key concepts, such as character and byte semantics with respect to string data.
首先介绍了一些关键概念,如针对字符串数据的字符语义和字节语义。
No longer can the program assume that one byte is one character, so all data has to be decoded from UTF-8 and encoded back to UTF-8.
程序不再假设一个字节就是一个字符,因此所有的数据都需要从UTF-8 进行解码,然后再重新编码成 UTF-8。
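A minimal sketch of that decode, process, encode cycle, assuming a hypothetical input string and using string reversal as the stand-in operation:

```python
# Hypothetical input: "café" as UTF-8 bytes (the last character takes two bytes).
raw = "caf\u00e9".encode("utf-8")

# Decode to characters, operate on characters, then encode again.
text = raw.decode("utf-8")
reversed_text = text[::-1]
out = reversed_text.encode("utf-8")

# Reversing the raw bytes instead splits the two-byte character apart.
try:
    raw[::-1].decode("utf-8")
    byte_reverse_valid = True
except UnicodeDecodeError:
    byte_reverse_valid = False

print(reversed_text, byte_reverse_valid)
```

Working on the decoded string keeps each multi-byte character intact, while the byte-level reversal produces a sequence that is no longer valid UTF-8.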
This encoding scheme makes it possible to encode an ASCII character with one byte, and a non-ASCII character with multiple (up to 4) bytes.
这个编码方案可以用一个字节对ASCII字符进行编码,用多个字节(最多4 字节)对非 ASCII 字符进行编码。
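As an illustration, the snippet below measures the encoded length of four sample characters. The characters are arbitrary choices, one from each length class.

```python
# One sample character per UTF-8 length class (arbitrary choices).
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

# Number of bytes each character occupies once encoded as UTF-8.
lengths = [len(ch.encode("utf-8")) for ch in samples]

print(lengths)
```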
The counting of string length using a byte is referred to as byte semantics in this article, and the counting of string length using the number of characters is referred to as character semantics.
本文中将使用字节计算字符串长度的方法称作字节语义,而使用字符数计算字符串长度的方法称作字符语义。
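The two ways of counting can be compared directly. This is a small sketch with a hypothetical sample string containing one two-byte character:

```python
# Hypothetical sample: "naïve" has five characters, one of which needs two bytes.
text = "na\u00efve"

char_len = len(text)                  # character semantics
byte_len = len(text.encode("utf-8"))  # byte semantics

print(char_len, byte_len)
```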
Consider a character whose UTF-8 encoding is 3 bytes long, and a string that contains only the first two bytes of that encoding.
假设您拥有一个UTF - 8编码的字符,其长度为3字节,而字符串只拥有编码的前两个字节。
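A sketch of that situation, assuming the euro sign (U+20AC) as the three-byte character; the source does not say which character it has in mind.

```python
# U+20AC encodes to three bytes in UTF-8 (assumed example character).
full = "\u20ac".encode("utf-8")

# Keep only the first two bytes, as in the scenario described.
truncated = full[:2]

try:
    truncated.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    # The decoder reaches the end of the data before the character is complete.
    decoded_ok = False

print(len(full), decoded_ok)
```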
An individual character is usually encoded using a byte or more, depending upon the encoding used.
单个字符通常使用一个字节或多个字节进行编码,具体情况取决于使用的编码方式。
If you use a nullable VARCHAR column and only ASCII characters (1 byte per character in UTF-8) are involved, the maximum character length that can be indexed is 1021 characters.
如果使用可空的varchar列并只涉及ASCII字符(在utf - 8格式中每字符1字节),那么可以建立索引的最大字符长度是1021个字符。
Unfortunately, there is no EBCDIC encoder by default, so we'll convert the value to a UTF-16LE byte array (which just adds in a "0" byte for the second byte of each character).
遗憾的是,缺省情况下没有EBCDIC编码器,所以我们将把值转换为utf- 16le字符数组(这只是为每个字符的第二个字节添加一个“0”字节)。
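The zero-byte effect can be seen with a short sketch in Python. The original sentence comes from a different environment, so this only illustrates the encoding itself, using a hypothetical ASCII value.

```python
# Hypothetical ASCII-only value.
value = "ABC"

# UTF-16LE stores the low byte first, so each ASCII character
# is followed by a zero high byte.
encoded = value.encode("utf-16-le")

print(encoded)
```

The second byte is zero only for characters in the range U+0000 to U+00FF; other characters fill both bytes, and characters outside the Basic Multilingual Plane take four bytes.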
For documents that contain Unicode characters beyond the ASCII range, the parser must read and convert multiple byte sequences for each character.
对于包含ASCII以外的Unicode字符的文档,解析器必须为每个字符读取和转换多字节序列。
The locale setting will cause the %ls format specifier in printf to call the wcsrtombs function in order to convert the wide character argument string into the locale-dependent multi-byte encoding.
语言环境设置会导致printf中的%ls格式说明符调用wcsrtombs函数以便于将宽字符的参数字符串转换成依赖语言环境的多字节编码。
The characters numbered 0 to 0x7f (127) encode to themselves as a single byte, and larger character values are encoded into 2 to 6 bytes.
从0到0x7f(127)的字符把自身编码成单字节,而将值更大的字符编码成2到6个字节。
The 10xxxxxx byte is a continuation byte whose xxxxxx bit positions are filled with bits of the character's code number in binary representation.
字节10xxxxxx是一个扩展字节,它的xxxxxx位位置被以二进制表示的字符代码号的位所填充。
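To make the bit layout concrete, here is a hand-written encoder for code points below 0x10000. The function name `utf8_encode_bmp` is hypothetical and the sketch is for illustration only: it ignores the surrogate range, and real code should use the built-in `str.encode`.

```python
def utf8_encode_bmp(code_point):
    """Encode a code point below 0x10000 by hand (illustrative helper only)."""
    if code_point < 0x80:
        # 0xxxxxxx: the character encodes to itself as a single byte.
        return bytes([code_point])
    if code_point < 0x800:
        # 110xxxxx 10xxxxxx: lead byte plus one continuation byte.
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    # 1110xxxx 10xxxxxx 10xxxxxx: lead byte plus two continuation bytes.
    return bytes([0xE0 | (code_point >> 12),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# Each continuation byte carries six bits of the code number.
euro = utf8_encode_bmp(0x20AC)
print([format(b, "08b") for b in euro])
```

Every byte after the first starts with the bits `10`, which is how a decoder can tell a continuation byte from the start of a new character.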