使用规则字符串定义字符排序的顺序。
You use the rule string to define the order in which characters are sorted.
由于是按照字符排序的,因此您可能会对结果感到吃惊。
Since the sort order is by character, you might be surprised at the results.
例如,传统的逐字节比较方式的字符排序表在UTF-8 下也能工作。
For example, the traditional string collation using byte-wise comparison works with UTF-8.
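A minimal sketch of why byte-wise comparison carries over to UTF-8: sorting by encoded bytes gives the same order as sorting by code point. The sample strings are assumptions chosen for illustration, not taken from the source.

```python
# Sample data is hypothetical; it mixes ASCII, accented, CJK, and supplementary characters.
words = ["zebra", "Ärger", "apple", "日本語", "\U0001F600", "éclair"]

# Python compares str values code point by code point.
by_code_point = sorted(words)

# Compare the raw UTF-8 bytes instead, as a byte-wise collation would.
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))

same_order = by_code_point == by_utf8_bytes
print(same_order)
```

Note that this is code point order, not dictionary order: "zebra" still sorts ahead of "Ärger" and "éclair".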
字符排序一般要遵循字典顺序并且需要为每个参与排序的字符赋予特定的排序码。
Character sorting generally follows dictionary order and requires a specific sort code to be assigned to each character that takes part in the sort.
还有另一种不那么奇怪的情况,逐个字符排序可能更可取:文件名中带有浮点数。
There's another, less strange case in which character-by-character sorting may be preferable: if you have file names with floating-point numbers in them.
您将学习如何显示文本、执行排序、计算单词和行数、转换字符,以及其他任务。
You will learn how to display text, sort it, count words and lines, and translate characters, among other tasks.
排序序列将代码点映射至每个字符在已排序序列中的期望位置。
The collating sequence maps the code point to the desired position of each character in a sorted sequence.
无法通过一组规则实现这个目的,因为无法在希望按照特殊方式排序的字符中包含空格。
You can't do this with a set of rules because you can't include a space as part of the characters you want to sort in a special way.
排序序列是字符集的一种次序,确定每个字符与另一字符相比排列更高、更低或相同。
A collating sequence is an ordering for a set of characters that determines whether each character sorts higher, lower, or the same as another.
幸运的是,sort命令可以按照数字值或字符值进行排序。
Fortunately, the sort command can sort by numeric values or by character values.
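The difference between the two modes can be sketched in Python rather than with the sort command itself. The input values below are assumptions for illustration.

```python
# Hypothetical input: numbers stored as text, one per line.
lines = ["10", "9", "2", "100", "1"]

# Character order: strings are compared one character at a time.
by_character = sorted(lines)

# Numeric order: each line is converted to an integer before comparing.
by_number = sorted(lines, key=int)

print(by_character)
print(by_number)
```

In character order "10" and "100" land before "2", because "1" compares lower than "2" regardless of what follows it.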
另一个要关注的选项是-b,它告知sort忽略空白字符(空格、跳格等等)并将行中的第一个非空白字符当做是排序键的开始。
Another option to watch out for is -b, which tells sort to ignore blank characters (spaces, tabs, etc.) and treat the first non-blank character on the line as the start of the sort key.
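A rough Python model of the effect of ignoring leading blanks, assuming plain code point comparison (the real command's result also depends on the locale). The sample lines are hypothetical.

```python
# Hypothetical lines with varying leading whitespace.
lines = ["  pear", "apple", "\tbanana", " cherry"]

# Without stripping, tabs and spaces take part in the comparison
# and sort ahead of any letter.
with_blanks = sorted(lines)

# Skip leading spaces and tabs when building the key, so the first
# non-blank character starts the sort key. The lines themselves are unchanged.
ignoring_blanks = sorted(lines, key=lambda s: s.lstrip(" \t"))

print(with_blanks)
print(ignoring_blanks)
```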
不过,Unicode的确包括用于字符组合和字母排序的文字方向(text direction)和规则。
However, Unicode does include text direction, rules for character combination, and alphabetic ordering.
LC_CTYPE值定义了字符编码,而LC_COLLATE定义了排序顺序。
The LC_CTYPE value defines character encoding and LC_COLLATE defines the sorting order.
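As a sketch of how a program picks up the collation category, Python's standard locale module can set LC_COLLATE and build sort keys from it. The example uses the portable "C" locale, since other locale names are platform-specific; the word list is an assumption.

```python
import locale

# Select the "C" collation, which is always available and compares by byte value.
locale.setlocale(locale.LC_COLLATE, "C")

# Hypothetical ASCII-only sample.
words = ["banana", "Apple", "cherry", "apple"]

# strxfrm transforms each string into a key that sorts according to LC_COLLATE.
c_order = sorted(words, key=locale.strxfrm)

print(c_order)
```

With a language-specific locale installed and selected in place of "C", the same key function would typically interleave upper and lower case rather than put all capitals first.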
和西班牙语排序法一样,只要定义比较字符的规则即可。
As with the Spanish collation, you simply define rules that say how to compare characters.
因为默认的UCA不能同时覆盖unicode支持的每种语言的排序规则序列,所以可以使用可选属性定制字符的次序。
Since the default UCA cannot cover the collating sequence of every language supported by Unicode at the same time, the ordering of characters can be customized using optional attributes.
将根据locale的排序序列对匹配的字符执行join命令。
The join is performed on matching characters according to the locale's collating sequence.
我们的排序是按照字符序列进行的,因此uniq显示的是“10 apple”行,而不是“1 apple”。
Our sort was by collating sequence, so uniq gives us the "10 apple" line instead of the "1 apple".
在DB2 UDB V8.2之前,Unicode数据库只能定义为排序次序IDENTITY,这意味着按照字节编码对字符进行比较。
Prior to DB2 UDB V8.2, a Unicode database could only be defined with the collating sequence IDENTITY, which means that the characters are compared by their byte encoding.
带重音符号的字符通常都是以一种特殊的方式处理;有些语言会将两个字符的特定序列视为一个字母进行排序(比如,捷克语和传统西班牙语中的ch)。
Characters with accents are usually treated in a special way; some languages treat selected sequences of two characters as one letter for sorting (for example, ch in Czech and traditional Spanish).
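The two-character case can be sketched with a custom sort key that treats "ch" as a single letter placed directly after "c". The alphabet construction, the function name traditional_key, and the sample words are all assumptions made for illustration; real collation libraries handle many more rules.

```python
import string

# Build an alphabet in which "ch" is its own letter, right after "c".
ALPHABET = []
for letter in string.ascii_lowercase:
    ALPHABET.append(letter)
    if letter == "c":
        ALPHABET.append("ch")
RANK = {letter: index for index, letter in enumerate(ALPHABET)}


def traditional_key(word):
    """Split a word into letters, treating "ch" as one unit, and rank each."""
    word = word.lower()
    ranks = []
    i = 0
    while i < len(word):
        if word.startswith("ch", i):
            ranks.append(RANK["ch"])
            i += 2
        else:
            # Characters outside a-z sort after the alphabet in this sketch.
            ranks.append(RANK.get(word[i], len(ALPHABET)))
            i += 1
    return ranks


words = ["cuna", "chico", "dama", "cama", "coche"]
plain_order = sorted(words)
traditional_order = sorted(words, key=traditional_key)

print(plain_order)
print(traditional_order)
```

Under the "ch is one letter" rule, "chico" moves after every other word that starts with "c".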
接着它会根据与搜索字符相衔接的页面数量进行相关网页的排序,衔接数就相当于投票(那些最热门的网站都进行重要性投票,因为它们将更加可靠)。
It then ranks them by the number of pages that link to them, counting links as votes (the most popular sites get weighted votes, since they're more likely to be reliable).
法式属性在对字符串进行排序时会从字符串的末尾开始检查重音。
The French attribute sorts strings by examining the accents starting from the end of the string.
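One way to picture the backward accent check is a two-part key: base letters first, then the accents, read either left to right or right to left. This is a simplified sketch using Python's unicodedata module, not the actual collator; the function names and the french flag are hypothetical.

```python
import unicodedata


def split_accents(word):
    """Return the base letters and, per letter, a tuple of its combining marks."""
    base = []
    accents = []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and accents:
            # Attach the combining mark to the preceding base letter.
            accents[-1] = accents[-1] + (ord(ch),)
        else:
            base.append(ch)
            accents.append(())
    return "".join(base), accents


def accent_key(word, french=False):
    """Compare base letters first, then accents (from the end if french)."""
    base, accents = split_accents(word)
    if french:
        accents = accents[::-1]
    return (base, accents)


words = ["côté", "cote", "coté", "côte"]
forward_order = sorted(words, key=accent_key)
french_order = sorted(words, key=lambda w: accent_key(w, french=True))

print(forward_order)
print(french_order)
```

All four words share the base letters "cote", so only the accent comparison decides the order: scanning from the end lets the final "é" outweigh the earlier "ô".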
在获得排序的映射后,您可以只提取有限的术语集,最后将它们写出为单个字符串(通过粘合集成代码完成)。
Once you have the ordered map, you can simply extract the limited set of terms, eventually writing them out as a single string (done by the glue integration code).
第二个属性S2指定强度级别,这决定在字符串排序或比较时是否考虑大小写或重音符号。
The second attribute, S2, specifies the strength level, which determines whether case or accent is taken into account when ordering or comparing strings.
强度属性决定在对文本字符串进行排序或比较时,是否考虑重音或大小写。
The strength attribute determines whether accent or case is taken into account when collating or comparing text strings.
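The idea of strength levels can be approximated as follows. This sketch assumes three levels (1: base letters only, 2: accents also count, 3: case also counts), which mirrors the usual primary/secondary/tertiary naming; the function names and the numeric strength parameter are hypothetical, and a real collator has considerably more detail.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks from already-decomposed text."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


def equal_at_strength(a, b, strength):
    """Compare two strings, ignoring differences below the given strength."""
    a = unicodedata.normalize("NFD", a)
    b = unicodedata.normalize("NFD", b)
    if strength == 1:
        # Level 1: accents are ignored.
        a, b = strip_accents(a), strip_accents(b)
    if strength <= 2:
        # Levels 1 and 2: case is ignored.
        a, b = a.casefold(), b.casefold()
    return a == b


print(equal_at_strength("role", "Rôle", 1))
print(equal_at_strength("role", "Rôle", 2))
print(equal_at_strength("role", "Role", 2))
print(equal_at_strength("role", "Role", 3))
```

A looser strength makes more strings compare as equal, which is the trade-off the strength setting controls.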
在数据库中使用定制的排序规则可能影响查询性能,因为在选择更宽松的UCA设置时,匹配的字符串数量可能会增加。
Using a customized collation for the database may have an impact on the query performance since the number of possible string matches increases when you choose a looser UCA setting.
对字符串数据进行比较并按一定顺序排列的过程称为排序。
The process of comparing string data and placing it in order is known as collation.
显然使用逐字符比较的标准排序法肯定会说两者不同,因此需要使用自定义的排序法。
Obviously a character-by-character comparison using the standard collation says these are not equal, so you'll need to use a custom collation.
显然,对只包含字符串数据的记录进行排序非常简单。
Clearly, sorting records that consist solely of string data is a piece of cake.
可以使用字符串中每个字符的代码点为基础进行排序,如果样式表只需要对英文数据排序那么这种方法非常合适。
You can use the code points of the individual characters in a string as a basis for collation, and it might work reasonably well if your stylesheets only have to collate data in English.
这个排序器确保所有字符(补充字符和非补充字符)采用与UTF-8一样的二进制排序次序。
This collator ensures all characters, supplementary and non-supplementary, have the same binary collating sequence as UTF-8.
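To see why such a guarantee matters, compare raw byte order in UTF-8 and UTF-16 for one supplementary and one non-supplementary character. This is an illustrative sketch; the two code points are arbitrary choices, not taken from the source.

```python
bmp_char = "\uff5e"            # U+FF5E, a non-supplementary character
supplementary = "\U00010400"   # U+10400, a supplementary character

chars = [supplementary, bmp_char]

# UTF-8 byte order matches code point order: U+FF5E comes first.
utf8_order = sorted(chars, key=lambda s: s.encode("utf-8"))

# UTF-16 encodes U+10400 as a surrogate pair starting with 0xD801,
# which compares lower than 0xFF5E, so the order flips.
utf16_order = sorted(chars, key=lambda s: s.encode("utf-16-be"))

print(utf8_order == [bmp_char, supplementary])
print(utf16_order == [supplementary, bmp_char])
```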