排序字符串,一些项存在但长度为零,一些不见了
Sorting strings where some items are present but have zero length and some are absent
这个例外就是xsl:sort指令,它的lang属性告诉处理程序应该按照和指定语言的文化传统一致的方式排序字符串。
The exception is the xsl:sort instruction, whose lang attribute tells the processor to sort strings in a manner consistent with the cultural conventions of the specified language.
所幸的是,intl库提供了Collator类(以及名字以collator_开头的影子函数),可用来根据特定的locale对比和排序字符串。
Luckily, the intl library provides the Collator class (and shadow functions with names starting with collator_), which you can use to compare and sort strings according to a selected locale.
您将学习如何显示文本、执行排序、计算单词和行数、转换字符,以及其他任务。
You will learn how to display text, sort it, count words and lines, and translate characters, among other tasks.
由于是按照字符排序的,因此您可能会对结果感到吃惊。
Since the sort order is by character, you might be surprised at the results.
排序序列将代码点映射至每个字符在已排序序列中的期望位置。
The collating sequence maps the code point to the desired position of each character in a sorted sequence.
无法通过一组规则实现这个目的,因为无法在希望按照特殊方式排序的字符中包含空格。
You can't do this with a set of rules because you can't include a space as part of the characters you want to sort in a special way.
排序序列是字符集的一种次序,确定每个字符与另一字符相比排列更高、更低或相同。
A collating sequence is an ordering for a set of characters that determines whether each character sorts higher, lower, or the same as another.
幸运的是,sort命令可以按照数字值或字符值进行排序。
Fortunately, the sort command can sort by numeric values or by character values.
另一个要关注的选项是-b,它告知sort忽略空白字符(空格、跳格等等)并将行中的第一个非空白字符当做是排序键的开始。
Another option to watch out for is -b, which tells sort to ignore blank characters (spaces, tabs, etc.) and treat the first non-blank character on the line as the start of the sort key.
不过,Unicode的确包括用于字符组合和字母排序的文字方向(text direction)和规则。
However, Unicode does include text direction, rules for character combination, and alphabetic ordering.
值LC_CTYPE定义了字符编码,而LC_COLLATE定义了排序顺序。
The LC_CTYPE value defines character encoding and LC_COLLATE defines the sorting order.
和西班牙语排序法一样,只要定义比较字符的规则即可。
As with the Spanish collation, you simply define rules that say how to compare characters.
因为默认的UCA不能同时覆盖Unicode支持的每种语言的排序规则序列,所以可以使用可选属性定制字符的次序。
Since the default UCA cannot cover the collating sequence of every language supported by Unicode at the same time, the ordering of characters can be customized using optional attributes.
将根据locale的排序序列对匹配的字符执行join命令。
The join is performed on matching characters according to the locale's collating sequence.
我们的排序是按照字符序列进行的,因此uniq显示的是“10 apple”行,而不是“1 apple”行。
Our sort was by collating sequence, so uniq gives us the "10 apple" line instead of the "1 apple" line.
在DB2 UDB V8.2之前,Unicode数据库只能定义为排序次序IDENTITY,这意味着按照字节编码对字符进行比较。
Prior to DB2 UDB V8.2, a Unicode database could only be defined with the collating sequence IDENTITY, which means that the characters are compared by their byte encoding.
带重音符号的字符通常都是以一种特殊的方式处理;有些语言会将两个字符的特定序列视为一个字母进行排序(比如,捷克语和传统西班牙语中的ch)。
Characters with accents are usually treated in a special way; some languages treat selected sequences of two characters as one letter for sorting (for example, ch in Czech and traditional Spanish).
接着它会根据与搜索字符相衔接的页面数量进行相关网页的排序,衔接数就相当于投票(那些最热门的网站都进行重要性投票,因为它们将更加可靠)。
It then ranks them by the number of pages that link to them, counting links as votes (the most popular sites get weighted votes, since they're more likely to be reliable).
法式属性在对字符串进行排序时会从字符串的末尾开始检查重音。
The French attribute sorts strings by examining the accents starting from the end of the string.
在获得排序的映射后,您可以只提取有限的术语集,最后将它们写出为单个字符串(通过粘合集成代码完成)。
Once you have the ordered map, you can simply extract the limited set of terms, eventually writing them out as a single string (done by the glue integration code).
第二个属性s2指定强度级别,这决定在字符串排序或比较时是否考虑大小写或重音符号。
The second attribute, S2, specifies the strength level, which determines whether case or accent is taken into account when ordering or comparing strings.
强度属性决定在对文本字符串进行排序或比较时,是否考虑重音或大小写。
The strength attribute determines whether accent or case is taken into account when collating or comparing text strings.
在数据库中使用定制的排序规则可能影响查询性能,因为在选择更宽松的UCA设置时,匹配的字符串数量可能会增加。
Using a customized collation for the database may have an impact on the query performance since the number of possible string matches increases when you choose a looser UCA setting.
对字符串数据进行比较并按一定顺序排列的过程称为排序。
The process of comparing string data and placing it in order is known as collation.
显然使用逐字符比较的标准排序法肯定会说两者不同,因此需要使用自定义的排序法。
Obviously a character-by-character comparison using the standard collation says these are not equal, so you'll need to use a custom collation.
显然,对只包含字符串数据的记录进行排序非常简单。
Clearly, sorting records that consist solely of string data is a piece of cake.
可以使用字符串中每个字符的代码点为基础进行排序,如果样式表只需要对英文数据排序那么这种方法非常合适。
You can use the code points of the individual characters in a string as a basis for collation, and it might work reasonably well if your stylesheets only have to collate data in English.
这个排序器确保所有字符(补充字符和非补充字符)采用与UTF-8一样的二进制排序次序。
This collator ensures all characters, supplementary and non-supplementary, have the same binary collating sequence as UTF-8.
另一方面,EBCDIC中的排序序列则是:空格、小写字符、大写字符和数字值。
On the other hand, the collating sequence in EBCDIC is: space, lower case characters, upper case characters, and numeric values.