Last edited by chenzhanyiczy on 2013-10-31 15:59
man regex
Within a bracket expression, a collating element (a character, a multi-character sequence that collates as if it were a single character, or a collating-sequence name
for either) enclosed in "[." and ".]" stands for the sequence of characters of that collating element. The sequence is a single element of the bracket expression’s
list. A bracket expression containing a multi-character collating element can thus match more than one character, for example, if the collating sequence includes a "ch"
collating element, then the RE "[[.ch.]]*c" matches the first five characters of "chchcc".
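To make the "[[.ch.]]*c" example concrete, here is a minimal Python sketch. It is only an emulation: Python's re module has no "[." ".]" syntax, so the "ch" collating element is assumed to behave like the group (?:ch), and the helper name match_prefix is made up for illustration.

```python
import re

# Assumption: Python's re does not support [. .], so the multi-character
# collating element "ch" is emulated with the non-capturing group (?:ch).
# This mirrors the man page's RE "[[.ch.]]*c" only under that assumption.
EMULATED = re.compile(r"(?:ch)*c")


def match_prefix(text):
    """Return the leading part of text matched by the emulated RE, or None."""
    m = EMULATED.match(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    # "ch" is consumed twice as one unit, then a single "c" follows.
    print(match_prefix("chchcc"))  # chchc (the first five characters)
```

The point is that the bracket expression consumes "ch" as one unit per repetition, which is why the match covers five characters rather than stopping after the first "c".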
Within a bracket expression, a collating element enclosed in "[=" and "=]" is an equivalence class, standing for the sequences of characters of all collating elements
equivalent to that one, including itself. (If there are no other equivalent collating elements, the treatment is as if the enclosing delimiters were "[." and ".]".)
For example, if o and ô are the members of an equivalence class, then "[[=o=]]", "[[=ô=]]", and "[oô]" are all synonymous. An equivalence class may not(!) be an
endpoint of a range.
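The equivalence-class idea can be sketched roughly in Python as well. This is an approximation under stated assumptions: real equivalence classes come from the locale's collation definition, whereas this sketch treats two characters as equivalent when they share the same base letter after Unicode decomposition. It also assumes the second member of the man page's example class is an accented o such as ô. The function names are hypothetical.

```python
import unicodedata


def base_letter(ch):
    """Return the first code point of the NFD form, e.g. 'ô' -> 'o'.

    Assumption: this stands in for the locale's collation table, which is
    what a real regex engine would consult for [[=o=]].
    """
    return unicodedata.normalize("NFD", ch)[0]


def in_equivalence_class(ch, representative):
    """True if ch has the same base letter as representative."""
    return base_letter(ch) == base_letter(representative)


def find_equivalents(text, representative):
    """Collect the characters of text that an emulated [[=x=]] would match."""
    return [ch for ch in text if in_equivalence_class(ch, representative)]


if __name__ == "__main__":
    # "\u00f4" is o with circumflex; both it and plain "o" are collected.
    print(find_equivalents("c\u00f4te, cote", "o"))
```

Under this emulation, asking for the class of "o" or the class of "ô" selects the same characters, which is the sense in which the man page calls "[[=o=]]" and "[oô]" synonymous.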
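"Collating element" is rendered as "归并元素" in the Chinese translation of the man page.
I have never figured out how this [. .] construct is supposed to be used.
I tried egrep, grep, sed, and awk, and none of them seem to support it.
Take egrep as an example: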
regex.log:
0000ae0000ll000
0000ax0000ll000
egrep "[[.ae.]]" regex.log
egrep: Invalid collation character
egrep "[[.a.]]" regex.log
0000ae0000ll000
0000ax0000ll000
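If the multi-character form (that is, [.several characters.]) is not supported, then what does the bolded part of the description mean? It would seem to be pointless.
PS: the paragraph after it, on "equivalence classes", is even harder to understand and more obscure.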