Are escapes such as \w and \s also supported in sed?

#1 | Posted 2011-05-24 11:02
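I just read another thread, and it left me with two questions.

1. Doesn't sed support only BRE? Escapes such as \s and \w should belong to PCRE, so why do the following two commands work?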

#echo "12 34" | sed 's/\s/X/'
12X34

[root@dns-pub /etc/namedb]
#echo "12 34" | sed 's/\w/X/'
X2 34

I checked http://www.faqs.org/faqs/editor-faq/sed/ and only section 6.8.3, subsection D (sedmod v1.0), says that sed supports \s. The same subsection says that \d is supported too, but in my test \d was not:
[root@dns-pub /etc/namedb]
#echo "12 34" | sed 's/\d/X/'
12 34

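Which backslash-prefixed sequences does sed support? Is there authoritative documentation I can check?

2. [\w] should mean "\" or "w", and [\s] should mean "\" or "s". The test results agree: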
[root@dns-pub /etc/namedb]
#echo "s12 34w" | sed 's/[\w]/X/g'
s12 34X

[root@dns-pub /etc/namedb]
#echo "s12 34w" | sed 's/[\s]/X/g'
X12 34w

But why does [\t] still mean TAB? The test results are as follows:
[root@dns-pub /etc/namedb]
#echo "12\t34" | sed 's/[\t]/X/'
12\t34

[root@dns-pub /etc/namedb]
#echo -e "12\t34" | sed 's/[\t]/X/'
12X34

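Does this mean that inside [] it is safest to put \ last, so that it does not combine with the following character by accident?

My environment
OS: RHEL 5.3
sed version 4.1.5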

#2 | Posted 2011-05-24 11:04
Supported.

#3 | Posted 2011-05-24 13:14
http://www.gnu.org/software/sed/manual/sed.html

Read the documentation from the people who wrote sed. It is very detailed.

It answers all of your questions.

#4 | Posted 2011-05-24 13:18
Reply to #3 ziyunfei

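#5 | Posted 2011-05-24 15:25
Reply to #3 ziyunfei

Thank you very much for your reply. I had been reading the wrong documentation, which is a bit embarrassing.

Here is the original text from the documentation:

Question 1: the full list of special characters that start with a backslash: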

\a
    Produces or matches a bel character, that is an “alert” (ascii 7).
\f
    Produces or matches a form feed (ascii 12).
\n
    Produces or matches a newline (ascii 10).
\r
    Produces or matches a carriage return (ascii 13).
\t
    Produces or matches a horizontal tab (ascii 9).
\v
    Produces or matches a so called “vertical tab” (ascii 11).
\cx
    Produces or matches Control-x, where x is any character. The precise effect of ‘\cx’ is as follows: if x is a lower case letter, it is converted to upper case. Then bit 6 of the character (hex 40) is inverted. Thus ‘\cz’ becomes hex 1A, but ‘\c{’ becomes hex 3B, while ‘\c;’ becomes hex 7B.
\dxxx
    Produces or matches a character whose decimal ascii value is xxx.
\oxxx
    Produces or matches a character whose octal ascii value is xxx.
\xxx
    Produces or matches a character whose hexadecimal ascii value is xx.

‘\b’ (backspace) was omitted because of the conflict with the existing “word boundary” meaning.
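
A minimal sketch of the numeric escapes listed above, assuming GNU sed with POSIXLY_CORRECT unset (the variable names are only for illustration). It may also explain the \d test in the first post: per the list, \d introduces a decimal character code rather than a digit class, so a POSIX bracket class such as [[:digit:]] is one way to match digits instead.

```shell
# Assumes GNU sed; variable names are illustrative only.
# 'A' is decimal 65, octal 101, hexadecimal 41.
by_dec=$(echo "A B" | sed 's/\d065/X/')
by_oct=$(echo "A B" | sed 's/\o101/X/')
by_hex=$(echo "A B" | sed 's/\x41/X/')
echo "$by_dec"   # X B
echo "$by_oct"   # X B
echo "$by_hex"   # X B

# A POSIX character class matches a digit without relying on \d.
first_digit=$(echo "12 34" | sed 's/[[:digit:]]/X/')
echo "$first_digit"   # X2 34
```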

Other escapes match a particular character class and are valid only in regular expressions:

\w
    Matches any “word” character. A “word” character is any letter or digit or the underscore character.
\W
    Matches any “non-word” character.
\b
    Matches a word boundary; that is it matches if the character to the left is a “word” character and the character to the right is a “non-word” character, or vice-versa.
\B
    Matches everywhere but on a word boundary; that is it matches if the character to the left and the character to the right are either both “word” characters or both “non-word” characters.
\`
    Matches only at the start of pattern space. This is different from ^ in multi-line mode.
\'
    Matches only at the end of pattern space. This is different from $ in multi-line mode.


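Question 2: put simply, when POSIXLY_CORRECT is not set, special escapes such as \n and \t are still interpreted inside [] (as newline and TAB), while other backslash-prefixed sequences such as \w and \s lose their special meaning there.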
The characters $, *, ., [, and \ are normally not special within list. For example, [\*] matches either ‘\’ or ‘*’, because the \ is not special here. However, strings like [.ch.], [=a=], and [:space:] are special within list and represent collating symbols, equivalence classes, and character classes, respectively, and [ is therefore special within list when it is followed by ., =, or :. Also, when not in POSIXLY_CORRECT mode, special escapes like \n and \t are recognized within list.
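
A small sketch of the bracket behaviour described in that paragraph, assuming GNU sed with POSIXLY_CORRECT unset. The variable names are only for illustration, and printf is used instead of echo because shells differ in how echo treats backslashes.

```shell
# Assumes GNU sed with POSIXLY_CORRECT unset; variable names are illustrative only.

# Inside [], \w is not a class: it matches a literal backslash or a literal w.
literal=$(printf '%s\n' 'a\wb' | sed 's/[\w]/X/g')
echo "$literal"    # aXXb

# Inside [], \t is still read as a TAB character.
real_tab=$(printf '12\t34\n' | sed 's/[\t]/X/')
echo "$real_tab"   # 12X34

# The input here holds a literal backslash and t, not a TAB, so nothing matches.
no_tab=$(printf '%s\n' '12\t34' | sed 's/[\t]/X/')
echo "$no_tab"     # 12\t34

# [[:space:]] is a POSIX class that covers both the space and the TAB.
blanks=$(printf '12 \t34\n' | sed 's/[[:space:]]/X/g')
echo "$blanks"     # 12XX34
```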

#6 | Posted 2016-03-15 10:40
I had the same question. The original poster's write-up was a big help.