Thread starter: r2007

How to find words with doubled letters, such as look and hello

#11 | Posted 2004-03-19 23:26

$ cat tmp
hello
look
sed
All
awk
who
gooogle
aoooooooooooooo0d
result:
hello
look
All
gooogle
aoooooooooooooo0d
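P.S. I am not very good at writing Perl scripts:

  #!/usr/bin/perl
  # Print each line of tmp that contains two identical adjacent characters.
  open(my $fh, '<', 'tmp') or die "cannot open tmp: $!";
  while (my $line = <$fh>) {
      # (.)\1 matches any character immediately followed by itself
      print $line if $line =~ /(.)\1/;
  }
  close($fh);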
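
The same filter can also be sketched without Perl. The example below is only a sketch, assuming a POSIX grep with basic-regular-expression back-references. The function name `doubled` is made up for illustration, and the sample words are fed in through a here-document instead of the tmp file.

```shell
#!/bin/sh
# doubled: hypothetical helper, not from the thread.
# Prints every input line that contains a character immediately followed by itself.
# \(.\) captures one character; \1 must then match that same character again.
doubled() {
    grep '\(.\)\1'
}

# The sample words from the post above.
doubled <<'EOF'
hello
look
sed
All
awk
who
gooogle
aoooooooooooooo0d
EOF
```

On the sample data this should print the same five words that are listed under result: in the post.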

#12 | Posted 2004-03-20 10:33

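The moderator's approach is impressive, and bjgirl's method with uniq -c and && is also very neat. I have to admit that my own idea was really copied from bjgirl. The -F"" option in the moderator's second method does not work on my SCO + ksh system. If it did, I would not have needed to put such a complicated sed command in front, haha.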

#13 | Posted 2004-03-20 10:44 | Honorary moderator

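[quote]Originally posted by "forest077":
The moderator's approach is impressive, and bjgirl's method with uniq -c and && is also very neat. I have to admit that my own idea was really copied from bjgirl. The -F"" option in the moderator's second method does not work on my SCO + ksh system. If it did, I would not have needed to put such a complicated sed command in front, haha.[/quote]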

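For -F "", please check the man page: because the separator is being set to null, there has to be a space between -F and "" (the shell reduces -F"" to a bare -F, so the empty string must be passed as a separate argument).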

#14 | Posted 2004-03-20 10:57

The man page on my SCO system says nothing about a null field separator. I tried -F"", -F" " and -F "", and none of them had the desired effect.

#15 | Posted 2004-03-20 11:11 | Honorary moderator

What about setting it directly inside awk?
  BEGIN{FS=""}
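
A minimal sketch of how an empty FS could be used for this problem is shown below. It assumes an awk that splits a line into one-character fields when FS is the empty string (gawk and mawk do this, POSIX leaves the behaviour unspecified, and the SCO awk mentioned above apparently does not). The function name `has_double_fs` is hypothetical.

```shell
#!/bin/sh
# has_double_fs: hypothetical helper, not from the thread.
# Relies on an awk that turns every character into its own field when FS is
# the empty string; this is an extension, not guaranteed by POSIX.
has_double_fs() {
    awk 'BEGIN { FS = "" }
    {
        # Compare each character (field) with its right-hand neighbour.
        for (i = 1; i < NF; i++)
            if ($i == $(i + 1)) { print; break }
    }'
}

printf '%s\n' hello sed All who | has_double_fs
```

With an awk that supports the extension, only the lines containing a doubled character are printed. On an awk without it, each line is a single field and nothing is printed.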

#16 | Posted 2004-03-20 13:11

Originally posted by "r2007":

yes
Moderator 007, please find more exercises like this one so that we can practise.

#17 | Posted 2004-03-21 10:47

A more traditional programming approach:
$ awk '{ for (i = 1; i < length($1); i++)   # compare each character with the next one
           if (substr($1, i, 1) == substr($1, i + 1, 1)) { print $1; break } }' file

#18 | Posted 2004-03-22 02:10

http://www.chinaunix.net/jh/24/149723.html
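I found that this article already explains it:
  "For example, '(.)\1' matches two consecutive identical characters."
  "\n   Identifies an octal escape value or a back-reference. If \n is preceded by at least n captured subexpressions, n is a back-reference. Otherwise, if n is an octal digit (0-7), n is an octal escape value."

I just wish I had studied it properly.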

#19 | Posted 2004-03-22 13:12

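Originally posted by "skylove":
Embarrassing...
var=$(echo $i|sed 's/./&\
/g'|uniq -c)

I can't understand this line. s means substitute, . means any character, and g means global, but what does & mean? And what does uniq do? I have never used it before.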


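I checked the man page.
& stands for the character matched by the preceding . pattern.
a \
text   Append text, which has each embedded newline preceded by a back-slash.
i \
text  Insert text, which has each embedded newline preceded by a back-slash.
The replacement part of s follows the same convention: a backslash followed by a line break inserts a literal newline, so &\ means "the matched character followed by a newline".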

uniq -c
       -c, --count
              prefix lines by the number of occurrences
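
To make the sed and uniq -c combination concrete, here is a minimal sketch built around the quoted pipeline. It is not the original poster's full script: the function name `has_double_uniq` and the final awk check are assumptions added for illustration.

```shell
#!/bin/sh
# has_double_uniq: hypothetical helper, not from the thread.
# Succeeds (exit status 0) when the word contains two identical adjacent characters.
has_double_uniq() {
    # sed puts every character on its own line: "&" is the matched character,
    # and the backslash-newline in the replacement adds a line break after it.
    # uniq -c collapses runs of identical adjacent lines and prefixes each one
    # with its run length; awk then looks for a run of two or more.
    printf '%s\n' "$1" | sed 's/./&\
/g' | uniq -c | awk '$1 >= 2 { found = 1 } END { exit !found }'
}

for w in hello look sed All awk who; do
    if has_double_uniq "$w"; then
        echo "$w"
    fi
done
```

For hello, the sed stage emits h, e, l, l, o on separate lines, and uniq -c reports a count of 2 for l, which is what the check keys on.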

#20 | Posted 2004-03-22 13:32

The Back-reference Operator ("\"DIGIT)
======================================

  If the syntax bit `RE_NO_BK_REF' isn't set, then Regex recognizes
back references.  A back reference matches a specified preceding group.
The back reference operator is represented by `\DIGIT' anywhere after
the end of a regular expression's DIGIT-th group (see Grouping
Operators).

  DIGIT must be between `1' and `9'.  The matcher assigns numbers 1
through 9 to the first nine groups it encounters.  By using one of `\1'
through `\9' after the corresponding group's close-group operator, you
can match a substring identical to the one that the group does.

  Back references match according to the following (in all examples
below, `(' represents the open-group, `)' the close-group, `{' the
open-interval and `}' the close-interval operator):

   * If the group matches a substring, the back reference matches an
     identical substring.  For example, `(a)\1' matches `aa' and
     `(bana)na\1bo\1' matches `bananabanabobana'.  Likewise, `(.*)\1'
     matches any (newline-free if the syntax bit `RE_DOT_NEWLINE' isn't
     set) string that is composed of two identical halves; the `(.*)'
     matches the first half and the `\1' matches the second half.

   * If the group matches more than once (as it might if followed by,
     e.g., a repetition operator), then the back reference matches the
     substring the group _last_ matched.  For example, `((a*)b)*\1\2'
     matches `aabababa'; first group 1 (the outer one) matches `aab'
     and group 2 (the inner one) matches `aa'.  Then group 1 matches
     `ab' and group 2 matches `a'.  So, `\1' matches `ab' and `\2'
     matches `a'.

   * If the group doesn't participate in a match, i.e., it is part of an
     alternative not taken or a repetition operator allows zero
     repetitions of it, then the back reference makes the whole match
     fail.  For example, `(one()|two())-and-(three\2|four\3)' matches
     `one-and-three' and `two-and-four', but not `one-and-four' or
     `two-and-three'.  For example, if the pattern matches `one-and-',
     then its group 2 matches the empty string and its group 3 doesn't
     participate in the match.  So, if it then matches `four', then
     when it tries to back reference group 3--which it will attempt to
     do because `\3' follows the `four'--the match will fail because
     group 3 didn't participate in the match.


  You can use a back reference as an argument to a repetition operator.
For example, `(a(b))\2*' matches `a' followed by one or more `b's.
Similarly, `(a(b))\2{3}' matches `abbbb'.

  If there is no preceding DIGIT-th subexpression, the regular
expression is invalid.
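
A few of the simpler claims above can be checked from the shell. This is only a sketch, assuming a POSIX grep: the helper name `matches_whole` is made up, and the patterns are rewritten in basic-regular-expression syntax, where groups are written \( \) instead of ( ).

```shell
#!/bin/sh
# matches_whole: hypothetical helper, not part of the quoted manual.
# Succeeds when the basic regular expression $1 matches all of the string $2.
# -x anchors the match to the whole line, -q suppresses output.
matches_whole() {
    printf '%s\n' "$2" | grep -qx -e "$1"
}

# (a)\1 matches aa
matches_whole '\(a\)\1' 'aa' && echo 'aa: match'
# (bana)na\1bo\1 matches bananabanabobana
matches_whole '\(bana\)na\1bo\1' 'bananabanabobana' && echo 'bananabanabobana: match'
# (.*)\1 matches a string made of two identical halves, and nothing else
matches_whole '\(.*\)\1' 'abcabc' && echo 'abcabc: match'
matches_whole '\(.*\)\1' 'abcab' || echo 'abcab: no match'
```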