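I have a question about matching domain names. I want to write an SEO tool that checks where a keyword ranks in Google. Below is a Google results page, which shows 10 results per page. I want to extract the domains from it and check whether the domain I am looking for is among them; if it is not, I fetch the next page and match again.
Before I can match the domain, I first have to extract the URLs from spans like the one below, and I don't know how to write a shell script to do that. Could someone give me a pointer? The fetched page is essentially one long line, so I can only pick out specific substrings.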
<span class=a>www.kaili.gov.cn/office/zwnews/ 200704/office_20070413163747_528.shtml - 20k - </span>
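I need to extract the addresses of the kind shown above from the page below. There are 10 of them. Please help me work out how to match them with shell.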
<html><head><meta http-equiv=content-type content="text/html; charset=GB2312"><title>扶凯 - Google 搜索</title>
</script></head><body bgcolor=#ffffff topmargin=3 marginheight=3><div id=gbar><nobr><span class=gb1><b>网页</b></span> <span class=gb1><a href="http://images.google.cn/images?q=%E6%89%B6%E5%87%AF&um=1&ie=UTF-8&sa=N&tab=wi" onclick=gbar.qs(this)>图片</a></span> <span class=gb1><a href="http://ditu.google.cn/maps?q=%E6%89%B6%E5%87%AF&um=1&ie=UTF-8&sa=N&tab=wl" onclick=gbar.qs(this)>地图</a></span> <span class=gb1><a href="http://news.google.cn/news?q=%E6%89%B6%E5%87%AF&um=1&ie=UTF-8&sa=N&tab=wn" onclick=gbar.qs(this)>资讯</a></span> <span class=gb1><a href="http://video.google.cn/videosearch?q=%E6%89%B6%E5%87%AF&um=1&ie=UTF-8&sa=N&tab=wv" onclick=gbar.qs(this)>视频</a></span> <span class=gb1><a href="http://blogsearch.google.cn/blogsearch?q=%E6%89%B6%E5%87%AF&um=1&ie=UTF-8&sa=N&tab=wb" onclick=gbar.qs(this)>博客</a></span> <span class=gb3><a href="http://www.google.cn/intl/zh-CN/options/" onclick="this.blur();gbar.tg(event);return !1"><u>更多</u> <small>▼</small></a></span> <span class=gb2><a href="http://shenghuo.google.cn/shenghuo/search?q=%E6%89%B6%E5%87%AF&um=1&ie=UTF-8&sa=N&tab=w8" onclick=gbar.qs(this)>生活</a></span> <span class=gb2><a href="http://www.google.cn/rebang/search?q=%E6%89%B6%E5%87%AF&um=1&ie=UTF-8&sa=N&tab=w9" onclick=gbar.qs(this)>热榜</a></span> <span class=gb2><a href="http://daohang.google.cn/?q=%E6%89%B6%E5%87%AF&um=1&ie=UTF-8&sa=N&tab=wA" onclick=gbar.qs(this)>网站导航</a></span> <span class=gb2><div></div></a></span> <span class=gb2><a href="http://www.google.com/calendar/render?um=1&ie=UTF-8&sa=N&tab=wc">日历</a></span> <span class=gb2><a href="http://picasaweb.google.com/lh/searchbrowse?q=%E6%89%B6%E5%87%AF&um=1&ie=UTF-8&sa=N&tab=wq" onclick=gbar.qs(this)>照片</a></span> <span class=gb2><a href="http://docs.google.com/?um=1&ie=UTF-8&sa=N&tab=wo">文档</a></span> <span class=gb2><div></div></a></span> <span class=gb2><a href="http://www.google.cn/intl/zh-CN/options/">更多 »</a></span> </nobr></div><div class=gbh style=left:0></div><div class=gbh style=right:0></div><div 
align=right id=guser style="font-size:84%;padding:0 0 4px" width=100%><nobr><a href="https://www.google.com/accounts/Login?continue=http://www.google.cn/search%3Fq%3D%25E6%2589%25B6%25E5%2587%25AF&hl=zh-CN">登录</a></nobr></div><table class=tb style=clear:left width=100%><tr><form name=gs method=GET action="/search"><td class=tc valign=top><a href="http://www.google.cn/webhp?hl=zh-CN&complete=1" title="Google 首页"><img src="/images/logo_sm.gif" width=150 height=55 alt=Google border=0 vspace=12></a></td><td style="padding:0 0 7px;padding-left:8px" valign=top width=100%><table class=tb style="margin-top:25px"><tr><td class=tc nowrap><input type=hidden name=complete value=1><input type=hidden name=hl value="zh-CN"><input type=hidden name=newwindow value=1><input type=hidden name=ie value="GB2312"><input autocomplete="off" type=text name=q size=41 maxlength=2048 value="扶凯" title="Google 搜索"> <input type=submit name="btnG" value="Google 搜索"></td><td class=tc nowrap width=100%><span id=ap> <a href="/advanced_search?q=%E6%89%B6%E5%87%AF&complete=1&hl=zh-CN&newwindow=1&ie=UTF-8">高级搜索</a> | <a href="/preferences?q=%E6%89%B6%E5%87%AF&complete=1&hl=zh-CN&newwindow=1&ie=UTF-8">使用偏好</a></span></td></tr><tr><td class=tc colspan=2><font size=-1> <input id=all type=radio name=meta value="" checked><label for=all>所有网页 </label><input id=ch type=radio name=meta value="lr=lang_zh-CN|lang_zh-TW"><label for=ch>中文网页 </label><input id=lgr type=radio name=meta value="lr=lang_zh-CN"><label for=lgr>简体中文网页 </label><input id=cty type=radio name=meta value="cr=countryCN"><label for=cty>中国的网页 </label></font></td></tr></table></td></tr></form></table><script src="/extern_js/f/CgV6aC1DThICY24gDiswCjgDLA/sU0He6l7Lhs.js"></script><table border=0 cellpadding=0 cellspacing=0 width=100%><tr><td bgcolor="#3366cc"><img width=1 height=1 alt=""></table><table border=0 cellpadding=0 cellspacing=0 width=100% bgcolor="#d5ddf3"><tr><td nowrap><font size=+1> <b>网页 </b></font></td><td align=right nowrap><font 
size=-1>约有<b>6,840</b>项符合<b>扶凯</b>的查询结果,以下是第<b>1</b>-<b>10</b>项 (搜索用时 <b>0.23</b> 秒) </font></table><div id=res><div><div class=g> <a href="http://www.php-oa.com/" target=_blank class=l><font color="#cc0033">扶凯</font></a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j"><div class=std>Centos,Ubuntu,Rhel,RHCE,<font color="#cc0033">扶凯</font>. <b>...</b> 消灭0回复Linux相关文章Jun 19th,2008<font color="#cc0033">扶凯</font>. Expires、Cache-Control、Last-Modified、ETag是RFC 2616(HTTP/1.1)协议中和网页缓存 <b>...</b><br><span class=a>www.php-oa.com/ - 33k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:rTKsFs8VC88J:
www.php-oa.com/+%E6%89%B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=1&gl=cn&ie=UTF-8&st_usg=ALhdy285m5BrthVCwd2C4HX7I5DlgujHww" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:
www.php-oa.com/">类似网页</a></nobr></div> </td></tr></table></div><div class=g style="margin-left:2.5em"> <a href="http://www.php-oa.com/page/5/" target=_blank class=l><font color="#cc0033">扶凯</font>- Part 5</a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j hc"><div class=std><font color="#cc0033">扶凯</font>: 有空看... jsw7001: 这方法不好,俺这有方法下载http:... 王园园: 无需重启. 好... tooo: 能不能把这个MBR重安装方式写出来... cncoolker: 这是谬论. <b>...</b><br><span class=a>www.php-oa.com/page/5/ - 33k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:czK9QN0ri3gJ:
www.php-oa.com/page/5/+%E6%89%B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=2&gl=cn&ie=UTF-8&st_usg=ALhdy2-FOsMlG0qCLCRu6ny-5_6BgqJwkA" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:
www.php-oa.com/page/5/">类似网页</a></nobr><br><a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=+site:
www.php-oa.com+%E6%89%B6%E5%87%AF">www.php-oa.com站内的其它相关信息 »</a></div> </td></tr></table></div><div class=g> <a href="http://www.zhuaxia.com/pre_channel/4914767/?logId=190" target=_blank class=l><font color="#cc0033">扶凯</font>- 频道预览- 抓虾</a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j"><div class=std>师夷之长技以治夷<font color="#cc0033">扶凯</font>- 频道预览- 抓虾抓虾-全球最大的中文订阅服务商.<br><span class=a>www.zhuaxia.com/pre_channel/4914767/?logId=190 - 186k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:klqHtFS8KP8J:
www.zhuaxia.com/pre_channel/4914 ... B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=3&gl=cn&ie=UTF-8&st_usg=ALhdy2_rwvzUIzaFsMEUddgxc9epZLxh4g" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:
www.zhuaxia.com/pre_channel/4914767/?logId=190">类似网页</a></nobr></div> </td></tr></table></div><div class=g> <a href="http://tieba.baidu.com/f?kz=340984565" target=_blank class=l>百度_长寿中学吧_卓芸宇和我<font color="#cc0033">扶凯</font>是帅哥</a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j"><div class=std><font color="#cc0033">扶凯</font>真的是帅哥,好帅呀。 跟我耍嘛ァ <b>...</b> 星期5戴起帽子那个是<font color="#cc0033">扶凯</font> 好帅 无法想象 <b>...</b> <font color="#cc0033">扶凯</font>我的好朋友卅 确实狠帅 佩服~!! 星期五我们一路的 <b>...</b><br><span class=a>tieba.baidu.com/f?kz=340984565 - 83k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:SdYaHSNwLUYJ:tieba.baidu.com/f%3Fkz%3D340984565+%E6%89%B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=4&gl=cn&ie=UTF-8&st_usg=ALhdy2-vOI0eQh_vdPbc0XYB-lVoUMGkFA" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:tieba.baidu.com/f?kz=340984565">类似网页</a></nobr></div> </td></tr></table></div><div class=g> <a href="http://www.mangbar.com/document/8a80809d1a6cba2e011a75957817223f" target=_blank class=l>学习CDN不得不读之-Squid优化补遗|<font color="#cc0033">扶凯</font>- MangBar</a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j"><div class=std>在线工作室平台(Online Collaborative Workshop). 集个人/团队工具和协作社区于一身. 适合于思路整理, 任务管理, 项目协作, 团队协作.<br><span class=a>www.mangbar.com/document/ 8a80809d1a6cba2e011a75957817223f - 29k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:40iPRW0xg2IJ:
www.mangbar.com/document/8a80809 ... B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=5&gl=cn&ie=UTF-8&st_usg=ALhdy28zhSMZ4Y8IQNVQlTfqe-TXt2dyHA" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:
www.mangbar.com/document/8a80809d1a6cba2e011a75957817223f">类似网页</a></nobr></div> </td></tr></table></div><div class=g> <a href="http://www.hpvbbs.cn/hpv/article.aspx/20832-1.htm" target=_blank class=l>非淋清,协同逸,世保<font color="#cc0033">扶,凯</font>仑丸,能治非淋前列腺吗?-尖锐湿疣战友论坛</a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j"><div class=std>发表于: 2006-1-12 12:28:04, 非淋清,协同逸,世保<font color="#cc0033">扶,凯</font>仑丸,能治非淋前列腺吗? 请问下面的方子治非淋性前列腺炎有效果吗?是我在另一个地方发帖询问的。 <b>...</b><br><span class=a>www.hpvbbs.cn/hpv/article.aspx/20832-1.htm - 21k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:dH7nyPq0FfUJ:
www.hpvbbs.cn/hpv/article.aspx/2 ... B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=6&gl=cn&ie=UTF-8&st_usg=ALhdy29KqNctKG2yRiWXiPGLccSwX-BxuQ" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:
www.hpvbbs.cn/hpv/article.aspx/20832-1.htm">类似网页</a></nobr></div> </td></tr></table></div><div class=g> <a href="http://blog.chinaunix.net/u/12859/guestbook.html" target=_blank class=l>我的留言- <font color="#cc0033">扶凯</font></a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j"><div class=std>中国最大的IT技术博客-ChinaUnix博客:我的留言- <font color="#cc0033">扶凯</font>.<br><span class=a>blog.chinaunix.net/u/12859/guestbook.html - 33k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:igbF-6on1-gJ:blog.chinaunix.net/u/12859/guestbook.html+%E6%89%B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=7&gl=cn&ie=UTF-8&st_usg=ALhdy29OwEnwCl0lOXBGB4IQALxxjD0euQ" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:blog.chinaunix.net/u/12859/guestbook.html">类似网页</a></nobr></div> </td></tr></table></div><div class=g style="margin-left:2.5em"> <a href="http://blog.chinaunix.net/u/12859/showart_366523.html" target=_blank class=l>谁来拯救中国男人- 品味生活- <font color="#cc0033">扶凯</font></a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j hc"><div class=std>中国最大的IT技术博客-ChinaUnix博客:谁来拯救中国男人- 品味生活- <font color="#cc0033">扶凯</font>.<br><span class=a>blog.chinaunix.net/u/12859/showart_366523.html - 25k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:fy4XbIuEu8AJ:blog.chinaunix.net/u/12859/showart_366523.html+%E6%89%B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=8&gl=cn&ie=UTF-8&st_usg=ALhdy29MtfwyjdGu8HwsQWXk7nNNI_ITFg" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:blog.chinaunix.net/u/12859/showart_366523.html">类似网页</a></nobr></div> </td></tr></table></div><div class=g> <a href="http://www.kaili.gov.cn/office/zwnews/200704/office_20070413163747_528.shtml" target=_blank class=l>上海宋庆龄基金会和美国扶生基金会热情帮<font color="#cc0033">扶凯</font>里市贫困学生</a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j"><div class=std>2007年4月6日,上海宋庆龄基金会和美国扶生基金会帮<font 
color="#cc0033">扶凯</font>里贫困学生捐赠仪式在凯里市行政中心A座5楼会议室隆重举行。仪式由市政协副主席邓友军主持。 <b>...</b><br><span class=a>www.kaili.gov.cn/office/zwnews/ 200704/office_20070413163747_528.shtml - 20k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:xMlJSM0gE-IJ:
www.kaili.gov.cn/office/zwnews/2 ... B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=9&gl=cn&ie=UTF-8&st_usg=ALhdy2_96YoCjhwGxOd7zA2X1TVPnuHddw" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:
www.kaili.gov.cn/office/zwnews/2 ... 413163747_528.shtml">类似网页</a></nobr></div> </td></tr></table></div><div class=g> <a href="http://www.gzsedu.cn/jiangkou/NewsView.aspx?id=211" target=_blank class=l>江口教育网-贵州教育网网群江口节点</a><table border=0 cellpadding=0 cellspacing=0><tr><td class="j"><div class=std>江口县环保局帮<font color="#cc0033">扶凯</font>文小学纪实. 8月6日 ,太平乡教办蒋跃发主任得知江口县环保局在今年的部门帮扶学校“两基”、“普实”工作中尚未安排帮扶任务,便向环保局领导请求帮扶 <b>...</b><br><span class=a>www.gzsedu.cn/jiangkou/NewsView.aspx?id=211 - 24k - </span><nobr><a class=fl href="http://203.208.37.104/search?q=cache:fQqw6deW5sYJ:
www.gzsedu.cn/jiangkou/NewsView. ... B6%E5%87%AF&hl=zh-CN&ct=clnk&cd=10&gl=cn&ie=UTF-8&st_usg=ALhdy29q6Xkh-73CBbAHFF7Gsk7BLWa7gw" target=_blank>网页快照</a> - <a class=fl href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=related:
www.gzsedu.cn/jiangkou/NewsView.aspx?id=211">类似网页</a></nobr></div> </td></tr></table></div> <p><i> </i></p></div> <br clear="all"/><div id=navbar class=n><table border=0 cellpadding=0 width="1%" cellspacing=0 align=center><tr align=center style=text-align:center valign=top><td nowrap align=right class=b><img src="nav_first.gif" width=18 height=26 alt="" border=0><br><td nowrap><img src="nav_current.gif" width=16 height=26 alt="" border=0><br><span class=i>1</span><td nowrap><a href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=10&sa=N"><img src="nav_page.gif" width=16 height=26 alt="" border=0><br><span>2</span></a><td nowrap><a href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=20&sa=N"><img src="nav_page.gif" width=16 height=26 alt="" border=0><br><span>3</span></a><td nowrap><a href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=30&sa=N"><img src="nav_page.gif" width=16 height=26 alt="" border=0><br><span>4</span></a><td nowrap><a href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=40&sa=N"><img src="nav_page.gif" width=16 height=26 alt="" border=0><br><span>5</span></a><td nowrap><a href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=50&sa=N"><img src="nav_page.gif" width=16 height=26 alt="" border=0><br><span>6</span></a><td nowrap><a href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=60&sa=N"><img src="nav_page.gif" width=16 height=26 alt="" border=0><br><span>7</span></a><td nowrap><a href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=70&sa=N"><img src="nav_page.gif" width=16 height=26 alt="" border=0><br><span>8</span></a><td nowrap><a href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=80&sa=N"><img src="nav_page.gif" width=16 height=26 alt="" border=0><br><span>9</span></a><td nowrap><a 
href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=90&sa=N"><img src="nav_page.gif" width=16 height=26 alt="" border=0><br><span>10</span></a><td nowrap class=b><a href="/search?complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&q=%E6%89%B6%E5%87%AF&start=10&sa=N"><img src="nav_next.gif" width=100 height=26 alt="" border=0><br><span><b>下一页</b></span></a></table></div></div> <br clear=all><center></center><br><table border=0 cellpadding=0 cellspacing=0 width=100% bgcolor="#d5ddf3"><tr><td bgcolor="#3366cc"><img width=1 height=1 alt=""></td></tr><tr><td align=center> <br><table border=0 cellpadding=0 cellspacing=0 align=center><form method=GET action="/search"><tr><td nowrap>
<font size=-1><input type=text name=q size=41 maxlength=2048 value="扶凯" title="Google 搜索"> <input type=submit name="btnG" value="Google 搜索"><input type=hidden name=complete value=1><input type=hidden name=hl value="zh-CN"><input type=hidden name=newwindow value=1><input type=hidden name=ie value="GB2312"><input type=hidden name=sa value="2"></font></td></tr></form></table><br><font size=-1><nobr><a href="/swr?q=%E6%89%B6%E5%87%AF&complete=1&hl=zh-CN&newwindow=1&ie=UTF-8&swrnum=6840">在结果中搜索</a></nobr> | <nobr><a href="/language_tools?q=%E6%89%B6%E5%87%AF&complete=1&hl=zh-CN&newwindow=1&ie=UTF-8">语言工具</a></nobr> | <nobr><a href="/intl/zh-CN/help.html">搜索帮助</a></nobr> | <nobr><a href="/quality_form?q=%E6%89%B6%E5%87%AF&complete=1&hl=zh-CN&newwindow=1&ie=UTF-8" target=_blank>意见反馈</a></nobr> | <nobr><a href="/experimental/">尝试 Google 实验版</a></nobr></font><br><br></td></tr><tr><td bgcolor="#3366cc"><img width=1 height=1 alt=""></td></tr></table><center><p><hr class=z><div style=padding:2px class=""><font size=-1>©2008 Google - <a href="/">Google 首页</a> - <a href="/intl/zh-CN/ads/">广告计划</a> - <a href="/intl/zh-CN/about.html">Google 大全</a></font></div><br></center><script>window.setTimeout('window.google.ac.install(document.gs,document.gs.q,"",false,"关闭",true,"","扶凯")',100);</script></body></html>
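One possible approach, as a minimal sketch rather than a definitive answer: since every result in the page above carries its display URL inside `<span class=a>…</span>`, `grep -o` can print each span on its own line even when the HTML is one long line, and `sed` can then strip the tags, the trailing size marker such as ` - 20k - `, and the space Google inserts into long URLs. The function names `extract_urls` and `find_rank` are made up for illustration. The sketch assumes a `grep` that supports `-o` (GNU and BSD grep do), that a span never contains nested tags, and that a span is not split across lines. `LC_ALL=C` is set so the GB2312 bytes elsewhere in the page are not treated as invalid text.

```shell
#!/bin/sh

# Read HTML on stdin and print the display URL of each result, one per line.
# Assumption: each URL sits in <span class=a>URL - NNk - </span> with no nested tags.
extract_urls() {
    LC_ALL=C grep -o '<span class=a>[^<]*</span>' |
        sed -e 's/^<span class=a>//' \
            -e 's/ - [0-9][0-9]*k - <\/span>$//' \
            -e 's/<\/span>$//' \
            -e 's/ //g'
}

# Read HTML on stdin and print the 1-based position of the first result
# whose host name is exactly $1. Prints nothing if the host is not on the page.
find_rank() {
    extract_urls |
        cut -d/ -f1 |
        grep -n -x -F "$1" |
        head -n 1 |
        cut -d: -f1
}
```

With the page saved to a file (here a hypothetical `page.html`), `extract_urls < page.html` lists the 10 addresses, and `find_rank www.kaili.gov.cn < page.html` prints the position on that page. An empty result means the domain is not there and the next page should be fetched. The `href` of each `<a … class=l>` link holds the same address with its scheme and without the inserted space, so matching on that attribute would be a reasonable alternative.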