How to write a function in C/C++ that checks whether a URL is valid

#1  Posted 2006-01-06 18:59
Is it possible to write a function that takes a link (http://www.aaa.com/aa/aa/aa) and returns a bool: true if the link is valid, false otherwise?

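#2  Posted 2006-01-06 21:16
What do you mean by "valid"? A URL may return the result you expect, or it may return an error message (for example, a "page does not exist" notice). In that second case, do you consider the URL valid or invalid?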

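#3  Posted 2006-01-08 21:34
This is an ACM practice problem. I have done it before, but my code got "wakaka'd" (哇卡卡了).
Note that your problem statement is not very precise.
If you mean the problem below, a finite state machine is all you need.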

http://acm.zju.edu.cn/show_problem.php?pid=1243

--------------------------------------------------------------------------------
URLs
Time limit: 1 second   Memory limit: 32768K
--------------------------------------------------------------------------------

In the early nineties, the World Wide Web (WWW) was invented. Nowadays, most people think that the WWW simply consists of all the pretty (or not so pretty) HTML-pages that you can read with your WWW browser. But back then, one of the main intentions behind the design of the WWW was to unify several existing communication protocols.

Then (and even now), information on the Internet was available via a multitude of channels: FTP, HTTP, E-Mail, News, Gopher, and many more. Thanks to the WWW, all these services can now be uniformly addressed via URLs (Uniform Resource Locators). The syntax of URLs is defined in the Internet standard RFC 1738. For our problem, we consider a simplified version of the syntax, which is as follows:

<protocol> "://" <host> [ ":" <port> ] [ "/" <path> ]

The square brackets [] mean that the enclosed string is optional and may or may not appear. Examples of URLs are the following:

http://www.informatik.uni-ulm.de/acm
ftp://acm.baylor.edu:1234/pub/staff/mr-p
gopher://veryold.edu

More specifically,

<protocol> is always one of http, ftp or gopher.

<host> is a string consisting of alphabetic (a-z, A-Z) or numeric (0-9) characters and points (.).

<port> is a positive integer, smaller than 65536.

<path> is a string that contains no spaces.

You are to write a program that parses a URL into its components.


Input

The input starts with a line containing a single integer n, the number of URLs in the input. The following n lines contain one URL each, in the format described above. The URLs will consist of at most 60 characters each.


Output

For each URL in the input first print the number of the URL, as shown in the sample output. Then print four lines, stating the protocol, host, port and path specified by the URL. If the port and/or path are not given in the URL, print the string <default> instead. Adhere to the format shown in the sample output.

Print a blank line after each test case.


Sample Input

3
ftp://acm.baylor.edu:1234/pub/staff/mr-p
http://www.informatik.uni-ulm.de/acm
gopher://veryold.edu


Sample Output

URL #1
Protocol = ftp
Host     = acm.baylor.edu
Port     = 1234
Path     = pub/staff/mr-p

URL #2
Protocol = http
Host     = www.informatik.uni-ulm.de
Port     = <default>
Path     = acm

URL #3
Protocol = gopher
Host     = veryold.edu
Port     = <default>
Path     = <default>



--------------------------------------------------------------------------------
Problem Source: Southwestern Europe 1997, Practice
--------------------------------------------------------------------------------

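#4  Posted 2006-01-09 08:51
Sorry, my question was not clear.
What I mean is a function that takes a syntactically valid URL and returns a bool: if entering that URL in IE opens the page normally, return true; if the result is "page does not exist", return false.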
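
One way to approximate "opens normally" is to ask the server for the page and look at the HTTP status code. The sketch below makes several assumptions that are not in the thread: plain http:// only, a POSIX system with BSD sockets (not Windows/Winsock), a HEAD request rather than a full page load, no timeouts, and any 2xx or 3xx status counted as success. The names split_http_url, parse_status_code and url_opens are made up for this example.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>
#include <string>

// Splits "http://host[:port][/path]"; port defaults to "80", path to "/".
// Returns false for anything that is not a plain http URL.
bool split_http_url(const std::string& url, std::string& host,
                    std::string& port, std::string& path) {
    const std::string scheme = "http://";
    if (url.compare(0, scheme.size(), scheme) != 0) return false;
    std::size_t pos = url.find_first_of(":/", scheme.size());
    host = url.substr(scheme.size(), pos - scheme.size());
    port = "80";
    path = "/";
    if (pos != std::string::npos && url[pos] == ':') {
        std::size_t slash = url.find('/', pos);
        port = url.substr(pos + 1, slash - pos - 1);
        pos = slash;
    }
    if (pos != std::string::npos) path = url.substr(pos);
    return !host.empty() && !port.empty();
}

// "HTTP/1.1 404 Not Found" -> 404; returns -1 if the status line is malformed.
int parse_status_code(const std::string& response) {
    std::string line = response.substr(0, response.find('\n'));
    if (line.compare(0, 5, "HTTP/") != 0) return -1;
    std::size_t sp = line.find(' ');
    if (sp == std::string::npos || line.size() < sp + 4) return -1;
    int code = 0;
    for (std::size_t i = sp + 1; i < sp + 4; ++i) {
        if (line[i] < '0' || line[i] > '9') return -1;
        code = code * 10 + (line[i] - '0');
    }
    return code;
}

// Sends a HEAD request and reports whether the server answered 2xx or 3xx.
// Assumption: redirects are not followed, they simply count as success.
bool url_opens(const std::string& url) {
    std::string host, port, path;
    if (!split_http_url(url, host, port, path)) return false;

    addrinfo hints = {};
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    addrinfo* res = nullptr;
    if (getaddrinfo(host.c_str(), port.c_str(), &hints, &res) != 0) return false;

    int fd = -1;  // try each resolved address until one connects
    for (addrinfo* p = res; p != nullptr; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd < 0) return false;

    const std::string req =
        "HEAD " + path + " HTTP/1.0\r\nHost: " + host + "\r\n\r\n";
    std::size_t sent = 0;
    while (sent < req.size()) {
        ssize_t n = send(fd, req.data() + sent, req.size() - sent, 0);
        if (n <= 0) { close(fd); return false; }
        sent += static_cast<std::size_t>(n);
    }

    std::string response;  // only the first line (the status line) is needed
    char buf[512];
    ssize_t n;
    while (response.find('\n') == std::string::npos &&
           (n = recv(fd, buf, sizeof buf, 0)) > 0) {
        response.append(buf, static_cast<std::size_t>(n));
    }
    close(fd);

    int code = parse_status_code(response);
    return code >= 200 && code < 400;
}
```

A call such as url_opens("http://www.aaa.com/aa/aa/aa") would then return false for a 404 and also for a host that cannot be resolved or reached. This still does not cover the case raised in reply #2 where a server answers 200 with a "page does not exist" message in the body, nor servers that reject HEAD; production code would more likely use an HTTP client library with timeouts and HTTPS support.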

#5  Posted 2006-01-09 18:34
Has anyone written something like this?