Are there any URL parsers in Java?
I know Java has a URL class, but I need methods to get the file extension of the page (html, php, asp, etc.), the country of the domain (ca, au, br, jp, fr, etc.), the type of the site (.net, .org, .gov, etc.), and so on.
I wrote some of these methods myself using String handling, but I think a class built specifically for this would be more reliable.
Comments (3)
I created a simple Java class that makes URL parsing in Java much easier.
https://github.com/juliuss/urlplus
It can be used to build URLs and modify them programmatically. It also handles relative URLs.
You can see from the unit tests that it is very comprehensive:
I am not sure there is a specific class that does what you are asking. Take a look at the URL class first, and at the post below.
Could you share a link to an URL parsing implementation?
I think you will need to combine the data returned by the URL class with your own parsing to get the small pieces of data it does not provide. That should be fairly simple, though, since it sounds like what you want is everything after the last dot in the host and in the path (if those actually exist, which is not guaranteed).
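As a rough illustration of that "last dot" idea, here is a minimal sketch built on java.net.URI. The class name UrlParts and its two methods are hypothetical names made up for this example, not part of the JDK or of any library mentioned in this thread. Both methods return an empty Optional when the host or path has nothing usable after a dot, and an all-numeric last label (as in an IPv4 address) is treated as "no suffix", which is an assumption of this sketch.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper: combines java.net.URI with simple "last dot" parsing.
final class UrlParts {
    private UrlParts() {}

    /** Text after the last dot of the host (e.g. "ca", "org"), if any. */
    static Optional<String> lastHostLabel(URI uri) {
        String host = uri.getHost();          // null for opaque URIs such as mailto:
        if (host == null) {
            return Optional.empty();
        }
        int dot = host.lastIndexOf('.');
        if (dot < 0 || dot == host.length() - 1) {
            return Optional.empty();          // no dot, or trailing dot
        }
        String label = host.substring(dot + 1).toLowerCase(Locale.ROOT);
        // Assumption of this sketch: an all-digit label means an IPv4 address.
        if (label.chars().allMatch(Character::isDigit)) {
            return Optional.empty();
        }
        return Optional.of(label);
    }

    /** Text after the last dot of the final path segment (e.g. "html"), if any. */
    static Optional<String> fileExtension(URI uri) {
        String path = uri.getPath();          // query string and fragment are excluded
        if (path == null) {
            return Optional.empty();
        }
        String lastSegment = path.substring(path.lastIndexOf('/') + 1);
        int dot = lastSegment.lastIndexOf('.');
        if (dot <= 0 || dot == lastSegment.length() - 1) {
            return Optional.empty();          // no extension, dot-file, or trailing dot
        }
        return Optional.of(lastSegment.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}
```

For example, UrlParts.fileExtension(URI.create("http://www.example.ca/docs/index.html?x=1.5")) yields "html" and UrlParts.lastHostLabel on the same URI yields "ca", while a URL such as http://example.org/ has no extension at all.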
No, there's no such class. Some of these things (the country code) are ill-posed and ambiguous, and often can't be determined from the URL alone; they are less a matter of parsing than of lookup or inference. Others (the file extension) are not defined for most pages.
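To make the "lookup rather than parsing" point concrete, here is a small sketch that checks a host's last label against the ISO 3166 two-letter codes the JDK exposes through Locale.getISOCountries(). The class name CountryCodeLookup and its method are hypothetical, and the approach is only an approximation: the ISO list is not the registry of country-code domains (the United Kingdom's ISO code is GB while its domain is .uk), and a match says nothing about where a site actually is (.tv belongs to Tuvalu but is widely used for video sites).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: a table lookup, not a parser.
final class CountryCodeLookup {
    // ISO 3166 two-letter country codes known to the JDK, in upper case.
    private static final Set<String> ISO_CODES =
            new HashSet<>(Arrays.asList(Locale.getISOCountries()));

    private CountryCodeLookup() {}

    /** True if the label is two letters long and matches an ISO 3166 country code. */
    static boolean looksLikeCountryCode(String label) {
        return label != null
                && label.length() == 2
                && ISO_CODES.contains(label.toUpperCase(Locale.ROOT));
    }
}
```

With this, "ca" and "jp" are reported as country codes and "com" is not, but "tv" is also reported as one, which is exactly the kind of ambiguity described above.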