How to match URLs using wildcards in Java

Posted 2025-01-03 09:10:00 · 864 words · 0 views · 0 comments

I'm trying to match a given URL against a set of filter patterns that decide whether the URL is accepted or discarded. Here is a sample pattern list:


http://test.blogs.com/between_the/
http://test.blogs.com/between_the/page*
http://test.blogs.com/between_the/archives*
*index.html*
*/page/*
http://abc.blogs.com/
http://area.test.com/index.php/blogs_a/blog_list/
http://area.test.com/index.php/blogs_b/blog_list/*/

Based on these conditions, the following URLs will be accepted:


http://test.blogs.com/between_the/2012/02/autocad-ws-update-coming.html
http://abc.blogs.com/test
http://area.test.com/index.php/blogs_b/blog_list/page/2

while the ones below will be filtered out:


http://test.blogs.com/between_the/page/2
http://test.blogs.com/index.html
http://area.test.com/index.php/blogs_b/blog_list/1/

Just wondering what the best approach for this is? I'm not sure this can be handled with a single complex generic regex, as the exclusion patterns are not predictable. I was thinking of removing the wildcards and creating two separate Lists, one for exact matches and one for contains matches, and then checking the input URL against both lists.

Any pointers will be appreciated.

Thanks

Comments (1)

-黛色若梦 2025-01-10 09:10:00

You can simply create a List of regular expressions and accept a URL when it doesn't match any of them. A URL is discarded as soon as it matches one of the regexes. This should be much easier to write and maintain than a single complex regular expression.
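
A minimal sketch of that idea, under a few assumptions that are not stated in the thread: `*` is taken to mean "any run of characters", everything else in a pattern is treated as literal text, and a pattern must match the whole URL. The class name `UrlFilter` and its methods are hypothetical, and the demo uses only a subset of the wildcard entries from the question, since the thread does not say how the remaining entries should be interpreted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: holds a list of exclusion regexes built from wildcard patterns.
class UrlFilter {
    private final List<Pattern> exclusions = new ArrayList<>();

    UrlFilter(List<String> wildcardPatterns) {
        for (String wildcard : wildcardPatterns) {
            exclusions.add(toRegex(wildcard));
        }
    }

    // Assumption: '*' becomes ".*"; every other character is quoted so that
    // '.', '?', '/' and similar are matched literally.
    static Pattern toRegex(String wildcard) {
        String[] parts = wildcard.split("\\*", -1); // -1 keeps trailing empty parts
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            if (!parts[i].isEmpty()) {
                regex.append(Pattern.quote(parts[i]));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // A URL is discarded as soon as one exclusion matches the whole URL.
    boolean accepts(String url) {
        for (Pattern exclusion : exclusions) {
            if (exclusion.matcher(url).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        UrlFilter filter = new UrlFilter(Arrays.asList(
                "http://test.blogs.com/between_the/page*",
                "http://test.blogs.com/between_the/archives*",
                "*index.html*",
                "http://area.test.com/index.php/blogs_b/blog_list/*/"));

        System.out.println(filter.accepts("http://abc.blogs.com/test"));                // true
        System.out.println(filter.accepts("http://test.blogs.com/between_the/page/2")); // false
        System.out.println(filter.accepts("http://test.blogs.com/index.html"));         // false
    }
}
```

Keeping one small regex per rule means a new exclusion is just another entry in the list. If some entries are meant as prefixes rather than full matches, they would need a trailing `*` under these assumptions.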
