Python regex: string must not contain "jpg" and must contain "-" and be lowercase

Posted 2024-10-01 00:57:21 · 870 characters · 4 views · 0 comments

I'm having trouble figuring out a Python regex for Django URLs. I have certain criteria, but can't seem to come up with the magic formula. In the end, it's so I can identify which page is a CMS page and pass the alias URL it should load to the Django function.

Here are some examples of valid strings which would match:

  • about-us
  • contact-us
  • terms-and-conditions
  • info/learn-more-pg2
  • info/my-example-url

Criteria:

  • Must be all lowercase
  • Must contain a dash "-"
  • Can contain numbers, letters and a slash "/"
  • Must be at least 4 characters long and a max of 30 characters
  • Cannot contain special characters
  • Cannot contain the words:
    • .jpg
    • .gif
    • .png
    • .css
    • .js

Examples which should not match:

  • About-Us (has upper case)
  • contactus (doesn't have a dash)
  • pg (less than 4 characters)
  • img/bg.gif (contains ".gif")
  • files/my-styles.css (contains ".css")
  • my-page@ (has a character other than letters, numbers, dash or slash)

I know this isn't even close yet, but this is as far as I've gotten:

(?P<alias>([a-z/-]{4,30}))

I apologize for the long list of requirements, but I just can't wrap my head around this regex stuff.

Thanks!
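
To make concrete why the attempt in the question falls short, here is a small sketch that runs it against a few of the question's own examples. It assumes the pattern is applied with `re.search` (how the URL resolver applies it is not shown in the question), and `found_alias` is a hypothetical helper name used only for illustration. The pattern has no anchors, no digits in the character class, and nothing that requires a dash, so it accepts fragments of strings that should be rejected.

```python
import re

# The pattern exactly as posted in the question.
ATTEMPT = re.compile(r"(?P<alias>([a-z/-]{4,30}))")


def found_alias(text):
    """Return whatever the unanchored attempt captures, or None (hypothetical helper)."""
    m = ATTEMPT.search(text)
    return m.group("alias") if m else None


if __name__ == "__main__":
    for sample in ["contactus", "info/learn-more-pg2", "About-Us",
                   "files/my-styles.css", "pg"]:
        print(sample, "->", found_alias(sample))
    # contactus            -> contactus           (no dash required)
    # info/learn-more-pg2  -> info/learn-more-pg  (digits not allowed, so "2" is dropped)
    # About-Us             -> bout-               (no anchors, so a fragment matches)
    # files/my-styles.css  -> files/my-styles     (stops at ".", still counts as a match)
    # pg                   -> None                (too short)
```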


Comments (2)

甜嗑 2024-10-08 00:57:22

This is my first post on SO.
Please correct my English wherever it is needed.

I think any of the following REs fits:

'(?=.{4,30}\Z)(?=.*-)[-a-z0-9/]+\Z'

'(?=.{4,30}\Z)[a-z0-9/]*-[-a-z0-9/]*\Z'

'(?=.{4,30}\Z)(?:[a-z0-9/]+|)-[-a-z0-9/]*\Z'
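
As a quick sanity check, the three patterns can be run against the examples from the question. This is only a sketch: the patterns are written here with plain `*` quantifiers, the sample lists are copied from the question, and `check_all` is a hypothetical helper name. It assumes the patterns are applied with `re.match`, so matching starts at the beginning of the string.

```python
import re

# The three patterns from this answer, with plain `*` quantifiers.
PATTERNS = [
    r"(?=.{4,30}\Z)(?=.*-)[-a-z0-9/]+\Z",
    r"(?=.{4,30}\Z)[a-z0-9/]*-[-a-z0-9/]*\Z",
    r"(?=.{4,30}\Z)(?:[a-z0-9/]+|)-[-a-z0-9/]*\Z",
]

# Examples taken from the question.
SHOULD_MATCH = [
    "about-us", "contact-us", "terms-and-conditions",
    "info/learn-more-pg2", "info/my-example-url",
]
SHOULD_NOT_MATCH = [
    "About-Us", "contactus", "pg",
    "img/bg.gif", "files/my-styles.css", "my-page@",
]


def check_all(patterns=PATTERNS):
    """Return (pattern, string) pairs whose result differs from the question's expectation."""
    failures = []
    for pattern in patterns:
        compiled = re.compile(pattern)
        for s in SHOULD_MATCH:
            if compiled.match(s) is None:
                failures.append((pattern, s))
        for s in SHOULD_NOT_MATCH:
            if compiled.match(s) is not None:
                failures.append((pattern, s))
    return failures


if __name__ == "__main__":
    print(check_all() or "all three patterns agree with the question's examples")
```

None of the patterns mentions the file extensions explicitly; strings such as img/bg.gif are rejected simply because "." is not in the allowed character class.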

羞稚 2024-10-08 00:57:21

I'm puzzled as to why several of the commenters find this hard to do in a regex. This is exactly what regular expressions are good at.

import re

subject = "info/learn-more-pg2"  # example input; replace with the string to test

if re.match(
    r"""^             # match start of the string
    (?=.*-)           # assert that there is a dash
    (?!.*\.(?:jpg|gif|png|css|js))  # assert that none of these extensions appear
    [a-z0-9/-]{4,30}  # match 4-30 of the allowed characters
    $                 # match the end of the string""",
    subject, re.VERBOSE):
    print("match")     # the whole string satisfies every criterion
else:
    print("no match")  # at least one criterion failed

It is true, however, that since "." isn't among the allowed characters, the check for the forbidden file extensions is not really necessary.
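
The redundancy is easy to confirm with a short sketch that compares the pattern with and without the extension lookahead on the question's examples. The named group `alias` is borrowed from the question's attempt; `extract_alias` and the constant names are hypothetical, and how the pattern would be wired into a Django URLconf is deliberately left out.

```python
import re

# Same pattern as above, with the match captured in the question's `alias` group.
WITH_EXT_CHECK = re.compile(
    r"^(?=.*-)(?!.*\.(?:jpg|gif|png|css|js))(?P<alias>[a-z0-9/-]{4,30})$"
)
# Identical, minus the lookahead for forbidden extensions.
WITHOUT_EXT_CHECK = re.compile(r"^(?=.*-)(?P<alias>[a-z0-9/-]{4,30})$")

SAMPLES = [
    "about-us", "contact-us", "terms-and-conditions",
    "info/learn-more-pg2", "info/my-example-url",
    "About-Us", "contactus", "pg",
    "img/bg.gif", "files/my-styles.css", "my-page@",
]


def extract_alias(path, regex=WITHOUT_EXT_CHECK):
    """Return the captured alias, or None if the path is not a valid alias (hypothetical helper)."""
    m = regex.match(path)
    return m.group("alias") if m else None


if __name__ == "__main__":
    for s in SAMPLES:
        # Both variants give the same answer for every sample.
        assert extract_alias(s, WITH_EXT_CHECK) == extract_alias(s, WITHOUT_EXT_CHECK)
        print(s, "->", extract_alias(s))
```

One detail to keep in mind: in Python, `$` also matches just before a trailing newline, so "about-us\n" would be accepted by these patterns; `\Z`, as used in the other answer, matches only at the very end of the string.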
