Regular expression - any text to a URL-friendly string

Posted on 2024-09-29 22:47:43

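I need a PHP regular expression script that removes anything that is not a letter or a digit 0 to 9, replaces spaces with hyphens, converts everything to lowercase, and makes sure there is only a single hyphen between words (no -- or --- and so on).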

For example:

Example: The quick brown fox jumped
Result: the-quick-brown-fox-jumped

Example: The Quick Brown Fox Jumped!
Result: the-quick-brown-fox-jumped

Example: The Quick Brown Fox - Jumped!
Result: the-quick-brown-fox-jumped

Example: The Quick ~`!@#$%^ &*()_+= ------- Brown {}|][ :"'; < >?.,/ Fox - Jumped!
Result: the-quick-brown-fox-jumped

Example: The Quick 1234567890 ~`!@#$%^ &*()_+= ------- Brown {}|][ :"'; < >?.,/ Fox - Jumped!
Result: the-quick-1234567890-brown-fox-jumped

Does anyone know a regular expression for this?

Thanks!

夕嗳→ 2024-10-06 22:47:43

Since you seem to want every sequence of non-alphanumeric characters replaced by a single hyphen, you can use this:

$str = preg_replace('/[^a-zA-Z0-9]+/', '-', $str);

But this can result in leading or trailing hyphens that can be removed with trim:

$str = trim($str, '-');

And to convert the result into lowercase, use strtolower:

$str = strtolower($str);

So all together (replace first, then trim, so that hyphens created by the replacement are stripped from the ends):

$str = strtolower($str);
$str = preg_replace('/[^a-z0-9]+/', '-', $str);
$str = trim($str, '-');

Or in a compact one-liner:

$str = strtolower(trim(preg_replace('/[^a-zA-Z0-9]+/', '-', $str), '-'));
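
As a quick check, the one-liner can be wrapped in a function and run against inputs like the ones in the question. This is only a sketch: the name `slugify` is a hypothetical wrapper chosen for illustration, and the sample strings are adapted from the question's examples.

```php
<?php
// slugify() is a hypothetical wrapper name; the body is the one-liner above.
function slugify(string $str): string
{
    // 1. Collapse every run of non-alphanumeric characters into one hyphen.
    // 2. Strip hyphens left at either end.
    // 3. Lowercase the result.
    return strtolower(trim(preg_replace('/[^a-zA-Z0-9]+/', '-', $str), '-'));
}

$examples = [
    'The quick brown fox jumped',
    'The Quick Brown Fox - Jumped!',
    'The Quick 1234567890 ~`!@#$%^ &*()_+= ------- Brown {}|][ :"\'; <>?.,/ Fox - Jumped!',
];

foreach ($examples as $text) {
    echo slugify($text), PHP_EOL;
}
// the-quick-brown-fox-jumped
// the-quick-brown-fox-jumped
// the-quick-1234567890-brown-fox-jumped
```

Because the character class only allows ASCII letters and digits, accented letters are treated like punctuation here and become hyphens.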

疯了 2024-10-06 22:47:43

I was just working on something similar and came up with this little piece of code, which also handles accented Latin characters.

This is the sample string:

$str = 'El veloz murciélago hindú comía fe<!>&@#$%&!"#%&?¡?*liz cardillo y kiwi. La cigüeña ¨^;.-|°¬tocaba el saxofón detrás del palenque de paja';

First I convert the string to HTML entities, which makes the accented characters easy to match in the next step.

$friendlyURL = htmlentities($str, ENT_COMPAT, "UTF-8", false);

Then I replace latin characters with their corresponding ascii characters (á becomes a, Ü becomes U, and so on):

$friendlyURL = preg_replace('/&([a-z]{1,2})(?:acute|circ|lig|grave|ring|tilde|uml|cedil|caron);/i','\1',$friendlyURL);

Then I convert the string back from html entities to symbols, again for easier use later.

$friendlyURL = html_entity_decode($friendlyURL,ENT_COMPAT, "UTF-8");

Next I replace every run of non-alphanumeric characters (other than existing hyphens) with a hyphen.

$friendlyURL = preg_replace('/[^a-z0-9-]+/i', '-', $friendlyURL);

I remove extra hyphens inside the string:

$friendlyURL = preg_replace('/-+/', '-', $friendlyURL);

I remove leading and trailing hyphens:

$friendlyURL = trim($friendlyURL, '-');

And finally convert all into lowercase:

$friendlyURL = strtolower($friendlyURL);

All together:

function friendlyUrl ($str = '') {

    $friendlyURL = htmlentities($str, ENT_COMPAT, "UTF-8", false); 
    $friendlyURL = preg_replace('/&([a-z]{1,2})(?:acute|circ|lig|grave|ring|tilde|uml|cedil|caron);/i','\1',$friendlyURL);
    $friendlyURL = html_entity_decode($friendlyURL,ENT_COMPAT, "UTF-8"); 
    $friendlyURL = preg_replace('/[^a-z0-9-]+/i', '-', $friendlyURL);
    $friendlyURL = preg_replace('/-+/', '-', $friendlyURL);
    $friendlyURL = trim($friendlyURL, '-');
    $friendlyURL = strtolower($friendlyURL);
    return $friendlyURL;

}

Test:

$str = 'El veloz murciélago hindú comía fe<!>&@#$%&!"#%&-?¡?*-liz cardillo y kiwi. La cigüeña ¨^`;.-|°¬tocaba el saxofón detrás del palenque de paja';

echo friendlyUrl($str);

Outcome:

el-veloz-murcielago-hindu-comia-fe-liz-cardillo-y-kiwi-la-ciguena-tocaba-el-saxofon-detras-del-palenque-de-paja

I guess Gumbo's answer fits your problem better, and its code is shorter, but I thought this would be useful for others.

Cheers,
Adriana
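
To see what the entity round-trip does to individual characters, the three accent-folding steps from this answer can be isolated as below. This is only a sketch: `foldAccents` is a hypothetical helper name and the sample words are picked for illustration. Letters whose entity name does not end in one of the listed suffixes (for example Ø, which is encoded as `&Oslash;`) come back unchanged, so the later steps turn them into hyphens.

```php
<?php
// foldAccents() is a hypothetical helper name used only for this illustration.
function foldAccents(string $str): string
{
    // "é" becomes "&eacute;", "ü" becomes "&uuml;", and so on.
    $encoded = htmlentities($str, ENT_COMPAT, 'UTF-8', false);

    // Keep only the base letter(s) of entities that end in a known suffix.
    $folded = preg_replace(
        '/&([a-z]{1,2})(?:acute|circ|lig|grave|ring|tilde|uml|cedil|caron);/i',
        '\1',
        $encoded
    );

    // Turn any remaining entities (such as &amp; or &Oslash;) back into characters.
    return html_entity_decode($folded, ENT_COMPAT, 'UTF-8');
}

echo foldAccents('El murciélago y la cigüeña'), PHP_EOL; // El murcielago y la ciguena
```

The source file is assumed to be saved as UTF-8, since the string literals contain accented characters.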

○闲身 2024-10-06 22:47:43

In a function:

function sanitize_text_for_urls ($str) 
{
    return trim( strtolower( preg_replace(
        array('/[^a-z0-9-\s]/ui', '/\s/', '/-+/'),
        array('', '-', '-'),
        iconv('UTF-8', 'ASCII//TRANSLIT', $str) )), '-');
}

What it does:

// Solve accents and diacritics
$str = iconv('UTF-8', 'ASCII//TRANSLIT', $str);

// Leave only alphanumeric (respect existing hyphens)
$str = preg_replace('/[^a-z0-9-\s]/ui', '', $str);

// Turn spaces to hyphens
$str = preg_replace('/\s+/', '-', $str);

// Remove duplicate hyphens
$str = preg_replace('/-+/', '-', $str);

// Remove leading and trailing hyphens
$str = trim($str, '-');

// Turn to lowercase
$str = strtolower($str);

Note:
You can combine multiple preg_replace calls into one by passing an array of patterns and an array of replacements; the patterns are applied in order. See the function at the top.

For example:

// Électricité, plâtrerie    -->  electricite-platrerie
// St. Lücie-Pétêrès         -->  st-lucie-peteres
// -Façade- & gros œuvre     -->  facade-gros-oeuvre

// _-Thè quîck ~`!@#&$%^ &*()_+= ---{}|][ :"; <>?.,/ fóx - jümpëd_-
// the-quick-fox-jumped

EDIT: added "/u" at the end of the regex to use UTF8
EDIT: accounted for duplicated and leading/trailing hyphens, thanks to @LuBre
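
One caveat is worth illustrating: `iconv()` returns `false` when the conversion fails, and what `//TRANSLIT` produces for accented letters can depend on the platform's iconv implementation and its locale settings. The sketch below is a defensive variant of the same steps. The name `safeSlug` and the choice to fall back to the raw input are assumptions made for illustration, not part of the answer above.

```php
<?php
// safeSlug() is a hypothetical name; the regex steps mirror the answer above.
function safeSlug(string $str): string
{
    // iconv() returns false on failure; @ hides the notice so we can fall back.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $str);
    if ($ascii === false) {
        // Assumption: keep the raw input and let the regex drop non-ASCII bytes.
        $ascii = $str;
    }

    // Patterns run in order: strip symbols, spaces to hyphens, collapse hyphens.
    $slug = preg_replace(
        array('/[^a-z0-9\s-]/i', '/\s+/', '/-+/'),
        array('', '-', '-'),
        $ascii
    );

    return strtolower(trim($slug, '-'));
}

echo safeSlug('  Hello,  World -- Again! '), PHP_EOL; // hello-world-again
```

The example input is plain ASCII on purpose, so the printed result does not depend on how the local iconv transliterates accents.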

别在捏我脸啦 2024-10-06 22:47:43

If you're using this for filenames in PHP, the answer by Gumbo would be

$str = preg_replace('/[^a-zA-Z0-9.]+/', '-', $str);
$str = trim($str, '-');
$str = strtolower($str);

This adds a period to the allowed characters so that file extensions survive. Also note that the function is strtolower(), not strtolowercase().
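
With the period simply added to the character class, a name such as "My Report (final).PDF" ends up as "my-report-final-.pdf", with a stray hyphen before the dot. One possible way around that is to split off the extension first, as sketched below. The helper name `slugFilename` is hypothetical, and treating only the last dot as the extension separator (which is what `pathinfo()` does) is an assumption.

```php
<?php
// slugFilename() is a hypothetical helper name used only for this illustration.
function slugFilename(string $filename): string
{
    // pathinfo() splits "name.ext" at the last dot.
    $parts = pathinfo($filename);

    // Slug the base name with the same steps as in the answer above.
    $name = strtolower(trim(preg_replace('/[^a-zA-Z0-9]+/', '-', $parts['filename']), '-'));

    // Keep only letters and digits in the extension, if there is one.
    $ext = isset($parts['extension'])
        ? strtolower(preg_replace('/[^a-zA-Z0-9]+/', '', $parts['extension']))
        : '';

    return $ext === '' ? $name : $name . '.' . $ext;
}

echo slugFilename('My Report (final).PDF'), PHP_EOL; // my-report-final.pdf
```

Dots inside the base name are replaced like any other symbol, so "archive.tar.gz" becomes "archive-tar.gz".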

长梦不多时 2024-10-06 22:47:43

$str = preg_replace('/[^a-zA-Z0-9]/', '-', $str);

About the author: 方觉久
