How do I create an SEO-friendly, dash-delimited URL from a string?

Posted on 2024-07-12 02:06:13

Take a string such as:

In C#: How do I add "Quotes" around string in a comma delimited list of strings?

and convert it to:

in-c-how-do-i-add-quotes-around-string-in-a-comma-delimited-list-of-strings

Requirements:

  • Separate each word by a dash and remove all punctuation (taking into account not all words are separated by spaces.)
  • Function takes in a max length, and gets all tokens below that max length. Example: ToSeoFriendly("hello world hello world", 14) returns "hello-world"
  • All words are converted to lower case.
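
A minimal Python sketch of these three requirements, assuming ASCII-only input. The name `to_seo_friendly` and the choice to truncate a single over-long first word (rather than return an empty slug) are illustrative assumptions, not part of the question.

```python
import re


def to_seo_friendly(title: str, max_length: int) -> str:
    """Join lowercase alphanumeric tokens with dashes, keeping only whole
    tokens that fit within max_length characters."""
    # Any run of characters other than a-z and 0-9 separates two tokens,
    # so "C#:How" yields "c" and "how" even without a space.
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    slug = ""
    for token in tokens:
        candidate = f"{slug}-{token}" if slug else token
        if len(candidate) > max_length:
            break
        slug = candidate
    # Assumption: if even the first token is too long, cut it rather than
    # return an empty slug.
    if not slug and tokens:
        slug = tokens[0][:max_length]
    return slug


print(to_seo_friendly("hello world hello world", 14))  # hello-world
```

Building the slug token by token means the length limit never cuts a word in half, which is the behaviour the example in the second requirement asks for.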

On a separate note, should there be a minimum length?


Comments (12)

海风掠过北极光 2024-07-19 02:06:13

Here is my solution in C#

private string ToSeoFriendly(string title, int maxLength) {
    var match = Regex.Match(title.ToLower(), "[\\w]+");
    StringBuilder result = new StringBuilder("");
    bool maxLengthHit = false;
    while (match.Success && !maxLengthHit) {
        if (result.Length + match.Value.Length <= maxLength) {
            result.Append(match.Value + "-");
        } else {
            maxLengthHit = true;
            // Handle a situation where there is only one word and it is greater than the max length.
            if (result.Length == 0) result.Append(match.Value.Substring(0, maxLength));
        }
        match = match.NextMatch();
    }
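    // Remove trailing '-'. The length check avoids indexing an empty builder
    // when the title contains no word characters.
    if (result.Length > 0 && result[result.Length - 1] == '-') result.Remove(result.Length - 1, 1);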
    return result.ToString();
}
美人如玉 2024-07-19 02:06:13

I would follow these steps:

  1. convert string to lower case
  2. replace unwanted characters by hyphens
  3. replace multiple hyphens by one hyphen (not necessary as the preg_replace() function call already prevents multiple hyphens)
  4. remove hyphens at the beginning and end if necessary
  5. if a maximum length is set, cut at position x and then trim back to the last hyphen before it, so that no word is cut in half

So, all together in a function (PHP):

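function generateUrlSlug($string, $maxlen=0)
{
    // Lowercase, turn each run of unwanted characters into one hyphen, trim edge hyphens.
    $string = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($string)), '-');
    if ($maxlen && strlen($string) > $maxlen) {
        // Only back up to the previous hyphen when the cut lands inside a word;
        // otherwise a complete final word would be dropped for no reason.
        $cutsWord = $string[$maxlen] !== '-';
        $string = substr($string, 0, $maxlen);
        $pos = strrpos($string, '-');
        if ($cutsWord && $pos > 0) {
            $string = substr($string, 0, $pos);
        }
    }
    return $string;
}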
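
For comparison, the same five steps as a hedged Python sketch. The function name `generate_url_slug` simply mirrors the PHP above and is not a library API. It truncates first and backs up to the last hyphen only when the cut landed inside a word.

```python
import re


def generate_url_slug(text: str, max_len: int = 0) -> str:
    # Steps 1-4: lowercase, turn each run of unwanted characters into a
    # single hyphen, then strip hyphens from both ends.
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    # Step 5: shorten only when a limit is set and exceeded.
    if max_len and len(slug) > max_len:
        cut_inside_word = slug[max_len] != "-"
        slug = slug[:max_len]
        if cut_inside_word and "-" in slug:
            # Drop the partial word after the last hyphen.
            slug = slug[: slug.rindex("-")]
    return slug


print(generate_url_slug("Hello, World! Hello again", 14))  # hello-world
```

A single word longer than the limit has no hyphen to fall back to, so this sketch keeps its first `max_len` characters.
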
彻夜缠绵 2024-07-19 02:06:13

C#

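public string toFriendly(string subject)
{
    subject = subject.Trim().ToLowerInvariant();
    // Replace each run of characters other than a-z, 0-9 and '_' with one dash,
    // so words joined only by punctuation ("a,b") are still separated and
    // "a - b" does not become "a---b".
    subject = Regex.Replace(subject, @"[^a-z0-9_]+", "-");
    return subject.Trim('-');
}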
策马西风 2024-07-19 02:06:13

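Here's a solution for PHP:

function make_uri($input, $max_length) {
  // Transliterate accented characters to ASCII when iconv is available.
  if (function_exists('iconv')) {
    $input = @iconv('UTF-8', 'ASCII//TRANSLIT', $input);
  }

  $lower = strtolower($input);

  // preg_replace() is the actual function name, and it must run on $lower,
  // otherwise every uppercase letter is stripped.
  $without_special = preg_replace('/[^a-z0-9 ]/', '', $lower);
  $tokens = preg_split('/ +/', $without_special, -1, PREG_SPLIT_NO_EMPTY);

  $result = '';

  foreach ($tokens as $token) {
    // $result carries a leading '-', hence the +1.
    if (strlen($result.'-'.$token) > $max_length+1) {
      break;
    }

    $result .= '-'.$token;
  }

  // Drop the leading '-'.
  return substr($result, 1);
}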

Usage:

echo make_uri('In C#: How do I add "Quotes" around string in a ...', 500);

Unless you need the URIs to be typable, they don't need to be short. You should still specify a maximum so that the URLs work well with proxies and similar intermediaries.

心凉怎暖 2024-07-19 02:06:13

A better version:

function Slugify($string)
{
    return strtolower(trim(preg_replace(array('~[^0-9a-z]~i', '~-+~'), '-', $string), '-'));
}
め可乐爱微笑 2024-07-19 02:06:13

Solution in Perl:

my $input = 'In C#: How do I add "Quotes" around string in a comma delimited list of strings?';

my $length = 20;
$input =~ s/[^a-z0-9]+/-/gi;
$input =~ s/^(.{1,$length}).*/\L$1/;

print "$input\n";

Done.

梦罢 2024-07-19 02:06:13

Solution in shell:

echo 'In C#: How do I add "Quotes" around string in a comma delimited list of strings?' | \
    tr A-Z a-z | \
    sed 's/[^a-z0-9]\+/-/g;s/^\(.\{1,20\}\).*/\1/'
审判长 2024-07-19 02:06:13

This is close to how Stack Overflow generates slugs:

public static string GenerateSlug(string title)
{
    string slug = title.ToLower();
    if (slug.Length > 81)
      slug = slug.Substring(0, 81);
    slug = Regex.Replace(slug, @"[^a-z0-9\-_\./\\ ]+", "");
    slug = Regex.Replace(slug, @"[^a-z0-9]+", "-");

    // TrimEnd is safe on an empty string, unlike indexing slug[slug.Length - 1].
    return slug.TrimEnd('-');
}
雨落□心尘 2024-07-19 02:06:13

To do this we need to:

  1. Normalize the text
  2. Remove all diacritics
  3. Replace international characters
  4. Be able to shorten text to match SEO thresholds

I wanted a function that generates the entire slug and also accepts an optional maximum length; this was the result.

public static class StringHelper
{
/// <summary>
/// Creates a URL And SEO friendly slug
/// </summary>
/// <param name="text">Text to slugify</param>
/// <param name="maxLength">Max length of slug</param>
/// <returns>URL and SEO friendly string</returns>
public static string UrlFriendly(string text, int maxLength = 0)
{
    // Return empty value if text is null
    if (text == null) return "";

    var normalizedString = text
        // Make lowercase
        .ToLowerInvariant()
        // Normalize the text
        .Normalize(NormalizationForm.FormD);

    var stringBuilder = new StringBuilder();
    var stringLength = normalizedString.Length;
    var prevdash = false;
    var trueLength = 0;

    char c;

    for (int i = 0; i < stringLength; i++)
    {
        c = normalizedString[i];

        switch (CharUnicodeInfo.GetUnicodeCategory(c))
        {
            // Check if the character is a letter or a digit if the character is a
            // international character remap it to an ascii valid character
            case UnicodeCategory.LowercaseLetter:
            case UnicodeCategory.UppercaseLetter:
            case UnicodeCategory.DecimalDigitNumber:
                if (c < 128)
                    stringBuilder.Append(c);
                else
                    stringBuilder.Append(ConstHelper.RemapInternationalCharToAscii(c));

                prevdash = false;
                trueLength = stringBuilder.Length;
                break;

            // Check if the character is to be replaced by a hyphen but only if the last character wasn't
            case UnicodeCategory.SpaceSeparator:
            case UnicodeCategory.ConnectorPunctuation:
            case UnicodeCategory.DashPunctuation:
            case UnicodeCategory.OtherPunctuation:
            case UnicodeCategory.MathSymbol:
                if (!prevdash)
                {
                    stringBuilder.Append('-');
                    prevdash = true;
                    trueLength = stringBuilder.Length;
                }
                break;
        }

        // If we are at max length, stop parsing
        if (maxLength > 0 && trueLength >= maxLength)
            break;
    }

    // Trim excess hyphens
    var result = stringBuilder.ToString().Trim('-');

    // Remove any excess character to meet maxlength criteria
    return maxLength <= 0 || result.Length <= maxLength ? result : result.Substring(0, maxLength);
}
}

This helper remaps some international characters to readable ASCII equivalents.

public static class ConstHelper
{
/// <summary>
/// Remaps international characters to ascii compatible ones
/// based on: https://meta.stackexchange.com/questions/7435/non-us-ascii-characters-dropped-from-full-profile-url/7696#7696
/// </summary>
/// <param name="c">Character to remap</param>
/// <returns>Remapped character</returns>
public static string RemapInternationalCharToAscii(char c)
{
    string s = c.ToString().ToLowerInvariant();
    if ("àåáâäãåą".Contains(s))
    {
        return "a";
    }
    else if ("èéêëę".Contains(s))
    {
        return "e";
    }
    else if ("ìíîïı".Contains(s))
    {
        return "i";
    }
    else if ("òóôõöøőð".Contains(s))
    {
        return "o";
    }
    else if ("ùúûüŭů".Contains(s))
    {
        return "u";
    }
    else if ("çćčĉ".Contains(s))
    {
        return "c";
    }
    else if ("żźž".Contains(s))
    {
        return "z";
    }
    else if ("śşšŝ".Contains(s))
    {
        return "s";
    }
    else if ("ñń".Contains(s))
    {
        return "n";
    }
    else if ("ýÿ".Contains(s))
    {
        return "y";
    }
    else if ("ğĝ".Contains(s))
    {
        return "g";
    }
    else if (c == 'ř')
    {
        return "r";
    }
    else if (c == 'ł')
    {
        return "l";
    }
    else if (c == 'đ')
    {
        return "d";
    }
    else if (c == 'ß')
    {
        return "ss";
    }
    else if (c == 'þ')
    {
        return "th";
    }
    else if (c == 'ĥ')
    {
        return "h";
    }
    else if (c == 'ĵ')
    {
        return "j";
    }
    else
    {
        return "";
    }
}
}

The function would work something like this:

const string text = "ICH MUß EINIGE CRÈME BRÛLÉE HABEN";
Console.WriteLine(StringHelper.UrlFriendly(text));
// Output: 
// ich-muss-einige-creme-brulee-haben
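
The normalization and diacritic-removal steps can also be sketched in Python with the standard `unicodedata` module. This is an illustrative sketch, not a port of the C# above: the name `url_friendly` and the small `EXTRA_MAP` table are assumptions, and the table covers only a few letters that do not decompose under NFD.

```python
import re
import unicodedata

# Hypothetical remap table for letters that NFD leaves as a single code point.
EXTRA_MAP = {"ß": "ss", "ø": "o", "ł": "l", "đ": "d", "þ": "th"}


def url_friendly(text: str) -> str:
    lowered = text.lower()
    # Remap the handful of letters that have no base-letter decomposition.
    remapped = "".join(EXTRA_MAP.get(ch, ch) for ch in lowered)
    # NFD splits "è" into "e" plus a combining grave accent.
    decomposed = unicodedata.normalize("NFD", remapped)
    # Dropping the combining marks leaves only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Collapse everything that is not a-z or 0-9 into single hyphens.
    return re.sub(r"[^a-z0-9]+", "-", stripped).strip("-")


print(url_friendly("ICH MUß EINIGE CRÈME BRÛLÉE HABEN"))
# ich-muss-einige-creme-brulee-haben
```

Letters outside the table that do not decompose are treated as separators by the final regex, so a real implementation would need a fuller mapping.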

This question has already been answered many times here, but none of the answers was optimized.
The entire source code, with some samples, is available on GitHub.
You can read more on Johan Boström's blog. The code is compatible with .NET 4.5+ and .NET Core.

相权↑美人 2024-07-19 02:06:13

A slightly cleaner way of doing this, at least in PHP, is:

function CleanForUrl($urlPart, $maxLength = null) {
    $url = strtolower(preg_replace(array('/[^a-z0-9\- ]/i', '/[ \-]+/'), array('', '-'), trim($urlPart)));
    if ($maxLength) $url = substr($url, 0, $maxLength);
    return $url;
}

You might as well do the trim() at the start so there is less to process later; the full replacement is then done within the single preg_replace() call.

Thanks to cg for coming up with most of this: What is the best way to clean a string for placement in a URL, like the question name on SO?

丶情人眼里出诗心の 2024-07-19 02:06:13

Another season, another reason, for choosing Ruby :)

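def seo_friendly(str)
  # Collapse each run of non-word characters into one dash, then drop
  # leading/trailing dashes (e.g. the one produced by a final "?").
  str.downcase.gsub(/\W+/, '-').gsub(/\A-+|-+\z/, '')
end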

That's all.

三月梨花 2024-07-19 02:06:13

In Python (if Django is installed, even if you are using another framework):

from django.template.defaultfilters import slugify
# Single quotes outside, so the inner double quotes do not end the string.
slugify('In C#: How do I add "Quotes" around string in a comma delimited list of strings?')