Replacing tabs and spaces with a single space, and carriage returns and newlines with a single newline

Posted on 2024-11-15 19:27:25

$string = "My    text       has so    much   whitespace    




Plenty of    spaces  and            tabs";

echo preg_replace("/\s\s+/", " ", $string);

I read PHP's documentation and followed the preg_replace() tutorial; however, this code produces:

My text has so much whitespace Plenty of spaces and tabs

How can I turn it into:

My text has so much whitespace    
Plenty of spaces and tabs

Comments (11)

美男兮 2024-11-22 19:27:25

First, I'd like to point out that new lines can be either \r, \n, or \r\n depending on the operating system.

My solution:

echo preg_replace('/[ \t]+/', ' ', preg_replace('/[\r\n]+/', "\n", $string));

Which could be separated into 2 lines if necessary:

$string = preg_replace('/[\r\n]+/', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);

Update:

An even better solution would be this one:

echo preg_replace('/[ \t]+/', ' ', preg_replace('/\s*$^\s*/m', "\n", $string));

Or:

$string = preg_replace('/\s*$^\s*/m', "\n", $string);
echo preg_replace('/[ \t]+/', ' ', $string);

I've changed the regular expression that collapses multiple line breaks into a single one. It uses the "m" modifier (which makes ^ and $ match the start and end of each line) and removes any \s characters (space, tab, carriage return, line feed) at the end of one line and the beginning of the next. This solves the problem of empty lines that contain nothing but spaces. With my previous example, a line filled only with spaces would have left an extra line break in the output.
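
Editor's sketch of the same two-step idea, written with explicit character classes instead of the `$^` anchors. The function name `normalize_whitespace` is hypothetical, not from the answer. Unlike the answer's pattern, this variant also drops spaces and tabs sitting directly before a line break, so treat it as one possible reading of the requirement rather than a drop-in equivalent.

```php
<?php
// Hypothetical helper; the name is an assumption made for illustration.
function normalize_whitespace(string $text): string
{
    // Step 1: any whitespace run containing at least one line break
    // (\r or \n) is replaced by a single "\n". This also swallows
    // lines that hold nothing but spaces or tabs.
    $text = preg_replace('/[ \t]*[\r\n][ \t\r\n]*/', "\n", $text);

    // Step 2: the remaining runs of spaces and tabs become one space.
    return preg_replace('/[ \t]+/', ' ', $text);
}

echo normalize_whitespace("My    text \t has    \n\n  \r\n\nPlenty   of    spaces");
// Prints:
// My text has
// Plenty of spaces
```

Doing the line breaks first matters: if the space/tab pass ran first, a whitespace-only line would shrink to a single space but still separate two line breaks.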

残疾 2024-11-22 19:27:25

Edited the right answer. From PHP 5.2.4 or so, the following code will do:

echo preg_replace('/\v(?:[\v\h]+)/', '', $string);
池木 2024-11-22 19:27:25

Replace Multiple Newline, Tab, Space

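$text = preg_replace("/[\r\n]+/", "\n", $text);
// Match only spaces and tabs here: \s would also match the "\n" produced above and turn it into a space
$text = preg_replace("/[ \t]+/", ' ', $text);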

Tested :)

牵强ㄟ 2024-11-22 19:27:25
// Newlines and tabs to a single space

$from_mysql = str_replace(array("\r\n", "\r", "\n", "\t"), ' ', $from_mysql);


// Multiple spaces to a single space (using a regular expression)
// ereg_replace() was removed in PHP 7, so preg_replace() is used instead.

$from_mysql = preg_replace('/ {2,}/', ' ', $from_mysql);

// Replaces 2 or more spaces with a single space; {2,} means "2 or more" of the preceding character.
任性一次 2024-11-22 19:27:25

This would completely minify the entire string (such as a large blog article) while keeping all HTML tags in place.

$email_body = str_replace(PHP_EOL, ' ', $email_body);
    //PHP_EOL = PHP_End_Of_Line - would remove new lines too
$email_body = preg_replace('/[\r\n]+/', "\n", $email_body);
$email_body = preg_replace('/[ \t]+/', ' ', $email_body);
陈甜 2024-11-22 19:27:25

Alternative approach:

echo preg_replace_callback("/\s+/", function ($match) {
    $result = array();
    $prev = null;
    foreach (str_split($match[0], 1) as $char) {
        if ($prev === null || $char != $prev) {
            $result[] = $char;
        }

        $prev = $char;
    }

    return implode('', $result);
}, $string);

Output:

My text has so much whitespace
Plenty of spaces and tabs

Edit: Re-added this because it is a different approach. It's probably not what was asked for, but at least it does not merge groups of different whitespace (e.g. space, tab, tab, space, nl, nl, space, space would become space, tab, space, nl, space).
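
Editor's note: the same "collapse only repeats of the identical character" behaviour can probably be had without a callback by using a backreference. This is a sketch, not the answerer's code. It reuses the sequence from the example above.

```php
<?php
// (\s) captures one whitespace character; \1+ then requires one or more
// repeats of that exact same character. The run is replaced by one copy.
$input  = "a \t\t \n\n  b"; // space, tab, tab, space, nl, nl, space, space
$output = preg_replace('/(\s)\1+/', '$1', $input);

var_dump($output === "a \t \n b"); // bool(true): space, tab, space, nl, space
```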

烟─花易冷 2024-11-22 19:27:25

Had the same problem when passing echoed data from PHP to JavaScript (formatted as JSON). The string was peppered with useless \r\n and \t characters that are neither required nor displayed on the page.

The solution I ended up using is another way of echoing. That saves a lot of server resources compared to preg_replace (as suggested by other people here).


Here are the before and after in comparison:

Before:

echo '
<div>

    Example
    Example

</div>
';

Output:

<div>\r\n\r\n\tExample\r\n\tExample\r\n\r\n</div>


After:

echo 
'<div>',

    'Example',
    'Example',

'</div>';

Output:

<div>ExampleExample</div>


(Yes, echo accepts several arguments separated by commas, so you are not limited to concatenating with dots.)
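
Editor's sketch to check the claim: output buffering captures what the comma form of echo emits, so the result can be inspected. The variable name `$out` is just for illustration.

```php
<?php
ob_start(); // capture everything echoed until ob_get_clean()
echo
'<div>',

    'Example',
    'Example',

'</div>';
$out = ob_get_clean();

// The whitespace between the arguments is source formatting only;
// none of it reaches the output.
var_dump($out); // string(25) "<div>ExampleExample</div>"
```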

霊感 2024-11-22 19:27:25

Why are you doing this?
HTML displays only one space even if you use more than one...

For example:

<i>test               content 1       2 3 4            5</i>

The output will be:
test content 1 2 3 4 5

If you need more than a single space in HTML, you have to use &nbsp;

同尘 2024-11-22 19:27:25

Try with:

$string = "My    text       has so    much   whitespace    




Plenty of    spaces  and            tabs";
// Remove duplicate newlines (+ rather than *, so empty matches do not insert extra newlines)
$string = preg_replace("/[\n]+/", "\n", $string);
// Preserve newlines while replacing the other whitespace with a single space
echo preg_replace("/[ \t]+/", " ", $string);
冧九 2024-11-22 19:27:25

This task requires that consecutive spaces and tabs ("horizontal whitespace" -- \h) be replaced with a single literal space and that consecutive carriage returns and newlines ("vertical whitespace" -- \v) be replaced with a newline. To ensure that the appropriate newline character sequence is used on your own system, use PHP_EOL.

It makes no sense to match as few as zero of an occurrence (with *) because you would potentially be adding a whitespace character where there was previously no whitespace character. For this reason, patterns for this task should only be using the + (one or more) quantifier.
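
Editor's sketch of the quantifier point, using a string with no whitespace at all so the effect of zero-length matches is easy to see:

```php
<?php
$text = 'ab'; // no spaces, no tabs

// With *, the pattern also matches the empty string at every position,
// and each empty match is replaced by a space.
$withStar = preg_replace('/[ \t]*/', ' ', $text);

// With +, at least one space or tab is required, so nothing matches.
$withPlus = preg_replace('/[ \t]+/', ' ', $text);

var_dump($withStar); // string(5) " a b "
var_dump($withPlus); // string(2) "ab"
```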

If there is any chance of any kind of whitespaces occurring at the start or end of the string, then don't bother removing them with regex, just use trim().

In this context, \R would provide the same outcome as \v, but \R goes a bit farther and is more complex (perhaps unnecessarily). This is an informative read: https://www.npopov.com/2011/12/10/PCRE-and-newlines.html#meet-r
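
Editor's sketch of where \R and \v differ: \R treats a Windows "\r\n" pair as one unit, while \v matches the two characters one at a time. Once the + quantifier is applied, as this task requires, both give the same result. The `|` replacement is only there to make the matches visible.

```php
<?php
$crlf = "a\r\nb";

$vSingle = preg_replace('/\v/', '|', $crlf);  // \r and \n matched separately
$rSingle = preg_replace('/\R/', '|', $crlf);  // \r\n matched as one unit
$vPlus   = preg_replace('/\v+/', '|', $crlf);
$rPlus   = preg_replace('/\R+/', '|', $crlf);

var_dump($vSingle); // string(4) "a||b"
var_dump($rSingle); // string(3) "a|b"
var_dump($vPlus);   // string(3) "a|b"
var_dump($rPlus);   // string(3) "a|b"
```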

Code: (Demo)

$string = "
My    text       has so    much   whitespace    




Plenty of    spaces  and            tabs  ";
var_export(
    preg_replace(
        ['/\h+/', '/\v+/'],
        [' ',     PHP_EOL],
        trim($string)
    )
);

Output:

'My text has so much whitespace 
Plenty of spaces and tabs'
倾`听者〃 2024-11-22 19:27:25

Not sure if this will be useful, nor am I absolutely positive it works as it should, but it seems to be working for me.

A function that clears multiple spaces and anything else you do or don't want, and produces either a single-line string or a multi-line string (depending on the arguments/options passed). It can also remove or keep characters from other languages and convert newlines and tabs to spaces.

/** ¯\_(ツ)_/¯ Hope it's useful to someone. **/
// If $multiLine is null this removes spaces too. <options>'[:emoji:]' with $l = true allows only known emoji.
// <options>'[:print:]' with $l = true allows all utf8 printable chars (including emoji).
// **** TODO: If a unicode emoji or language char is used in $options while $l = false; we get an odd � symbol replacement for any non-matching char. $options char seems to get through, regardless of $l = false ? (bug (?)interesting)
function alphaNumericMagic($value, $options = '', $l = false, $multiLine = false, $tabSpaces = "    ") {
    $utf8Emojis = '';
    $patterns = [];
    $replacements = [];
    if ($l && preg_match("~(\[\:emoji\:\])~", $options)) {
        $utf8Emojis = [
            '\x{1F600}-\x{1F64F}', /* Emoticons */
            '\x{1F9D0}-\x{1F9E6}',
            '\x{1F300}-\x{1F5FF}', /* Misc Characters */ // \x{1F9D0}-\x{1F9E6}
            '\x{1F680}-\x{1F6FF}', /* Transport and Map */
            '\x{1F1E0}-\x{1F1FF}' /* Flags (iOS) */
        ];
        $utf8Emojis = implode('', $utf8Emojis);
    }
    $options = str_replace("[:emoji:]", $utf8Emojis, $options);
    if (!preg_match("~(\[\:graph\:\]|\[\:print\:\]|\[\:punct\:\]|\\\-)~", $options)) {
        $value = str_replace("-", ' ', $value);
    }
    if ($l) {
        $l = 'u';
        $options = $options . '\p{L}\p{N}\p{Pd}';
    } else { $l = ''; }
    if (preg_match("~(\[\:print\:\])~", $options)) {
        $patterns[] = "/[ ]+/m";
        $replacements[] = " ";
    }
    if ($multiLine) {
        $patterns[] = "/(?<!^)(?:[^\r\na-z0-9][\t]+)/m";
        $patterns[] = "/[ ]+(?![a-z0-9$options])|[^a-z0-9$options\s]/im$l";
        $patterns[] = "/\t/m";
        $patterns[] = "/(?<!^)$tabSpaces/m";
        $replacements[] = " ";
        $replacements[] = "";
        $replacements[] = $tabSpaces;
        $replacements[] = " ";
    } else if ($multiLine === null) {
        $patterns[] = "/[\r\n\t]+/m";
        $patterns[] = "/[^a-z0-9$options]/im$l";
        $replacements = "";
    } else {
        $patterns[] = "/[\r\n\t]+/m";
        $patterns[] = "/[ ]+(?![a-z0-9$options\t])|[^a-z0-9$options ]/im$l";
        $replacements[] = " ";
        $replacements[] = "";
    }
    echo "\n";
    print_r($patterns);
    echo "\n";
    echo $l;
    echo "\n";
    return preg_replace($patterns, $replacements, $value);
}

Example usage:

header('Content-Type: text/html; charset=utf-8', true); // header() returns nothing, so there is nothing to echo
$string = "fjl!sj\nfl _  sfjs-lkjf\r\n\tskj 婦女與環境健康 fsl \tklkj\thl jhj ⚧???? lkj ⸀ skjfl gwo lsjowgtfls s";
echo "<textarea style='width:100%; height:100%;'>";
echo alphaNumericMagic($string, '⚧', true, null);
echo "\n\nAND\n\n";
echo alphaNumericMagic($string, '[:print:]', true, true);
echo "</textarea>";

Results in:

fjlsjflsfjslkjfskj婦女與環境健康fslklkjhljhj⚧lkjskjflgwolsjowgtflss

AND

fjl!sj
fl _ sfjs-lkjf
    skj 婦女與環境健康 fsl klkj hl jhj ⚧???? lkj ⸀ skjfl gwo lsjowgtfls s