PHP preg_replace_callback: extract specific values for later reinsertion

Posted on 2024-12-08 03:57:36

For the sake of brevity...
I want to take items out of a string, put them into a separate array, replace the values extracted from the string with ID'd tokens, parse the string, then put the extracted items back in their original positions (in the correct order).
(If that makes sense, then skip the rest :D)

I have the following string;
"my sentence contains URLs to [url] and [url] which makes my life difficult."

For various reasons, I would like to remove the URLs.
But I need to keep their place, and reinsert them later (after manipulating the rest of the string).

Thus I would like;
"my sentence contains URLs to [url] and [url] which makes my life difficult."
to become;
"my sentence contains URLs to [token1fortheURL] and [token2fortheURL] which makes my life difficult."

I've tried doing this several times, in various ways.
All I do is hit brick walls and invent new swear words!

I use the following code for setup;

$mystring = 'my sentence contains URLs to http://www.google.com/this.html and http://www.yahoo.com which makes my life difficult.';
$myregex = '/(((?:https?|ftps?)\:\/\/)?([a-zA-Z0-9:]*[@])?([a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}|([0-9]+))([a-zA-Z0-9-._?,\'\/\+&%\$#\=~:]+)?)/';
$myextractions = array();

I then do a preg_replace_callback;

$matches = preg_replace_callback($myregex,'myfunction',$mystring);

And I have my function as follows;

function myfunction ($matches) {}

And it's here that the brick walls start appearing.
I can put stuff into the blank extraction array, but it is not available outside the function. I can update the string with tokens, but I lose access to the URLs that are replaced.
I cannot seem to pass additional values to the function called by preg_replace_callback.

I'm hoping someone can help, as this is driving me nuts.


UPDATE:

Based on the solution suggested by @Lepidosteus,
I think I have the following working?

$mystring = 'my sentence contains URLs to http://www.google.com/this.html and http://www.yahoo.com which makes my life difficult.';
$myregex = '/(((?:https?|ftps?)\:\/\/)?([a-zA-Z0-9:]*[@])?([a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}|([0-9]+))([a-zA-Z0-9-._?,\'\/\+&%\$#\=~:]+)?)/';
$tokenstart = ":URL:";
$tokenend = ":";


function extraction($myregex, $mystring, $tokenstart, $tokenend) {
    // collect every full-pattern match (index 0 of the result set)
    preg_match_all($myregex, $mystring, $mymatches);
    $thematches = array();

    // pair each matched URL with its token, e.g. :URL:0:
    foreach ($mymatches[0] as $key => $match) {
        $thematches[] = array($match, $tokenstart.$key.$tokenend);
    }

    return $thematches;
}
$matches = extraction($myregex, $mystring, $tokenstart, $tokenend);
echo "1) ".$mystring."<br/>";
// 1) my sentence contains URLs to http://www.google.com/this.html and http://www.yahoo.com which makes my life difficult.



function substitute($matches,$mystring) {
foreach ($matches as $match) {
    $mystring = str_replace($match[0], $match[1], $mystring);
}
return $mystring;
}
$mystring = substitute($matches,$mystring);
echo "2) ".$mystring."<br/>";
// 2) my sentence contains URLs to :URL:0: and :URL:1: which makes my life difficult.


function reinsert($matches,$mystring) {
foreach ($matches as $match) {
    $mystring = str_replace($match[1], $match[0], $mystring);
}
return $mystring;
}
$mystring = reinsert($matches,$mystring);
echo "3) ".$mystring."<br/>";
// 3) my sentence contains URLs to http://www.google.com/this.html and http://www.yahoo.com which makes my life difficult.

That appears to work?
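
One edge case may be worth checking with the looped str_replace approach: if one URL is a prefix of another, the shorter one also gets replaced inside the longer one. The sketch below uses a made-up input and hand-written URL/token pairs (both are assumptions for illustration, not output of the regex above) and compares the loop with strtr, which takes an array, works in a single pass and tries the longest keys first.

```php
<?php
// Hypothetical input: the first URL is a prefix of the second
$mystring = 'see http://www.yahoo.com and http://www.yahoo.com/news today';
$pairs = array(
    array('http://www.yahoo.com', ':URL:0:'),
    array('http://www.yahoo.com/news', ':URL:1:'),
);

// str_replace in a loop, as in substitute(): the shorter URL also
// matches inside the longer one, so :URL:1: is never used
$looped = $mystring;
foreach ($pairs as $pair) {
    $looped = str_replace($pair[0], $pair[1], $looped);
}
echo $looped, "\n"; // see :URL:0: and :URL:0:/news today

// strtr with an array replaces in one pass, longest keys first
$map = array();
foreach ($pairs as $pair) {
    $map[$pair[0]] = $pair[1];
}
$single_pass = strtr($mystring, $map);
echo $single_pass, "\n"; // see :URL:0: and :URL:1: today
```

Reinserting still restores the original string in both cases, so this only matters if the code that runs between substitute() and reinsert() could touch the leftover "/news" fragment.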


Comments (1)

殊姿 2024-12-15 03:57:36

The key to solving your problem here is to store the list of URLs in an outside container that can be accessed both by your callbacks and by your main code, so you can make the changes you need to them. To remember the URLs' positions, we will use a custom token in the string.

Note that I use closures to access the container. If you can't use PHP 5.3 for some reason, you will need to replace them with another way to access the $url_tokens container from within the callback, which shouldn't be a problem.
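
As a rough sketch of one such alternative (not part of the original answer), the callback can be a method on an object that owns the token list, since array($object, 'method') is accepted as a callback without closures. The class name UrlTokenizer, its method names and the simplified pattern are all assumptions made for this illustration; the pattern from the question can be substituted.

```php
<?php
// Hypothetical helper class: it owns the token list, so the callback
// needs neither closures nor global variables.
class UrlTokenizer
{
    public $tokens = array();

    // Callback for preg_replace_callback: store the match, return its token
    public function store($matches)
    {
        $index = count($this->tokens);       // next free index
        $this->tokens[$index] = $matches[0]; // keep the full match
        return '[[URL::' . $index . ']]';
    }

    public function tokenize($pattern, $string)
    {
        // array($this, 'store') is a plain method callback
        return preg_replace_callback($pattern, array($this, 'store'), $string);
    }
}

// Simplified pattern for illustration only (an assumption)
$pattern = '~https?://\S+~';

$tokenizer = new UrlTokenizer();
$result = $tokenizer->tokenize($pattern, 'see http://www.google.com/ and http://www.yahoo.com now');

echo $result, "\n"; // see [[URL::0]] and [[URL::1]] now
print_r($tokenizer->tokens);
```

The closure-based version follows.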

<?php
// the string you start with

$string = "my sentence contains URLs to http://stackoverflow.com/questions/7619843/php-preg-replace-call-extract-specific-values-for-later-reinsertion and http://www.google.com/ which makes my life difficult.";

// the url container, you will store the urls found here

$url_tokens = array();

// the callback for the first replace: it takes each url, stores it in $url_tokens, then replaces it with [[URL::X]], X being a unique number for each url
//
// note that the closure uses $url_tokens by reference, so that we can add entries to it from inside the function

$callback = function ($matches) use (&$url_tokens) {
  static $token_iteration = 0;

  $token = '[[URL::'.$token_iteration.']]';

  $url_tokens[$token_iteration] = $matches;

  $token_iteration++;

  return $token;
};

// replace our urls with our callback

$pattern = '/(((?:https?|ftps?)\:\/\/)?([a-zA-Z0-9:]*[@])?([a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}|([0-9]+))([a-zA-Z0-9-._?,\'\/\+&%\$#\=~:]+)?)/';

$string = preg_replace_callback($pattern, $callback, $string);

// some debug code to check what we have at this point

var_dump($url_tokens);
var_dump($string);

// you can do changes to the url you found in $url_tokens here

// now we will replace our previous tokens with a specific string, just as an example of how to re-replace them when you're done

$callback_2 = function ($matches) use ($url_tokens) {
  $token = $matches[0];
  $token_iteration = $matches[1];

  if (!isset($url_tokens[$token_iteration])) {
    // if we don't know what this token is, leave it untouched
    return $token;
  }

  return '- there was an url to '.$url_tokens[$token_iteration][4].' here -';
};

$string = preg_replace_callback('/\[\[URL::([0-9]+)\]\]/', $callback_2, $string);

var_dump($string);

Which gives this result when executed:

// the $url_tokens array after the first preg_replace_callback
array(2) {
  [0]=>
  array(7) {
    [0]=>
    string(110) "http://stackoverflow.com/questions/7619843/php-preg-replace-call-extract-specific-values-for-later-reinsertion"
    [1]=>
    string(110) "http://stackoverflow.com/questions/7619843/php-preg-replace-call-extract-specific-values-for-later-reinsertion"
    [2]=>
    string(7) "http://"
    [3]=>
    string(0) ""
    [4]=>
    string(17) "stackoverflow.com"
    [5]=>
    string(0) ""
    [6]=>
    string(86) "/questions/7619843/php-preg-replace-call-extract-specific-values-for-later-reinsertion"
  }
  [1]=>
  array(7) {
    [0]=>
    string(22) "http://www.google.com/"
    [1]=>
    string(22) "http://www.google.com/"
    [2]=>
    string(7) "http://"
    [3]=>
    string(0) ""
    [4]=>
    string(14) "www.google.com"
    [5]=>
    string(0) ""
    [6]=>
    string(1) "/"
  }
}
// the $string after the first preg_replace_callback
string(85) "my sentence contains URLs to [[URL::0]] and [[URL::1]] which makes my life difficult."

// the $string after the second replace
string(154) "my sentence contains URLs to - there was an url to stackoverflow.com here - and - there was an url to www.google.com here - which makes my life difficult."
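
To put the original URLs back instead of a placeholder message, the second callback can return the stored text. Here is a minimal sketch of the full round trip, assuming a simplified pattern and strtoupper as a stand-in for "manipulating the rest of the string" (both are assumptions for illustration). It stores only the full match and uses count($url_tokens) for the index rather than a static counter.

```php
<?php
$string = 'my sentence contains URLs to http://www.google.com/this.html and http://www.yahoo.com which makes my life difficult.';
// Simplified pattern for illustration only (an assumption)
$pattern = '~https?://\S+~';

$url_tokens = array();

// Step 1: swap each URL for a token and remember the original text
$string = preg_replace_callback($pattern, function ($matches) use (&$url_tokens) {
    $index = count($url_tokens);       // next free index
    $url_tokens[$index] = $matches[0]; // keep only the full match
    return '[[URL::' . $index . ']]';
}, $string);

// Step 2: manipulate the rest of the string; the tokens are already
// upper case, so they survive this particular change untouched
$string = strtoupper($string);

// Step 3: put the original URLs back
$string = preg_replace_callback('/\[\[URL::([0-9]+)\]\]/', function ($matches) use ($url_tokens) {
    $index = (int) $matches[1];
    // leave unknown tokens untouched
    return isset($url_tokens[$index]) ? $url_tokens[$index] : $matches[0];
}, $string);

echo $string, "\n";
// MY SENTENCE CONTAINS URLS TO http://www.google.com/this.html AND http://www.yahoo.com WHICH MAKES MY LIFE DIFFICULT.
```

The token format has to be something the step 2 processing will not alter, so it is worth choosing it with that processing in mind.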