Split a string into an array of words without breaking phrases wrapped in double quotes

Posted on 2024-08-15 14:58:18 | Length: 123 | Views: 12 | Comments: 0

I want to let the user type in tags: windows linux "mac os x"

and then split them on whitespace, while still recognizing "mac os x" as a single tag.

Is it possible to combine the explode function with other functions to do this?


Comments (6)

倦话 2024-08-22 14:58:18

I would ask the user to enter the tags separated by commas and explode on the comma delimiter:

$string = "windows, linux, mac os x";
// explode() keeps the space after each comma, so trim every piece.
$pieces = array_map('trim', explode(',', $string));

This is the way most tag systems work anyway.

Otherwise you'll need to construct a parser, because explode cannot cope with what you want. Regex is overkill, in my opinion.
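The parser mentioned above does not have to be large. Here is a minimal sketch, assuming quotes cannot be escaped or nested and that an unclosed quote simply runs to the end of the input. The function name `parse_tags` is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: split on whitespace, keeping double-quoted phrases whole.
function parse_tags(string $input): array
{
    $tags = [];
    $current = '';
    $inQuotes = false;

    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($char === '"') {
            // A quote toggles quoted mode and is never part of the tag itself.
            $inQuotes = !$inQuotes;
        } elseif (!$inQuotes && strpos(" \t\r\n", $char) !== false) {
            // Unquoted whitespace ends the current tag, if one was started.
            if ($current !== '') {
                $tags[] = $current;
                $current = '';
            }
        } else {
            $current .= $char;
        }
    }
    // Flush the last tag when the input does not end in whitespace.
    if ($current !== '') {
        $tags[] = $current;
    }
    return $tags;
}

var_export(parse_tags('windows linux "mac os x"'));
```

Walking the string one byte at a time is safe for UTF-8 input here, because the quote and the ASCII whitespace characters never occur inside a multi-byte sequence.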

无言温柔 2024-08-22 14:58:18

As long as there can't be quotes within quotes (e.g. "foo\"bar" isn't allowed), you can do this with a regular expression. Otherwise you need a full parser.

This should do it:

function split_words($input) {
  $matches = array();
  // Group 2 captures the contents of a quoted phrase, group 3 an unquoted word.
  if (preg_match_all('/("([^"]+)")|(\w+)/', $input, $reg)) {
    for ($ii = 0, $cc = count($reg[0]); $ii < $cc; ++$ii) {
      // Compare against '' so that a quoted "0" is not treated as empty.
      $matches[] = $reg[2][$ii] !== '' ? $reg[2][$ii] : $reg[3][$ii];
    }
  }
  return $matches;
}

Usage:

$input = 'windows linux "mac os x"';
var_dump(split_words($input));
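
One caveat with the pattern above: `\w+` matches only letters, digits and underscores, so an unquoted tag such as `c++` or `node.js` would be truncated or split. Below is a hedged variant, assuming any run of characters that are neither whitespace nor quotes should count as one tag. The name `split_tags` is hypothetical.

```php
<?php
// Hypothetical variant of split_words() that accepts punctuation in unquoted tags.
function split_tags(string $input): array
{
    // Alternative 1: a quoted phrase, captured without its quotes (group 1).
    // Alternative 2: a run of characters that are neither whitespace nor quotes (group 2).
    preg_match_all('/"([^"]+)"|([^\s"]+)/', $input, $matches, PREG_SET_ORDER);

    $tags = [];
    foreach ($matches as $match) {
        // Group 2 is non-empty only when the unquoted alternative matched.
        $tags[] = ($match[2] ?? '') !== '' ? $match[2] : $match[1];
    }
    return $tags;
}

var_export(split_tags('c++ node.js "mac os x"'));
```

A stray, unpaired quote is skipped rather than reported, which may or may not be what you want for user input.
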
始终不够 2024-08-22 14:58:18

Either have the user separate their tag values with commas, as Elzo Valugi suggested, or improve your UI so that users enter one tag at a time (similar to the tagging UI in Google Wave or WordPress). I suggest the latter.

If you really want to stick with your proposed entry format (which I don't recommend), you could maintain a list of multi-word tags (those that aren't supposed to be split). Compare the combined tag string provided by the user against this list and make sure that you don't split those terms. If you're set on this method, I could go into more detail, but I don't think it's a good idea, as the entry format itself is flawed.
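
The answer does not spell out the list-based approach, so the following is only one possible reading of it: build a pattern from the known multi-word tags and fall back to whitespace-separated words for everything else. The function name `split_with_known_tags` and the sample list are hypothetical.

```php
<?php
// Hypothetical helper: keep known multi-word tags whole, split the rest on whitespace.
function split_with_known_tags(string $input, array $knownTags): array
{
    $alternatives = [];
    if ($knownTags !== []) {
        // Try longer phrases first so "mac os x" wins over a shorter "mac os".
        usort($knownTags, fn($a, $b) => strlen($b) <=> strlen($a));
        $quoted = array_map(fn($tag) => preg_quote($tag, '/'), $knownTags);
        // The lookarounds require whitespace (or the string edge) on both sides,
        // so "mac os x" does not match inside "mac os xylophone".
        $alternatives[] = '(?<!\S)(?:' . implode('|', $quoted) . ')(?!\S)';
    }
    // Fallback: any run of non-whitespace characters.
    $alternatives[] = '\S+';

    preg_match_all('/' . implode('|', $alternatives) . '/i', $input, $matches);
    return $matches[0];
}

var_export(split_with_known_tags('windows linux mac os x', ['mac os x']));
```

This sketch matches a known tag only when it is typed with exactly the same spacing, and any multi-word tag missing from the list is still split, which illustrates why the answer considers the format fragile.
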

人心善变 2024-08-22 14:58:18

You could use a regex. I'm not the best at writing them, but someone else here should be able to write one that matches the 'words', breaking on spaces that aren't inside quotes.

淡墨 2024-08-22 14:58:18

When the user enters the string "mac os x", you can detect the whitespace and change the string to "mac-os-x". You can then still explode it this way:

$os = "metasys solaris mac-os-x";
$strings = explode(' ', $os);

You can do this with a replace function such as str_replace().
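
The answer leaves open how the multi-word tag is detected. A minimal sketch, assuming the user marks such tags with double quotes as in the question, could hyphenate each quoted phrase before exploding. The helper name `hyphenate_quoted` is hypothetical.

```php
<?php
// Hypothetical helper: turn each "quoted phrase" into a hyphenated, unquoted word.
function hyphenate_quoted(string $input): string
{
    return preg_replace_callback(
        '/"([^"]+)"/',
        // $m[1] is the phrase without its quotes; replace inner spaces with hyphens.
        fn(array $m) => str_replace(' ', '-', trim($m[1])),
        $input
    );
}

$os = hyphenate_quoted('metasys solaris "mac os x"');
$strings = explode(' ', $os);

var_export($strings);
```

Note that the stored tag becomes `mac-os-x`, so the original spacing is lost unless you convert it back when displaying.
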

甜中书 2024-08-22 14:58:18

You are parsing a delimited string, and the delimiter in this case is a space.

PHP has str_getcsv(), which protects substrings wrapped in a particular character. The default wrapping character is a double quote (how convenient for you). If your input string were comma-delimited, you could omit the second parameter, because a comma is the default delimiter.

The double quotes are stripped from the values in the result array.

Code:

$string = 'windows linux "mac os x"';

var_export(
    str_getcsv($string, ' ')
);

Output:

array (
  0 => 'windows',
  1 => 'linux',
  2 => 'mac os x',
)
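
User input is rarely this tidy: consecutive, leading or trailing spaces yield empty fields when a space is the delimiter. Here is a hedged sketch that wraps str_getcsv() and drops those empties. The wrapper name `tags_from_input` is hypothetical, and passing an empty string as the escape argument assumes PHP 7.4 or later.

```php
<?php
// Hypothetical wrapper around str_getcsv() for space-separated tags.
function tags_from_input(string $input): array
{
    // Delimiter: space. Enclosure: double quote.
    // Escape: '' disables the escape character (assumes PHP 7.4 or later).
    $fields = str_getcsv(trim($input), ' ', '"', '');

    // Remove empty or null fields, then reindex from 0.
    return array_values(array_filter(
        $fields,
        fn($field) => $field !== null && $field !== ''
    ));
}

var_export(tags_from_input('windows  linux "mac os x" '));
```

Passing the enclosure and escape arguments explicitly also keeps the call independent of their default values.
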