PHP: extracting a value from a string

Posted 2024-12-25 19:08:47, 804 characters, 2 views, 0 comments


I'm processing records in PHP and was wondering if there is an efficient method to pull out the genre: values from each of the following records. genre: can be anywhere in the string.

In the following string I need to pull out the word "alternative" (last word)

[media:keywords] => upc:00602527365589,Records,mercury,artist:Neon 
 Trees,Alternative,trees,neon,genre:alternative

In the following string I need to pull out "Latin / Pop,latino,Pop"

[media:keywords] => genre:Latin / Pop,latino,Pop,upc:00602527341217,artist:Luis 
 Fonsi,luis,universal,Fonsi,Latin

In the following record I need to pull out "other"

[media:keywords] => upc:793018101530,andy,razor,Other,tie,genre:other,artist:Andy 
McKee,McKee,&

In the following record I need to pull out "rock,flotsam,jetsam"

[media:keywords] => and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam 
And Jetsam,rock,geffen

I'm pulling my hair out on this (what is left anyway).


少女的英雄梦 2025-01-01 19:08:47


Use the following regular expression coupled with preg_match():

~\bgenre:(.+?)(?=(,[^:,]+:|$))~

Your desired result will be at index 1 of the matches array (parameter 3), which holds the first capture group; index 0 holds the full match, including the genre: prefix.
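A minimal sketch of how that pattern might be wrapped for reuse. The helper name extractGenre is hypothetical (it is not part of the answer), and the sketch assumes each record's keywords sit on a single line:

```php
<?php
// Hypothetical helper: returns the text after "genre:" up to the next
// ",key:" pair or the end of the string, or null when no genre is present.
function extractGenre(string $keywords): ?string
{
    if (preg_match('~\bgenre:(.+?)(?=(,[^:,]+:|$))~', $keywords, $matches) === 1) {
        return $matches[1]; // capture group 1 holds the genre value(s)
    }
    return null;
}

// Two of the records from the question, joined onto one line each.
$records = [
    'upc:00602527365589,Records,mercury,artist:Neon Trees,Alternative,trees,neon,genre:alternative',
    'and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam And Jetsam,rock,geffen',
];

foreach ($records as $record) {
    echo extractGenre($record), "\n";
}
```

Returning null for a record without a genre lets the caller tell "no genre" apart from an empty one.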


任性一次 2025-01-01 19:08:47

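I would use strpos to find where the genre starts. The only problem you will have is where to end it, since there is no delimiter. I would use the other known keywords, such as "upc", "artist", etc., to check whether the string needs to be cut off at the end.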
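A rough sketch of that idea, under stated assumptions: the helper name genreByStrpos is hypothetical, and the list of known keys ("upc", "artist") is only a guess based on the sample records, so it would need extending if other keys can occur.

```php
<?php
// Hypothetical helper sketching the strpos idea: find "genre:", then cut the
// value at the earliest following known key. The key list is an assumption.
function genreByStrpos(string $keywords, array $knownKeys = ['upc', 'artist']): ?string
{
    $start = strpos($keywords, 'genre:');
    if ($start === false) {
        return null; // no genre in this record
    }
    $start += strlen('genre:');

    // Default end is the end of the string; move it forward to the
    // earliest ",key:" that appears after the genre value.
    $end = strlen($keywords);
    foreach ($knownKeys as $key) {
        $pos = strpos($keywords, ',' . $key . ':', $start);
        if ($pos !== false && $pos < $end) {
            $end = $pos;
        }
    }
    return substr($keywords, $start, $end - $start);
}

echo genreByStrpos('upc:793018101530,andy,razor,Other,tie,genre:other,artist:Andy McKee,McKee,&'), "\n";
```

Unlike a regex, this only stops at keys it has been told about, so an unlisted key after genre: would end up inside the result.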


二手情话 2025-01-01 19:08:47


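You can indeed use a bit of pattern detection. You are always looking for the fixed genre: followed by one or more words or phrases, none of which may itself contain a colon.

So this might suffice:

// The outer group captures the whole list; the inner repeated group on its
// own would keep only its last iteration (e.g. ",jetsam").
preg_match('~\bgenre:((?:,?[^:,]+(?=,|$))+)~', $media_keywords, $match);
print $match[1];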


地狱即天堂 2025-01-01 19:08:47


$mystring = 'abc';
$findme   = 'a';
$pos = strpos($mystring, $findme);

// Note our use of ===.  Simply == would not work as expected
// because the position of 'a' was the 0th (first) character.
if ($pos === false) {
    echo "The string '$findme' was not found in the string '$mystring'";
} else {
    echo "The string '$findme' was found in the string '$mystring'";
    echo " and exists at position $pos";
}

From the PHP Documentation for strpos

So you can just use $findme = "alternative"


瞄了个咪的 2025-01-01 19:08:47

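The problem with parsing this string is that it has no proper delimiters or quoting: a comma separates fields but can also appear inside a field. It is the same problem as a CSV file without quotes.

If performance is not critical for you, I would suggest parsing it in a more bulletproof way: make some assumptions about what counts as a key (artist, genre, upc, etc.) and introduce a proper delimiter. Proof-of-concept code (I have left the echo statements in so you can see what is happening):

$string = "genre:Latin / Pop,latino,Pop,upc:00602527341217,artist:Luis Fonsi,luis,universal,Fonsi,Latin";
// Introduce a delimiter in front of every "key:" token
$delimiter = '|';
$withDelimiter = preg_replace('/([a-z]+):/', $delimiter . '$0', $string);
echo $withDelimiter . "\n";

$genre = null; // stays null when the record has no genre
$fields = explode($delimiter, $withDelimiter);
foreach ($fields as $field) {
    // Skip empty pieces and leading text that has no "key:" part
    if (strpos($field, ':') !== false) {
        echo $field . "\n";

        // Limit to 2 parts so a colon inside the value is preserved
        list($key, $valueWithPossiblyTrailingComma) = explode(':', $field, 2);

        if ($key === 'genre') {
            $genre = rtrim($valueWithPossiblyTrailingComma, ',');
            break;
        }
    }
}
echo $genre;

You can make this work in nearly all cases, and it lets you look up any key, not only genre, but its performance will be low.

I have made the following assumptions about your string:

It is a list of key => value pairs, with key and value separated by a colon and the pairs joined by commas.
Keys may contain only [a-z] characters.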
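Under the same two assumptions, the whole record could also be split into key => value pairs in one pass. The sketch below swaps the delimiter-insertion step for a single preg_match_all call; the helper name parseKeywords is hypothetical and not part of the answer.

```php
<?php
// Hypothetical helper: split a keyword string into key => value pairs,
// assuming keys are lowercase [a-z]+ immediately followed by a colon.
function parseKeywords(string $keywords): array
{
    $result = [];
    // Each match is "key:" plus everything up to the next ",key:" or the end.
    preg_match_all('~(?:^|,)([a-z]+):(.*?)(?=,[a-z]+:|$)~', $keywords, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $result[$m[1]] = $m[2]; // $m[1] is the key, $m[2] its raw value
    }
    return $result;
}

$parsed = parseKeywords('and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam And Jetsam,rock,geffen');
echo $parsed['genre'] ?? '', "\n";
```

Text that precedes the first key (the leading "and" here) is dropped, and loose keywords that follow a key's value stay attached to that key, which is the same ambiguity the answer describes.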
