PHP: extracting a value from a string
I'm processing records in PHP and was wondering if there is an efficient method to pull out the genre: values from each of the following records. genre: can be anywhere in the string.
In the following string I need to pull out the word "alternative" (last word)
[media:keywords] => upc:00602527365589,Records,mercury,artist:Neon
Trees,Alternative,trees,neon,genre:alternative
In the following string I need to pull out "Latin / Pop,latino,Pop"
[media:keywords] => genre:Latin / Pop,latino,Pop,upc:00602527341217,artist:Luis
Fonsi,luis,universal,Fonsi,Latin
In the following record I need to pull out "other"
[media:keywords] => upc:793018101530,andy,razor,Other,tie,genre:other,artist:Andy
McKee,McKee,&
In the following record I need to pull out "rock,flotsam,jetsam"
[media:keywords] => and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam
And Jetsam,rock,geffen
I'm pulling my hair out on this (what is left anyway).
Comments (5)
Use the following regular expression coupled with preg_match():
Your desired result will be in the first element of the matches array (parameter 3).
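The regular expression itself is not preserved in this answer. As a minimal sketch of what such a call could look like: the pattern and the helper name extractGenre below are assumptions, not the answerer's original code. It assumes every other key looks like a word followed by a colon (upc:, artist:) and that the record is a single line.

```php
<?php
// Hypothetical helper; the name and the pattern are illustrative assumptions.
function extractGenre(string $keywords): ?string
{
    // \K drops "genre:" from the reported match, so $matches[0] holds only
    // the value. The lazy .*? stops at the next ",key:" pair or at the end
    // of the string.
    if (preg_match('/genre:\K.*?(?=,\w+:|$)/', $keywords, $matches)) {
        return $matches[0];
    }
    return null; // no genre: key in this record
}

$record = 'genre:Latin / Pop,latino,Pop,upc:00602527341217,artist:Luis Fonsi,luis,universal,Fonsi,Latin';
echo extractGenre($record), PHP_EOL; // Latin / Pop,latino,Pop
```

Because of \K the value lands in $matches[0]; with a plain capture group instead, it would be in $matches[1].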
I would use strpos to find where the genre starts. The only problem is where to end it, because there is no delimiter. I would use the other known keywords, such as "upc" and "artist", to check whether the string needs to be cut off at the end.
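A rough sketch of this strpos idea, assuming the only other keys are upc and artist (taken from the sample records). The function name extractGenreByPosition and the $otherKeys parameter are made up for illustration.

```php
<?php
// Hypothetical helper: find "genre:" with strpos, then cut the value off at
// the nearest following ",upc:" or ",artist:" (the assumed list of keys).
function extractGenreByPosition(string $keywords, array $otherKeys = ['upc', 'artist']): ?string
{
    $marker = 'genre:';
    $start = strpos($keywords, $marker);
    if ($start === false) {
        return null; // no genre: key in this record
    }
    $start += strlen($marker);

    // Default to the end of the string, then move the end forward to the
    // earliest known key that appears after the genre value.
    $end = strlen($keywords);
    foreach ($otherKeys as $key) {
        $pos = strpos($keywords, ',' . $key . ':', $start);
        if ($pos !== false && $pos < $end) {
            $end = $pos;
        }
    }
    return substr($keywords, $start, $end - $start);
}

$record = 'and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam And Jetsam,rock,geffen';
echo extractGenreByPosition($record), PHP_EOL; // rock,flotsam,jetsam
```

The weakness is the one the answer points out: any key missing from $otherKeys would be swallowed into the genre value.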
You can indeed use a bit of pattern detection. You are always looking for the fixed "genre:" followed by one or more words or phrases, none of which may itself contain a ":".
So this might suffice:
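The pattern from this answer is missing here. One way to express the rule just described (comma-separated items without colons, ending before the next "key:" item or at the end of the string) is sketched below. The pattern and the function name matchGenreList are assumptions, not the original code.

```php
<?php
// Hypothetical helper returning the genre items as an array.
function matchGenreList(string $keywords): array
{
    // [^:]+ greedily takes everything up to the next colon, then backtracks
    // until what follows is either ",key:" or the end of the string.
    if (!preg_match('/genre:([^:]+)(?:,[^,:]+:|$)/', $keywords, $matches)) {
        return []; // no genre: key in this record
    }
    return explode(',', $matches[1]);
}

$record = 'and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam And Jetsam,rock,geffen';
print_r(matchGenreList($record)); // items: rock, flotsam, jetsam
```

Unlike a fixed list of keys, this treats any comma-separated item containing a colon as the start of the next field.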
From the PHP Documentation for strpos
So you can just use
$findme = "alternative";
Your problem with parsing this string is that you don't have a normal delimiter and/or quotes (i.e. a comma separates fields, but may also appear inside a field; it's the same problem that exists with CSV files without quotes).
If performance does not matter much to you, I would suggest parsing it in a more bulletproof way: make some assumptions about what counts as a key (like artist, genre, upc, etc.) and introduce a normal delimiter. The proof-of-concept code would be: (I have left the echoes in so you can see what's happening)
You can make it work in nearly all cases, and it allows you to find any key, not only genre, but its performance will be low.
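The proof-of-concept code is not preserved here. The sketch below only illustrates the general idea of splitting on known keys. The function name parseKeywords, the key list (upc, artist, genre) and the '_unkeyed' bucket are all assumptions made for this example.

```php
<?php
// Hypothetical parser: split the record wherever a comma is directly
// followed by a known key and a colon, then turn each piece into key => value.
function parseKeywords(string $keywords, array $knownKeys = ['upc', 'artist', 'genre']): array
{
    $quoted = array_map(function ($key) {
        return preg_quote($key, '/');
    }, $knownKeys);
    $alternation = implode('|', $quoted);

    // The lookahead keeps the key itself in the following piece.
    $parts = preg_split('/,(?=(?:' . $alternation . '):)/', $keywords);

    $result = [];
    foreach ($parts as $part) {
        if (preg_match('/^(' . $alternation . '):(.*)$/s', $part, $m)) {
            $result[$m[1]] = $m[2];
        } else {
            // Text before the first known key (assumed bucket name).
            $result['_unkeyed'] = $part;
        }
    }
    return $result;
}

$record = 'and,upc:00602498572061,genre:rock,flotsam,jetsam,artist:Flotsam And Jetsam,rock,geffen';
$fields = parseKeywords($record);
echo $fields['genre'], PHP_EOL; // rock,flotsam,jetsam
```

Loose keywords that follow a field (such as "rock,geffen" after the artist) stay attached to that field's value, since nothing in the data marks where one ends and the other begins.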
I have made the following assumptions about your string: