How to extract part of a string in PHP
I am using preg_replace() for some string replacement.
$str = "<aa>Let's find the stuff qwe in between <id>12345</id> these two previous brackets</h>";
$do = preg_match("/qwe(.*)12345/", $str, $matches);
This works just fine and gives the following result:
$matches[0] = qwe in between 12345
$matches[1] = in between
But when I use the same logic to extract from the following string, it fails:
<text>
<src><![CDATA[<TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Arial" SIZE="36" COLOR="#999999" LETTERSPACING="0" KERNING="0">r1 text 1 </FONT></P></TEXTFORMAT>]]></src>
<width>45%</width>
<height>12%</height>
<left>30.416666666666668%</left>
<top>3.0416666666666665%</top>
<begin>2s</begin>
<dur>10s</dur>
<transIn>fadeIn</transIn>
<transOut>fadeOut</transOut>
<id>E2159292994B083ACA7ABC7799BBEF3F7198FFA2</id>
</text>
I want to extract the string from r1text1 to </id>.
The regular expression I currently have is:
preg_match('/r1text1(.*)</id\>/', $metadata], $matches);
where $metadata is the string above.
For some reason, $matches does not contain anything. How do I do it?
Thanks in advance.
Comments (5)
If you want to extract the text, you will probably want to use preg_match. The following might work:
Whatever gets matched in the parentheses can be found later in the $matches array. In this case, that is everything between a <P> tag followed by a <FONT> tag and </id>, including the latter.
The above regex is untested but might give you a general idea of how to do it. Adapt it if your needs are a bit different :)
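A minimal sketch of the kind of pattern this answer describes. It is not the answerer's original regex: the pattern, the `$captured` variable, and the abbreviated copy of the sample data are all assumptions made for illustration.

```php
<?php
// Abbreviated copy of the sample data from the question (assumption: only
// the <src> and <id> elements matter for this example).
$metadata = <<<'XML'
<text>
<src><![CDATA[<TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Arial" SIZE="36">r1 text 1 </FONT></P></TEXTFORMAT>]]></src>
<width>45%</width>
<id>E2159292994B083ACA7ABC7799BBEF3F7198FFA2</id>
</text>
XML;

// '#' as the delimiter avoids escaping '/'; the 's' modifier lets '.' match
// newlines. The group captures everything after the <P ...><FONT ...> tags
// up to and including </id>.
$pattern = '#<P[^>]*><FONT[^>]*>(.*</id>)#s';

$captured = null;
if (preg_match($pattern, $metadata, $matches) === 1) {
    $captured = $matches[1];
    echo $captured, PHP_EOL;
}
```

The capture here starts at "r1 text 1" and runs through the closing `</id>`; move the parentheses if a narrower slice is wanted.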
Even though I don't know why you would match a regex against an incomplete XML fragment (starting within a <![CDATA[ section and ending right before the closing XML tag </id>), you do have three obvious problems with your regex:
1. As Amri said: you have to escape the / character in the closing XML tag because you use / as the pattern delimiter. By the way, you don't have to escape the > character. That gives you: '/r1text1(.*)<\/id>/'. Alternatively, you can change the pattern delimiter to #, for example: '#r1text1(.*)</id>#' (I will use the first pattern to further develop the expression).
2. As Rich Adams already said: the text in your example data is "r1_text_1" (_ is a space character), but you match against '/r1text1(.*)<\/id>/'. You have to include the spaces in your regex or allow for an uncertain number of spaces, such as '/r1(?:\s*)text(?:\s*)1(.*)<\/id>/' (the ?: is the syntax for non-capturing subpatterns).
3. The . (dot) in your regex does not match newlines by default. You have to add the s (PCRE_DOTALL) pattern modifier to let the . (dot) match newlines as well: '/r1(?:\s*)text(?:\s*)1(.*)<\/id>/s'
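The final pattern from this answer can be tried against the sample data roughly as follows. This is a sketch under assumptions: `$metadata` is an abbreviated copy of the question's string, and `$withDotAll` / `$withoutDotAll` are illustrative names.

```php
<?php
// Abbreviated copy of the sample data from the question (assumption).
$metadata = <<<'XML'
<text>
<src><![CDATA[<TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Arial" SIZE="36">r1 text 1 </FONT></P></TEXTFORMAT>]]></src>
<width>45%</width>
<id>E2159292994B083ACA7ABC7799BBEF3F7198FFA2</id>
</text>
XML;

// Without the 's' modifier the dot stops at the first newline, so the
// pattern cannot reach </id> on a later line.
$withoutDotAll = preg_match('/r1(?:\s*)text(?:\s*)1(.*)<\/id>/', $metadata);

// With 's' (PCRE_DOTALL) the dot also matches newlines.
$withDotAll = preg_match('/r1(?:\s*)text(?:\s*)1(.*)<\/id>/s', $metadata, $matches);

if ($withDotAll === 1) {
    echo $matches[0], PHP_EOL; // whole match, from "r1 text 1" through </id>
    echo $matches[1], PHP_EOL; // captured group: the part between the two
}
```

Comparing the two return values shows that the delimiter and whitespace fixes alone are not enough for multi-line input.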
You probably need to parse your string/file and extract the value between the FONT tags. Then insert the value into the id tag.
Try searching for PHP XML parsing.
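One possible way to follow the parsing route is PHP's SimpleXML extension. The sketch below assumes the metadata is well-formed XML and that the FONT markup lives inside the CDATA section of <src>; the variable names and the abbreviated sample are illustrative, not taken from the answer.

```php
<?php
// Abbreviated copy of the sample data from the question (assumption).
$metadata = <<<'XML'
<text>
<src><![CDATA[<TEXTFORMAT LEADING="2"><P ALIGN="LEFT"><FONT FACE="Arial" SIZE="36">r1 text 1 </FONT></P></TEXTFORMAT>]]></src>
<width>45%</width>
<id>E2159292994B083ACA7ABC7799BBEF3F7198FFA2</id>
</text>
XML;

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
$xml = simplexml_load_string($metadata, 'SimpleXMLElement', LIBXML_NOCDATA);
if ($xml === false) {
    throw new RuntimeException('Could not parse metadata as XML');
}

// The CDATA payload is markup stored as text, so strip its tags to keep
// only the visible text inside the FONT element.
$fontText = trim(strip_tags((string) $xml->src));
$id = (string) $xml->id;

echo $fontText, PHP_EOL;
echo $id, PHP_EOL;
```

This reads both values without depending on line breaks or on the order of the elements inside <text>.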
Try this:
You are using / as the pattern delimiter, but your pattern also contains a / in </id>. You can use \ as the escape character.
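Instead of escaping the delimiter by hand, `preg_quote()` can do it when it is given the delimiter as its second argument. A small sketch, assuming the literal spaces in "r1 text 1" are already part of the pattern; `$closingTag` and the abbreviated sample are illustrative.

```php
<?php
// Abbreviated copy of the sample data from the question (assumption).
$metadata = <<<'XML'
<text>
<src><![CDATA[<FONT FACE="Arial" SIZE="36">r1 text 1 </FONT>]]></src>
<id>E2159292994B083ACA7ABC7799BBEF3F7198FFA2</id>
</text>
XML;

// preg_quote() escapes regex metacharacters and, because '/' is passed as
// the delimiter, the slash in </id> as well.
$closingTag = preg_quote('</id>', '/');

// 's' lets the dot cross the newline between <src> and <id>.
$pattern = '/r1 text 1(.*)' . $closingTag . '/s';

$found = preg_match($pattern, $metadata, $matches);
if ($found === 1) {
    echo $matches[0], PHP_EOL;
}
```

This mainly helps when the literal part of the pattern comes from a variable rather than being typed into the regex.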
In the sample you have "r1 text 1 ", yet your regular expression has "r1text1". The regular expression doesn't match because there are spaces in the string you are trying to match it against. You should include the spaces in the regular expression.