Removing text/HTML with strpos

Posted on 2024-12-02 08:11:56

I'm parsing an XML file whose creators stuck in a bunch of social media info that is completely useless to me. I'd like to remove it before inserting the data into the database.

The problem is that it's not always the same. Some occurrences look like this:

Be a Social Butterfly! Connect & Learn More Below:
Website • Facebook • Yelp

Some have more social sites listed and some have fewer. I'd really like to remove that entire part. Also, this is a var_dump after running strip_tags. The original looks like this:

<strong>Be a Social Butterfly! Connect & Learn More Below:</br></strong>
<a target="_blank" href="http://www.kiran-indian.com">Website</a> •<a target="_blank" href="http://www.facebook.com/pages/Kiran-Indian-Cuisine/55785994435"> Facebook</a> • <a target="_blank" href="http://www.yelp.com/biz/kiran-indian-cuisine-new-york">Yelp</a>

I used preg_replace to get rid of the entire sentence "Be a Social Butterfly..." with

$description = strip_tags(preg_replace('/\bBe a Social Butterfly! Connect & Learn More Below\b/', '', $value['redemptionLocations']['description']));

A buddy of mine suggested using strpos to find the first and last parts and substr to remove everything in between, but sadly I'm not advanced enough to figure out how to do that.

Thanks in advance!

Description field:

       
Food always does one thing. It helps keep you alive. But it can do more. It can be an experience that educates, transports, and invigorates you. Lunch or dinner at <a target="_blank" href="http://www.kiran-indian.com/home.htmls">Kiran Indian Cuisine</a> a lot more than a chance to keep from starving for another day --- it’s a chance to depart from the norm with delicious homemade dishes using the freshest of ingredients and the most aromatic seasoning available. They are open 7 days a week from 11 a.m. to 11 p.m. and accept all the major credit cards, plus when you order online from the surrounding area, delivery is 100% free of charge.</br></br>

<strong>Be a Social Butterfly! Connect & Learn More Below:</br></strong>
<a target="_blank" href="http://www.kiran-indian.com">Website</a> •<a target="_blank" href="http://www.facebook.com/pages/Kiran-Indian-Cuisine/55785994435"> Facebook</a> •  <a target="_blank" href="http://www.yelp.com/biz/kiran-indian-cuisine-new-york">Yelp</a>

It seems that pasting the code here automatically adjusts the ASCII characters, etc.


Comments (1)

怀里藏娇 2024-12-09 08:11:56

First find where the chunk you want to remove begins by calling strpos on the whole text, then call strpos again to find where the chunk ends. With the start and end positions in hand, use substr_replace to replace the chunk with an empty string ''. Note that the 4th parameter of substr_replace is the length of the chunk to replace, not an end position (the 3rd parameter is the start position), so subtract the 1st position from the 2nd to get the length.

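$feedtext = '<description> this part is important...  be a social butterfly .. blah blah etc etc whatever whatever </description>';

// Start of the chunk to remove (strpos is case-sensitive; stripos is not).
$pos1 = strpos($feedtext, 'be a social butterfly');
// End of the chunk: the closing tag, which should be kept.
$pos2 = strpos($feedtext, '</description>');

$newtext = $feedtext;
// strpos returns false when the needle is missing, so only cut
// when both markers were found and appear in the expected order.
if ($pos1 !== false && $pos2 !== false && $pos2 > $pos1) {
    $len = $pos2 - $pos1; // substr_replace wants a length, not an end position
    $newtext = substr_replace($feedtext, '', $pos1, $len);
}

echo $newtext;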

Tested: http://www.ideone.com/1X5gI
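
Applied to the description field from the question, the same idea might look like the sketch below. It rests on two assumptions that are not confirmed by the feed itself: that the social block always starts with `<strong>Be a Social Butterfly`, and that it is always the last thing in the field, so everything from the marker onward can be dropped. The function name `removeSocialBlock` and the shortened sample string are made up for illustration. The marker stops before the `&` on purpose, since the raw feed may contain `&amp;` there.

```php
<?php
// Hypothetical helper (not from the original post): cut everything from the
// social-media heading to the end of the description, then strip the tags.
function removeSocialBlock(string $html, string $marker = '<strong>Be a Social Butterfly'): string
{
    // stripos() is the case-insensitive variant of strpos().
    $start = stripos($html, $marker);

    // Assumption: the social block is the tail of the field, so no end
    // position is needed; keep only what comes before the marker.
    if ($start !== false) {
        $html = substr($html, 0, $start);
    }

    // Remove the remaining tags and the trailing whitespace they leave behind.
    return trim(strip_tags($html));
}

// Shortened sample modeled on the description field in the question.
$description = 'Delivery is 100% free of charge.</br></br>' . "\n\n"
    . '<strong>Be a Social Butterfly! Connect & Learn More Below:</br></strong>' . "\n"
    . '<a target="_blank" href="http://www.kiran-indian.com">Website</a> • '
    . '<a target="_blank" href="http://www.yelp.com/biz/kiran-indian-cuisine-new-york">Yelp</a>';

echo removeSocialBlock($description), "\n";
```

Because the cut runs to the end of the string, it does not matter how many social sites are listed. If some descriptions have real content after the social block, this would delete it too, and you would need a second strpos for the end marker as in the example above.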
