How can I use PHP DOMDocument to retrieve and modify the "red text" links (href) in the infobox of this Wikipedia page?
I'm trying to retrieve and modify the link of red text URLs (including: AutoNavi, UCWeb and AGTech Holdings Limited) at the infobox level here:
My code below uses DOMDocument to replace every href attribute containing /wiki/ in all the a tags of the document (the web page) with the string $urlSearch = BASE_PATH."search.php?term=$term&type=sites", using str_ireplace:
libxml_use_internal_errors(true);
$parser = new DOMDocument();
$parser->loadHTMLFile("https://fr.wikipedia.org/wiki/Alibaba_Group");
$get_a_tags = $parser->getElementsByTagName("a");
foreach ($get_a_tags as $get_a_tag) {
    if (stripos($get_a_tag->getAttribute('href'), "/wiki/") !== false || stripos($get_a_tag->getAttribute('href'), "#") !== false) {
        $get_href_in_a_infobox = $get_a_tag->getAttribute('href');
        $term = $get_a_tag->nodeValue;
        $urlSearch = BASE_PATH."search.php?term=$term&type=sites";
        // var_dump($urlSearch."<br><br>");
        $wikipediaInfoboxTable = str_ireplace($get_href_in_a_infobox, $urlSearch, $wikipediaInfoboxTable);
    }
}
My code above works fine.
However, when I do the same thing for the links whose href contains the string /w/index.php? or redlink=1, simply by changing the condition to:
if (stripos($get_a_tag->getAttribute('href'), "/w/index.php?") !== false || stripos($get_a_tag->getAttribute('href'), "redlink=1") !== false)
nothing changes in the output, whereas the same replacement succeeded in the previous code for hrefs containing /wiki/.
How can I modify the href attribute of all a tags that have the CSS class new?
In other words, how can I modify, as I did in my code above, the href attributes containing the strings /w/index.php? and redlink=1?
I really need your help.
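One likely cause, assuming $wikipediaInfoboxTable holds serialized HTML: red-link hrefs contain several query parameters, which are written as &amp; in the markup, while getAttribute() returns the decoded value with a bare &. A plain string search for the decoded href then finds nothing. The sketch below illustrates the mismatch on a hypothetical one-link sample (the markup and the variable names $foundRaw and $foundEscaped are made up for illustration; the real page may differ):

<?php
// Hypothetical sample markup shaped like a Wikipedia red link.
$html = '<table><tr><td>'
      . '<a href="/w/index.php?title=AutoNavi&amp;action=edit&amp;redlink=1" class="new">AutoNavi</a>'
      . '</td></tr></table>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);

// getAttribute() returns the entity-decoded value, i.e. with a bare "&".
$href = $doc->getElementsByTagName('a')->item(0)->getAttribute('href');

// Searching the raw markup for the decoded href fails ...
$foundRaw = strpos($html, $href) !== false;
// ... while searching for the re-escaped href succeeds.
$foundEscaped = strpos($html, htmlspecialchars($href)) !== false;

var_dump($href, $foundRaw, $foundEscaped);

Links under /wiki/ usually have no query string, so they contain no & and the mismatch never shows up for them.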
Comments (1)
$wikipediaInfoboxTable actually contains the infobox table. What do you suggest for working directly in the DOM, as you say? In other words, how do I replace the href in my code without using str_ireplace?
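A minimal sketch of the DOM-only approach, under some assumptions: the infobox is a table whose class attribute contains "infobox", red links carry the CSS class new or a redlink=1 parameter, and BASE_PATH is defined elsewhere (the value below is a placeholder). The inline sample HTML stands in for the page loaded with loadHTMLFile(). Each matching href is changed with setAttribute(), and the table is serialized only at the end, so no string replacement is needed:

<?php
// Placeholder value; the real BASE_PATH comes from the application.
defined('BASE_PATH') || define('BASE_PATH', 'https://example.com/');

// Hypothetical sample standing in for the Wikipedia page.
$html = '<table class="infobox"><tr><td>'
      . '<a href="/wiki/Alibaba_Group">Alibaba Group</a> '
      . '<a href="/w/index.php?title=AutoNavi&amp;action=edit&amp;redlink=1" class="new">AutoNavi</a>'
      . '</td></tr></table>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Select infobox links that have the class "new" or a redlink=1 parameter.
$query = '//table[contains(@class, "infobox")]'
       . '//a[contains(concat(" ", normalize-space(@class), " "), " new ")'
       . ' or contains(@href, "redlink=1")]';

foreach ($xpath->query($query) as $a) {
    $term = trim($a->textContent);
    // Rewrite the href directly on the node; rawurlencode() keeps the term URL-safe.
    $a->setAttribute('href', BASE_PATH . 'search.php?term=' . rawurlencode($term) . '&type=sites');
}

// Serialize only the infobox table once all links are updated.
$table = $xpath->query('//table[contains(@class, "infobox")]')->item(0);
$wikipediaInfoboxTable = $doc->saveHTML($table);
echo $wikipediaInfoboxTable;

The same loop can handle the /wiki/ links by adding another condition to the XPath expression, which avoids mixing DOM reads with string replacement altogether.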