Styling unstyled links with DOM and XPath

Posted on 2024-11-16 10:25:12 · 683 words · 0 views · 0 comments

For a system I am building, I define a general style, stored in LINKSTYLE, that should be applied to <a> elements that do not yet have an inline style. I am not very experienced with DOMDocument or XPath, and I can't figure out what is going wrong.

Thanks to Gordon I've updated my code:

libxml_use_internal_errors(true);    

$html  = '<a href="#">test</a>'.
         '<a href="#" style="border:1px solid #000;">test2</a>';

$dom    = new DOMDocument();
$dom->loadHtml($html);
$dom->normalizeDocument();  
$xpath = new DOMXPath($dom);

foreach($xpath->query('//a[not(@style)]') as $node)
    $node->setAttribute('style','border:1px solid #000');

return $html;

With this updated code I receive no more errors; however, the <a> elements still do not get styled.

何时共饮酒 2024-11-23 10:25:12

Use libxml_use_internal_errors(true) to suppress parsing errors stemming from loadHTML.

The XPath query is invalid because contains expects a value to search for in the style attribute.

If you want to find all anchors without a style attribute, just use

//a[not(@style)]

You are not seeing your changes because you are returning the string stored in $html. Once you have loaded the string with DOMDocument, you have to serialize the document back to a string after you have run your query and modified DOMDocument's internal representation of it.

Example:

$html = <<< HTML
<ul>
    <li><a href="#foo" style="font-weight:bold">foo</a></li>
    <li><a href="#bar">bar</a></li>
    <li><a href="#baz">baz</a></li>
</ul>
HTML;
$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXpath($dom);
foreach ($xp->query('//a[not(@style)]') as $node) {
    $node->setAttribute('style', 'font-weight:bold');
}
echo $dom->saveHTML($dom->getElementsByTagName('ul')->item(0));

Output:

<ul>
<li><a href="#foo" style="font-weight:bold">foo</a></li>
    <li><a href="#bar" style="font-weight:bold">bar</a></li>
    <li><a href="#baz" style="font-weight:bold">baz</a></li>
</ul>

Note that in order to use saveHTML with an argument, you need at least PHP 5.3.6.
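Applied to the markup from the question, the same idea can be sketched as a small function that returns the serialized result instead of the untouched input string. This is only a sketch: the function name styleUnstyledLinks and its $linkStyle parameter are made up for illustration (standing in for the LINKSTYLE value), and the scalar type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (name and parameters are illustrative, not from the thread).
function styleUnstyledLinks(string $html, string $linkStyle): string
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Add the style only to anchors that have no style attribute yet.
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//a[not(@style)]') as $node) {
        $node->setAttribute('style', $linkStyle);
    }

    // Serialize the children of <body> so the doctype/html/body wrapper
    // that loadHTML adds around a fragment is left out of the result.
    $output = '';
    foreach ($xpath->query('/html/body/node()') as $child) {
        $output .= $dom->saveHTML($child);
    }
    return $output;
}

echo styleUnstyledLinks(
    '<a href="#">test</a><a href="#" style="border:1px solid #000;">test2</a>',
    'border:1px solid #000'
);
```

The key difference from the code in the question is the return value: the string comes from saveHTML on the modified document, not from the original $html variable.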

唱一曲作罢 2024-11-23 10:25:12

The first error (before the edit) occurs when the document contains an & that is used for something other than starting an entity reference (e.g. &quot;).

Usually this happens in URLs, where & delimits GET parameters.

You can ignore these errors using Gordon's suggestion, or fix them by replacing bare occurrences of & with &amp;.
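One possible way to do that replacement before calling loadHTML is a regular expression that skips ampersands which already start an entity reference. The helper name escapeBareAmpersands is hypothetical, and the pattern is a heuristic rather than a full HTML tokenizer: it treats anything of the form &name; or &#123; or &#x1F; as an existing reference.

```php
<?php
// Hypothetical helper: escape "&" characters that do not start an entity reference.
function escapeBareAmpersands(string $html): string
{
    // The negative lookahead leaves named (&quot;), decimal (&#38;)
    // and hexadecimal (&#x26;) references untouched.
    return preg_replace(
        '/&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/',
        '&amp;',
        $html
    );
}

// The bare "&" between the GET parameters is escaped; &quot; and &amp; stay as they are.
$html = '<a href="page.php?a=1&b=2">x &quot;y&quot; &amp; z</a>';
echo escapeBareAmpersands($html);
```

A query string whose parameter name happens to be followed by a semicolon (for example ?a=1&copy;) would be left alone by this pattern, so it is worth checking against real input.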

梦回梦里 2024-11-23 10:25:12

I was wondering whether this could be solved in a more CSS-oriented way, e.g. with a selector. In CSS3 it is possible to address only those <a> tags that don't have a style attribute:

a:not([style]) {border:1px solid #000;}

So if your documents already have a stylesheet it could be easily added.

If not, a <style> element must be added to the document. This can be done with DOMDocument as well, though I found it a bit complicated. Still, I got it to work as a small experiment:

libxml_use_internal_errors(true);    

$html  = '<a href="#">test</a>'.
         '<a href="#" style="border:1px solid #000;">test2</a>';

$dom = new DOMDocument();
$dom->loadHtml($html);
$dom->normalizeDocument();

// ensure that there is a head element, body will always be there
// because of loadHtml();
$head = $dom->getElementsByTagName('head');
if (0 == $head->length) {
    $head = $dom->createElement('head');
    $body = $dom->getElementsByTagName('body')->item(0);
    $head = $body->parentNode->insertBefore($head, $body);
} else {
    $head=$head->item(0);
}

// append style tag to head.
$css = 'a:not([style]) {border:1px solid #000;}';
$style = $dom->createElement('style');
$style->nodeValue=$css;
$head->appendChild($style);

$dom->formatOutput = true;
$output = $dom->saveHtml();

echo $output;

Example output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head><style>a:not([style]) {border:1px solid #000;}</style></head>
<body>
<a href="#">test</a><a href="#" style="border:1px solid #000;">test2</a>
</body>
</html>

If this rule clashes with other selectors of higher specificity, it is not an easy solution. !important might help, though.

HTML Fragment

As for getting the changed HTML fragment, here is some additional code that can work with Gordon's suggestion. It returns just the inner HTML of the body tag; this time I played a bit with the SPL:

// get html fragment
$output = implode('', array_map(
  function($node) use ($dom) { return $dom->saveXml($node); },
  iterator_to_array($xpath->query('//body/*'), false)))
  ;

A foreach is definitely more readable and memory friendly:

// get html fragment
$output = '';
foreach ($xpath->query('//body/*') as $node) {
    $output .= $dom->saveXml($node);
}