CSS selectors: select elements whose (parent|child) does not match X

Posted on 2024-10-07 05:27:41

    I'd like to select an element that has no children of a specific type, for example:

    all <li> elements that have no <table class="someclass"> children. I'd like to select only the parent element, not the children that don't match table.

    On a similar note, I'd like to match elements whose parents don't match X, for example:
    all <li> elements that are not descendants of <table class="someclass">.

    I'm using Python and lxml's cssselect.

    Thanks!

    Comments (2)

    黒涩兲箜 2024-10-14 05:27:41

    The CSS3 :not selector will get you partly there. Unfortunately, there is no parent selector so you can't select an element based on characteristics of its children.

    For your first question you have to explicitly do the traversal:

    from lxml.cssselect import CSSSelector

    # <li> elements that have at least one <table class="someclass"> child
    excluded = set(e.getparent() for e in CSSSelector('li > table.someclass')(html))

    # All <li> elements that have no <table class="someclass"> children
    # (this keeps empty <li> elements and those whose children are other tables)
    [li for li in CSSSelector('li')(html) if li not in excluded]
    

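    For your second question, a plain descendant selector finds the <li> elements to exclude. Note that ':not(table.someclass) li' would not work, because it matches any <li> with at least one ancestor that is not the table (<body>, for instance):

    # All <li> elements that are not descendants of <table class="someclass">
    inside = set(CSSSelector('table.someclass li')(html))
    [li for li in CSSSelector('li')(html) if li not in inside]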
    
    影子是时光的心 2024-10-14 05:27:41

    I don't think CSS selectors have "anything but" selection, so you can't do it that way. Maybe you can do it with XPath, which is more flexible, but even then you will get very complex and obtuse path expressions.
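
For reference, here is a minimal sketch of what the XPath route could look like with lxml. The sample markup, the `id` attributes, and the names `HAS_CLASS`, `no_child`, and `no_ancestor` are made up for illustration; only `lxml.html.fromstring` and the element `xpath()` method come from lxml itself.

```python
from lxml import html

# Hypothetical sample document, for illustration only.
doc = html.fromstring("""
<div>
  <ul>
    <li id="a"><table class="someclass"><tr><td>x</td></tr></table></li>
    <li id="b"><table class="other"><tr><td>y</td></tr></table></li>
    <li id="c">plain text</li>
  </ul>
  <table class="someclass big"><tr><td><ul><li id="d">nested</li></ul></td></tr></table>
</div>
""")

# XPath 1.0 test for "the class attribute contains the token someclass".
HAS_CLASS = "contains(concat(' ', normalize-space(@class), ' '), ' someclass ')"

# <li> elements that have no <table class="someclass"> child.
no_child = doc.xpath(f"//li[not(table[{HAS_CLASS}])]")

# <li> elements that have no <table class="someclass"> ancestor.
no_ancestor = doc.xpath(f"//li[not(ancestor::table[{HAS_CLASS}])]")

print([li.get("id") for li in no_child])
print([li.get("id") for li in no_ancestor])
```

The class test is the long part; the `not(...)` predicates themselves stay fairly short, so whether this counts as too complex is a matter of taste.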

    I'd recommend that you simply get all <li> elements, go through each element's children, and skip it if one of the children is a table.

    This will be easily understandable and maintainable, easy to implement, and unless your performance requirements are really extreme and you need to process tens of thousands of pages per second, it will be Fast Enough (tm).
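
The loop described above could be sketched roughly as follows. This is an illustration under a few assumptions: it uses the standard library's `xml.etree.ElementTree` as a stand-in (lxml elements support the same `iter()`, `get()`, and child iteration), the sample markup is invented, and the helper names `is_target_table`, `li_without_target_child`, and `li_outside_target_table` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup, for illustration only.
SAMPLE = """
<div>
  <ul>
    <li id="a"><table class="someclass"><tr><td>x</td></tr></table></li>
    <li id="b"><table class="other"><tr><td>y</td></tr></table></li>
    <li id="c">plain text</li>
  </ul>
  <table class="someclass big"><tr><td><ul><li id="d">nested</li></ul></td></tr></table>
</div>
"""


def is_target_table(elem):
    """True for a <table> whose class attribute contains the token 'someclass'."""
    return elem.tag == "table" and "someclass" in elem.get("class", "").split()


def li_without_target_child(root):
    """All <li> elements with no <table class="someclass"> direct child."""
    return [li for li in root.iter("li")
            if not any(is_target_table(child) for child in li)]


def li_outside_target_table(root):
    """All <li> elements that are not inside a <table class="someclass">."""
    found = []

    def walk(elem, inside):
        # 'inside' is True once any ancestor was a matching table.
        if elem.tag == "li" and not inside:
            found.append(elem)
        inside = inside or is_target_table(elem)
        for child in elem:
            walk(child, inside)

    walk(root, False)
    return found


root = ET.fromstring(SAMPLE)
print([li.get("id") for li in li_without_target_child(root)])
print([li.get("id") for li in li_outside_target_table(root)])
```

The second helper walks the tree from the root and carries an `inside` flag downward, which avoids needing parent pointers; with lxml one could instead check `li.iterancestors()`.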

    Keep it simple.
