XML nodes and attributes in Scala?

Posted 2024-12-14 20:11:13 · 636 characters · 4 views · 0 comments

Remark: please consider XPath syntax dead here, thank you.

I have an XML node (HTML, actually), and I would like to get one of its attributes.

In C# (HTMLAgilityPack) I could get an attribute object by name. For example, given an "a" node, I could ask for its "href" attribute.

In Scala there is an "attribute" method on xml.Node, but it returns a sequence of... nodes. An attribute is a node? How is it possible to have several attributes with the same name? I am completely puzzled.

Moreover, there is an xml.Attribute class, but I don't see it used in xml.Node.

I have the PiS book (Programming in Scala), but its XML chapter is very shallow.

The question

How should I understand asking for an attribute and getting a collection of nodes?

IOW: what is the sense in returning an option of a collection of nodes instead of returning an attribute?

  • option -- if there is no attribute, the collection should simply be empty; wrapping it in an option doubles the semantics
  • collection -- this implies that multiple attributes are possible, so I am curious in what scenario I would get a collection of size > 1
  • node -- an attribute is a pretty simple entity, so why such overkill, which suggests that an attribute can have a tree structure?

Comments (2)

心意如水 2024-12-21 20:11:13

You just want to get the value of an attribute, yes? In which case that's pretty easy:

scala> val x = <foo this="xx" that="yy" />
x: scala.xml.Elem = <foo this="xx" that="yy"></foo>

scala> x.attribute("this")
res0: Option[Seq[scala.xml.Node]] = Some(xx)

scala> x.attribute("this").get.toString
res1: String = xx

I know you said that you explicitly aren't interested in XPath syntax, but in this instance it really is rather neater:

scala> x \ "@this"
res2: scala.xml.NodeSeq = xx
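
The `.get` call above throws if the attribute is missing, so a slightly more defensive variant may be worth having. The following is only a sketch, assuming the scala-xml module is on the classpath; `attrText` and `AttributeDemo` are hypothetical names introduced for illustration and are not part of scala.xml.

```scala
import scala.xml.{Elem, Node}

object AttributeDemo {
  // Hypothetical helper: the attribute's text, or None when it is absent.
  // Each node of the value is converted to text and the pieces are joined.
  def attrText(node: Node, name: String): Option[String] =
    node.attribute(name).map(_.map(_.text).mkString)

  def main(args: Array[String]): Unit = {
    val x: Elem = <foo this="xx" that="yy"/>

    println(attrText(x, "this"))     // Some(xx)
    println(attrText(x, "missing"))  // None

    // The \@ shorthand returns the text directly, "" if the attribute is absent.
    println(x \@ "this")             // xx

    // All attributes as a plain Map[String, String].
    println(x.attributes.asAttrMap)
  }
}
```

Keeping the `Option` until the last moment avoids the exception that `.get` would raise on a missing attribute, while `\@` trades that distinction for brevity.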

Having said all of this, you should be aware that there are many problems with attribute handling in Scala's built-in XML support. See, for example, this, this and this.
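
On the question of when the returned sequence can hold more than one node: as far as I understand it, scala.xml models an attribute value as a sequence of nodes so that it can mix plain text with entity references. The sketch below builds such a value by hand to make that visible; `EntityAttrDemo` and its members are illustrative names, and building the element manually rather than parsing it is a deliberate simplification.

```scala
import scala.xml.{Elem, EntityRef, Node, Null, Text, TopScope, UnprefixedAttribute}

object EntityAttrDemo {
  // An attribute value made of three nodes: text, an entity reference, text.
  val titleValue: Seq[Node] = Seq(Text("Tom "), EntityRef("amp"), Text(" Jerry"))

  // An empty <p> element carrying that value as its "title" attribute.
  val elem: Elem =
    Elem(null, "p", new UnprefixedAttribute("title", titleValue, Null), TopScope, true)

  // Looking the attribute up hands back the nodes it was built from.
  val found: List[Node] = elem.attribute("title").toList.flatten

  def main(args: Array[String]): Unit = {
    println(found.length)                // 3
    println(found.map(_.text).mkString)  // Tom & Jerry
  }
}
```

So the sequence reflects the pieces of a single attribute value, not several attributes sharing a name.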

温馨耳语 2024-12-21 20:11:13

I realise that Paul's follow-up answer pretty much covers your question, but I'd just like to add a few more points:

  1. I personally don't like the design of Scala XML, to the extent that I wrote an alternative library, Scales Xml, but I wouldn't call it badly designed. Some of its design elements are apparently good enough to form the basis of Anti-Xml's approach (elements owning their children, a concept of grouping nodes, etc.), but there are many quirks, attributes and text as containers being a large one.
  2. I've only recently committed the descendant axis to Scales. Its greedy nature works differently from descendant-or-self: as per the spec, //para1 does not mean the same as the location path /descendant::para1.
  3. I'm not sure you can attribute its absence from Anti-Xml to bad design either. It's a young project (just over seven months old?) and they may simply not have got round to adding descendant yet.

The direct answer to the attribute question in Scales is:

val pre = Namespace("uri:test").prefixed("pre")

val elem = Elem("fred"l, emptyAttributes + 
        ("attr", "value") +
        Attribute(pre("attr"), "value"))

println("attributes are a map " + elem.attributes("attr"))

println("attributes are a set " + (
  elem.attributes + ("attr", "new value")))

val xpath = top(elem) \@ pre("attr")

xpath foreach{ap => println(ap.name)}

giving

[info] attributes are a map Some(Attribute({}attr,value))
[info] attributes are a set ListSet(Attribute({}attr,new value), Attribute({uri:test}attr,value))
[info] {uri:test}attr

The XPath syntax must return a collection, because any number of paths may reach a matching attribute. On an element itself, attributes are matched by QName: "attr" means no namespace and a local name of attr. For additional sanity, an attribute QName is:

type AttributeQName = EitherLike[PrefixedQName, NoNamespaceQName]

The compiler makes sure that no local-name-only QNames creep in.
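
For comparison, the same distinction between an unqualified and a namespace-qualified attribute can be seen in the built-in scala.xml. This is a rough sketch rather than a recommendation: `NamespacedAttrDemo` and the file names are made up for illustration, and the XLink namespace is used only as a convenient example.

```scala
import scala.xml.Elem

object NamespacedAttrDemo {
  val xlinkNs = "http://www.w3.org/1999/xlink"

  // Two attributes share the local name "href": one has no namespace,
  // the other is bound to the XLink namespace through the "xlink" prefix.
  val link: Elem =
    <a xmlns:xlink="http://www.w3.org/1999/xlink" href="plain.html" xlink:href="linked.html"/>

  // Lookup by bare key matches only the unprefixed attribute.
  val plain: Option[String] =
    link.attribute("href").map(_.map(_.text).mkString)

  // Lookup by namespace URI plus key matches the prefixed one.
  val qualified: Option[String] =
    link.attribute(xlinkNs, "href").map(_.map(_.text).mkString)

  def main(args: Array[String]): Unit = {
    println(plain)      // Some(plain.html)
    println(qualified)  // Some(linked.html)
  }
}
```

Here the namespace is passed as a plain string at lookup time, whereas the Scales `AttributeQName` type shown above pushes that distinction into the type system.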

As an aside, whilst I understand why the XPath-like syntax of Scala XML is probably uninteresting, you should have a look at Scales for XPath-based querying.

There is both XPath 1.0 string-based querying (not yet pushed into a non-snapshot version) and an internal DSL that lets the compiler / IDE help you out (plus the bonus of being far quicker and working with Scala code directly).
