Scala XML.loadString vs. literal expression

Posted on 2024-10-06 13:14:58

I have been experimenting with Scala and XML, and I found a strange difference in behavior between an XML tag created with XML.load (or loadString) and the same tag written as a literal. Here is the code:

import scala.xml._
// creating a classical link HTML tag
val in_xml = <link type="text/css" href="/css/main.css" rel="stylesheet" xmlns="http://www.w3.org/1999/xhtml"></link>
// The same as a String
val in_str = """<link type="text/css" href="/css/main.css" rel="stylesheet" xmlns="http://www.w3.org/1999/xhtml"></link>"""
// Convert the String into XML
val from_str = XML.loadString(in_str)

println("in_xml  : " + in_xml)
println("from_str: "+ from_str)
println("val_xml == from_str: "+ (in_xml == from_str))
println("in_xml.getClass() == from_str.getClass(): " +
  (in_xml.getClass() == from_str.getClass()))

And here is the output:

in_xml  : <link href="/css/main.css" rel="stylesheet" type="text/css" xmlns="http://www.w3.org/1999/xhtml"></link>
from_str: <link rel="stylesheet" href="/css/main.css" type="text/css" xmlns="http://www.w3.org/1999/xhtml"></link>
val_xml == from_str: false
in_xml.getClass() == from_str.getClass(): true

The types are the same, but the two values are not equal. The order of the attributes changes, and it is never the same as in the original. The attributes of the literal come out alphabetically sorted (is that just a coincidence?).
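If the goal is only to check that both elements carry the same attributes regardless of their order, one option is to compare the attributes as maps. This is a minimal sketch: `sameAttributes` is a hypothetical helper name (not from the original post), and the `xmlns` declaration is left out to keep the example small. Whether `==` on whole elements is sensitive to attribute order may depend on the scala-xml version; the map comparison does not rely on that.

```scala
import scala.xml._

// Hypothetical helper: compare the attributes of two nodes as
// unordered key -> value maps, so attribute order is ignored.
def sameAttributes(a: Node, b: Node): Boolean =
  a.attributes.asAttrMap == b.attributes.asAttrMap

val literal = <link type="text/css" href="/css/main.css" rel="stylesheet"></link>
val parsed  = XML.loadString(
  """<link type="text/css" href="/css/main.css" rel="stylesheet"></link>""")

println("same attributes: " + sameAttributes(literal, parsed))
```

`asAttrMap` flattens the attribute chain into a `Map[String, String]`, so it loses the ordering on purpose; it is meant for comparison, not for rebuilding the element.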

This would not be a problem if both solutions did not behave differently when I try to transform them. I picked up some interesting code from Daniel C. Sobral at "How to change attribute on Scala XML Element" and wrote my own rule to remove the leading slash of the "href" attribute. The RuleTransformer works well with in_xml, but has no effect on from_str!

Unfortunately, most of my programs have to read their XML via XML.load(...). So I'm stuck. Does anyone know about this topic?

Best regards,

Henri

Comments (2)

鹿童谣 2024-10-13 13:14:58

From what I can see, in_xml and from_str are not equal because the order of the attributes is different. This is unfortunate, and it is due to the way the XML literal is created by the compiler. That causes the attributes to be different:

scala> in_xml.attributes == from_str.attributes
res30: Boolean = false

You can see that if you replace the attributes, the comparison works:

scala> in_xml.copy(attributes=from_str.attributes) == from_str
res32: Boolean = true

With that said, it is not clear to me why that would cause different behavior in the code that replaces the href attribute. In fact, I suspect that something is wrong with the way the attribute mapping works. For instance, if I replace in_str with:

val in_str = """<link type="text/css" rel="stylesheet" href="/css/main.css" 
xmlns="http://www.w3.org/1999/xhtml"></link>"""

It works fine. Could it be that the attribute code from Daniel only works if the attribute is in the head position of MetaData?
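One way to probe this hypothesis is to print the attribute chain of each element in the order in which it is stored. The sketch below assumes a hypothetical helper named `attributeKeys`; it relies only on `MetaData` being iterable. The exact order printed for each element is not asserted here, since it may vary with the compiler and library version.

```scala
import scala.xml._

// Hypothetical helper: list the attribute keys in the order in which
// they appear in the element's MetaData chain (head first).
def attributeKeys(e: Node): List[String] =
  e.attributes.map(_.key).toList

val literal = <link type="text/css" href="/css/main.css" rel="stylesheet"></link>
val parsed  = XML.loadString(
  """<link type="text/css" href="/css/main.css" rel="stylesheet"></link>""")

println("literal: " + attributeKeys(literal).mkString(", "))
println("parsed : " + attributeKeys(parsed).mkString(", "))
```

If "href" sits at the head of the chain in the case that works and further down in the case that fails, that would support the suspicion about the head position.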


Side note: unless in_xml is null, equals and == would return the same value. The == version will check whether the first operand is null before calling equals.
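The null handling mentioned in the side note can be illustrated with a small sketch that uses plain strings instead of XML (the value names are made up for the example):

```scala
import scala.util.Try

// A null reference on the left-hand side.
val left: String = null

// == checks for null first, so this evaluates to false without throwing.
val viaOperator: Boolean = left == "x"

// Calling equals directly on the null reference throws a
// NullPointerException, captured here as a Failure.
val viaEquals: Try[Boolean] = Try(left.equals("x"))

println(viaOperator)
println(viaEquals.isFailure)
```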

把回忆走一遍 2024-10-13 13:14:58

Some further testing:
Maybe, my initial equality test is not appropriate:

in_xml == from_str

and if I test:

in_xml.equals(from_str)

I also get false. Maybe I should use another testing method (like corresponds, but I did not find out which predicate I should use as the second parameter...)

That said, if I test the following in the REPL:

<body id="1234"></body> == XML.loadString("<body id=\"1234\"></body>")

I get true, even without calling the equals method...

Back to my initial example: I defined a rewrite rule

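import scala.xml.transform.{RewriteRule, RuleTransformer}
// mapMetaData and GenAttr are the helpers from Daniel C. Sobral's answer; they are not part of scala.xml.
// Drop a leading slash, if any (also safe for an empty string).
def unSlash(s: String) = if (s.startsWith("/")) s.tail else s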
val changeCSS = new RewriteRule {
    override def transform(n: Node): NodeSeq = n match {
        case e: Elem if (n \ "@rel").text == "stylesheet" =>
            e.copy(attributes = mapMetaData(e.attributes) {
                case g @ GenAttr(_, key, Text(v), _) if key == "href" =>
                    g.copy(value = Text(unSlash(v)))
                case other => other
            })
        case n => n
    }
}

It uses the helper classes/methods defined by Daniel C. Sobral at How to change attribute on Scala XML Element. If I apply:

new RuleTransformer(changeCSS).transform(in_xml)
new RuleTransformer(removeComments).transform(from_str)

I get the expected result with in_xml, but no modification with from_str...
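As a possible workaround that does not walk the attribute chain at all, the rule can look the attribute up by name and overwrite it with the `%` operator of `Elem`. This is only a sketch under assumptions: `changeCssByLookup` is a hypothetical name, the `xmlns` declaration is left out for brevity, and it does not explain why the original rule behaves differently on from_str.

```scala
import scala.xml._
import scala.xml.transform.{RewriteRule, RuleTransformer}

// Drop a leading slash, if any.
def unSlash(s: String): String = if (s.startsWith("/")) s.drop(1) else s

// Hypothetical rule: find "href" by name, wherever it sits among the
// attributes, and replace its value via Elem.%.
val changeCssByLookup = new RewriteRule {
  override def transform(n: Node): NodeSeq = n match {
    case e: Elem if (e \ "@rel").text == "stylesheet" =>
      e.attributes.asAttrMap.get("href") match {
        case Some(v) => e % Attribute(None, "href", Text(unSlash(v)), Null)
        case None    => e
      }
    case other => other
  }
}

val fromStr = XML.loadString(
  """<link type="text/css" href="/css/main.css" rel="stylesheet"></link>""")
val rewritten = new RuleTransformer(changeCssByLookup).transform(fromStr)

println(rewritten.head)
```

Because the lookup goes through the attribute name, the position of "href" in the chain should not matter; the order of the attributes in the printed result is not guaranteed.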
