Scala - XML transform discards transformed elements with multiple attributes

Posted on 2024-10-05 07:35:06

Bit of a bizarre one, this, and I'm tempted to assume it may be a bug. Background - I have a whole load of XML that looks like this:

<foo id="1">
 <bar id="0">
  <baz id="0" blah="blah" etc="etc">
   <buz id="0" />
  </baz>
  <buz id="0" blah="blah" etc="etc">
   ...
  </buz>
 </bar>
</foo>
<foo id="2">
 <bar id="0">
  <baz id="0" blah="blah" etc="etc">
   <buz id="0" />
  </baz>
  <buz id="0" blah="blah" etc="etc">
   ...
  </buz>
 </bar>
</foo>
....

What I want to do is to transform this such that, for each foo element, I replace all the zero ids inside it with the id of foo.

To do this - firstly, I'm using Daniel's code along with an implicit conversion to provide 'Elem' with a mapAttributes method:

// GenAttr and mapMetaData come from Daniel's code, which is not reproduced here.
class ElemWithUsefulAttributes(elem : Elem) extends Elem(elem.prefix, elem.label, elem.attributes, elem.scope, elem.child : _*) {
  def mapAttributes(f : GenAttr => GenAttr) = this.copy(attributes = mapMetaData(elem.attributes)(f))
}

implicit def Elem2ElemWithUsefulAttributes(elem : Elem) = new ElemWithUsefulAttributes(elem)

Then I've got a 'replaceId' method:

def replaceId(attr : String, id : String)(in : GenAttr) = in match {
  case g@GenAttr(_,key,Text(v),_) if (key.equals(attr)) => g.copy(value=Text(id))
  case other => other
}

Finally, I construct a couple of RewriteRules and corresponding RuleTransformers to deal with this:

class rw1(id : String) extends RewriteRule {
  override def transform(n : Node) : Seq[Node] = n match {
    case n2: Elem if (n2.label == "bar") => n2.mapAttributes(replaceId("id", id))
    case n2: Elem if (n2.label == "baz") => n2.mapAttributes(replaceId("id", id))
    case n2: Elem if (n2.label == "buz") => n2.mapAttributes(replaceId("id", id))
    case other => other
  }
}
class rt1(id : String) extends RuleTransformer(new rw1(id))
object rw2 extends RewriteRule {
  override def transform(n : Node) : Seq[Node] = n match {
    case n2@Elem(_, "foo", _, _, _*) => (new rw1(n2.attribute("id").get.toString))(n2)
    case other => other
  }
}
val rt2 = new RuleTransformer(rw2)

After calling rt2(xml), I get output that looks like the following:

<foo id="1">
 <bar id="1">
  <baz id="0" blah="blah" etc="etc">
   <buz id="1" />
  </baz>
  <buz id="0" blah="blah" etc="etc">
   ...
  </buz>
 </bar>
</foo>
<foo id="2">
 <bar id="2">
  <baz id="0" blah="blah" etc="etc">
   <buz id="2" />
  </baz>
  <buz id="0" blah="blah" etc="etc">
   ...
  </buz>
 </bar>
</foo>
....

In other words, the id has not been changed on the elements that carry multiple attributes. Naturally, one is tempted to blame the mapAttributes code. However, if I dump out the result of the transform inside the 'match' statement in rw1, I can clearly see it showing:

  <baz id="2" blah="blah" etc="etc">
   <buz id="2" />
  </baz>

Further, if I alter a single baz, say, to remove the extra attributes, then it works correctly and gets mapped. So it seems to be a strange interaction between the multiple attributes and the transforms.

Can anybody make head or tail of this?

亂 2024-10-12 07:35:06

Shouldn't you be wrapping your new rw1 in a new RuleTransformer? Otherwise I think it's only going to apply the rule to the foo node and not its children.
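To make the wrapping suggestion concrete, here is a minimal sketch, assuming the scala-xml module is on the classpath. The names ReplaceChildIds and ReplaceIdsUnderFoo are hypothetical, and the sketch uses Elem's `%` operator to overwrite the attribute instead of the mapAttributes helper from the question, so it is an illustration of the idea rather than a drop-in fix for rw1/rw2.

```scala
import scala.xml.{Elem, Node, Null, UnprefixedAttribute}
import scala.xml.transform.{RewriteRule, RuleTransformer}

// Hypothetical inner rule: overwrite "id" on bar/baz/buz elements with the given value.
class ReplaceChildIds(id: String) extends RewriteRule {
  private val targets = Set("bar", "baz", "buz")

  override def transform(n: Node): Seq[Node] = n match {
    case e: Elem if targets(e.label) && e.attribute("id").isDefined =>
      // `%` replaces the attribute with the same key and keeps the others.
      e % new UnprefixedAttribute("id", id, Null)
    case other => other
  }
}

// Hypothetical outer rule: for each foo, run the inner rule over foo's subtree.
object ReplaceIdsUnderFoo extends RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case e: Elem if e.label == "foo" =>
      val fooId = (e \ "@id").text
      // Wrapping the rule in a RuleTransformer makes it visit foo's descendants,
      // not only the foo node itself.
      val inner = new RuleTransformer(new ReplaceChildIds(fooId))
      inner(e)
    case other => other
  }
}
```

It would be applied as `new RuleTransformer(ReplaceIdsUnderFoo).transform(xml)`, where `xml` is the document from the question wrapped in a single root element.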

Maybe that alone would solve the problem. In case it doesn't, here's some other code that seems to do what you want (using my approach to changing attributes):

class ReplaceIDsBelowFooRule(id: String) extends RewriteRule {
  override def transform(n : Node) : Seq[Node] = n match {
    case elem: Elem if elem.label == "foo" =>
      elem.copy(child=this.replaceDescendantIDs(elem.child))
    case other => other
  }
  def replaceDescendantIDs(nodes: Seq[Node]): Seq[Node] = {
    for (node <- nodes) yield node match {
      case elem: Elem => elem.copy(
        child=this.replaceDescendantIDs(elem.child),
        attributes=
          for (attr <- elem.attributes) yield attr match {
            case attr@Attribute("id", _, _) => attr.goodcopy(value=this.id)
            case other => other
          }
      )
      case other => other
    }
  }
}

And applied to your xml:

scala> new RuleTransformer(new ReplaceIDsBelowFooRule("XXX"))(xml)
res1: scala.xml.Node = 
<xml>
<foo id="1">
 <bar id="XXX">
  <baz blah="blah" etc="etc" id="XXX">
   <buz id="XXX"></buz>
  </baz>
  <buz blah="blah" etc="etc" id="XXX">
   ...
  </buz>
 </bar>
</foo>
<foo id="2">
 <bar id="XXX">
  <baz blah="blah" etc="etc" id="XXX">
   <buz id="XXX"></buz>
  </baz>
  <buz blah="blah" etc="etc" id="XXX">
   ...
  </buz>
 </bar>
</foo>
</xml>

征﹌骨岁月お 2024-10-12 07:35:06

The problem might be here:

case other => other

Instead, try:

case elem: Elem => elem.copy(child = transform(elem.child))
case other => other

I wonder whether the library was broken between 2.7 and 2.8, so that transform no longer recurses automatically into each child. I have to look that up someday (though, of course, other eyes are welcome :).
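As a rough sketch of the explicit-recursion idea, assuming the scala-xml module is available: the rule below rewrites the current element and then calls transform on its children itself, so it does not rely on any automatic descent. The class name ReplaceIdsRecursively is hypothetical, and `%` stands in for the question's mapAttributes helper.

```scala
import scala.xml.{Elem, Node, Null, UnprefixedAttribute}
import scala.xml.transform.RewriteRule

// Hypothetical rule that descends into children explicitly.
class ReplaceIdsRecursively(id: String) extends RewriteRule {
  private val targets = Set("bar", "baz", "buz")

  override def transform(n: Node): Seq[Node] = n match {
    case e: Elem =>
      // Rewrite this element's "id" if it is one of the target elements.
      val updated =
        if (targets(e.label) && e.attribute("id").isDefined)
          e % new UnprefixedAttribute("id", id, Null)
        else e
      // Recurse by hand: transform(Seq[Node]) applies this rule to each child.
      updated.copy(child = transform(e.child))
    case other => other
  }
}
```

Calling `new ReplaceIdsRecursively("1").transform(foo)` directly on a single foo element then rewrites every bar, baz and buz below it, with no RuleTransformer involved.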
