xmlNodeGetContent introduces newlines

Posted on 2024-11-18 04:34:29

It seems xmlNodeGetContent introduces newlines where there shouldn't be any.

This is a node dump:

ELEMENT td
  ATTRIBUTE width
    TEXT
      content=100%
  ATTRIBUTE bgcolor
    TEXT
      content=#FFFFFF
  ELEMENT font
    ATTRIBUTE face
      TEXT
        content=Arial,Helvetica
    ELEMENT font
      ATTRIBUTE color
        TEXT
          content=#0000FF
      ELEMENT font
        ATTRIBUTE size
          TEXT
            content=-1
        ELEMENT b
          ELEMENT br
  TEXT
    content= 
  ELEMENT hr
  ELEMENT font
    ATTRIBUTE face
      TEXT
        content=Arial,Helvetica
    ELEMENT font
      ATTRIBUTE color
        TEXT
          content=#0000FF
      ELEMENT font
        ATTRIBUTE size
          TEXT
            content=-1
        ELEMENT b
          TEXT
            content=love
        ELEMENT br
  TEXT
    content= 
  ELEMENT ul
    ELEMENT li
      ELEMENT small
        ELEMENT font
          ATTRIBUTE color
            TEXT
              content=#FF0000
          TEXT
            content=s
        TEXT
          content= iubire 
        ELEMENT font
          ATTRIBUTE color
            TEXT
              content=#FF0000
          TEXT
            content=f
        TEXT
          content=; dragoste 
        ELEMENT font
          ATTRIBUTE color
            TEXT
              content=#FF0000
          TEXT
            content=f
        TEXT
          content=; scump 
        ELEMENT font
          ATTRIBUTE color
            TEXT
              content=#FF0000
          TEXT
            content=m
        TEXT
          content=; 
        ELEMENT font
          ATTRIBUTE color
            TEXT
              content=#0000FF
          TEXT
            content=to be in love with
        TEXT
          content= a fi #C3#AEndr#C4#83gostit de; 
        ELEMENT font
          ATTRIBUTE color
            TEXT
              content=#0000FF
          TEXT
            content=to send one's love
        TEXT
          content= a transmite complimente; 
        ELEMENT font
          ATTRIBUTE color
            TEXT
              content=#0000FF
          TEXT
            content=love affair
        TEXT
          content= amor; 
        ELEMENT font
          ATTRIBUTE color
            TEXT
              content=#0000FF
          TEXT
            content=lovelorn
        TEXT
          content= dezn#C4#83d#C4#83jduit
    TEXT
      content= 
    ELEMENT li
      ELEMENT small
        ELEMENT font
          ATTRIBUTE color
            TEXT
              content=#FF0000
          TEXT
            content=vt
        TEXT
          content= a iubi; a-i pl#C4#83cea
    TEXT
      content=

And this is the text as returned by xmlNodeGetContent:

  s iubire f; dragoste f; scump
  m; to be in love
  with a fi îndrăgostit de; to send
  one's love a transmite complimente; love affair amor; lovelorn deznădăjduit
  vt a iubi; a-i
  plăcea

As you can see, there are line breaks where there shouldn't be.


北凤男飞 2024-11-25 04:34:29

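Despite some unhelpful comments (really, is it not obvious what the question was?), I would like to share my explanation.

The extra line breaks were caused by HTML Tidy prettifying the document. In other words, Tidy had already wrapped the long text lines before libxml2 parsed them, so the newlines were part of the text nodes and xmlNodeGetContent simply returned them.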
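If the wrapping cannot be avoided at the source (Tidy's `wrap` option controls the wrap column, and setting it to 0 should disable wrapping), one possible workaround is to normalize whitespace in the string after retrieving it. The following is only a minimal sketch under that assumption: `collapse_whitespace` is a hypothetical helper, not part of libxml2 or Tidy, and it treats only ASCII whitespace as blank so that multi-byte UTF-8 characters such as `î` and `ă` pass through untouched.

```c
#include <stddef.h>

/* Hypothetical helper (not a libxml2 function): returns 1 for ASCII
 * whitespace only, so bytes >= 0x80 in UTF-8 text are never touched. */
static int is_ascii_blank(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\r' || c == '\f' || c == '\v';
}

/* Collapse every run of whitespace in s (including newlines) into a
 * single space and drop leading/trailing whitespace, in place.
 * Returns the length of the resulting string. */
size_t collapse_whitespace(char *s)
{
    size_t out = 0;
    int pending_space = 0;

    for (size_t in = 0; s[in] != '\0'; in++) {
        unsigned char c = (unsigned char)s[in];
        if (is_ascii_blank(c)) {
            /* Remember the gap; it is written only if more text follows,
             * and never at the very start of the string. */
            pending_space = (out > 0);
        } else {
            if (pending_space)
                s[out++] = ' ';
            pending_space = 0;
            s[out++] = (char)c;
        }
    }
    s[out] = '\0';
    return out;
}
```

With libxml2, the buffer returned by xmlNodeGetContent could be passed to such a helper (cast from `xmlChar *` to `char *`) before use, and must still be released with xmlFree afterwards. Note that this also flattens whitespace that is meaningful, for example inside `pre` elements, so it is only appropriate for ordinary running text like the list items above.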

