Replace Unicode line breaks with BR

Posted on 2024-10-16 13:44:53 · 617 words · 9 views · 0 comments

In my XML files there are Unicode line breaks, as shown in the linked screenshot.

The two dots after "minds." are the line breaks. I've googled and tried almost everything I know to replace them with Ruby (1.8), but without any luck.

Here's my code (with attempts at several different Unicode code points); maybe someone can help me.

def formatedBody
  t = self.body.gsub("\u000a","<br/>")
  t = t.gsub("\u000d","<br/>")
  t = t.gsub("\u0009","<br/>")
  t = t.gsub("\u000c","<br/>")
  t = t.gsub("\u0085","<br/>")
  t = t.gsub("\u2028","<br/>")
  t = t.gsub("\u2029","<br/>")
  t = t.gsub(/0A\0A/u,"<br/>")
  return t
end

清浅ˋ旧时光 2024-10-23 13:44:53

The two 0x0A values are the hex representation of line feeds: regular old ASCII line feeds, i.e. "\n\n" in a string.

So t = t.gsub(/\n/, "<br/>") should work.

t = "foo\u000d\u0009\u000c\u0085\u2028\u2029\nbar"
p t

t = t.gsub(/\u000d|\u0009|\u000c|\u0085|\u2028|\u2029|\n/,"<br/>")
puts t

You can replace the list of OR'd characters with a character class:

t = t.gsub(/[\u000d\u0009\u000c\u0085\u2028\u2029\n]/,"<br/>")

Either way, the output would look like:

"foo\r\t\f\u0085\u2028\u2029\nbar"
foo<br/><br/><br/><br/><br/><br/><br/>bar

The reason your

t = t.gsub(/0A\0A/u,"<br/>")

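doesn't work is that the regex is not correct: it looks for the literal text "0A", then the escape \0 (a NUL character), then another "A", rather than for line-feed characters.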

t = t.gsub(/\x0A/,"<br/>")

is an alternative way of writing:

t = t.gsub(/\n/,"<br/>")
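Putting the pieces above together, here is a minimal sketch of a reusable helper. The names LINE_BREAK and breaks_to_br are made up for illustration and are not from the question. It assumes Ruby 1.9 or later, since the \u escapes are not understood by Ruby 1.8. It also makes two choices that differ from the snippets above: "\r\n" is treated as a single break, and the tab character (\u0009) is left alone because it is not a line break.

```ruby
# Hypothetical helper; the constant and method names are assumptions.
# Requires Ruby 1.9+ because of the \u escapes in the regex.
LINE_BREAK = /\r\n|[\n\r\f\u0085\u2028\u2029]/

def breaks_to_br(text)
  # "\r\n" is listed first so a Windows line ending becomes one <br/>, not two.
  text.gsub(LINE_BREAK, "<br/>")
end

puts breaks_to_br("foo\r\nbar\u2028baz\n\nqux")
# prints: foo<br/>bar<br/>baz<br/><br/>qux
```

If tabs should also become breaks, as in the character class above, \t can be added back inside the brackets.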


霓裳挽歌倾城醉 2024-10-23 13:44:53

I had the same issue (using Ruby 1.8.7) and simply solved it with:

t = t.gsub(/\xE2\x80(?:\xA8|\xA9)/, '<br/>')
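The byte pattern in that regex is the UTF-8 encoding of U+2028 (line separator) and U+2029 (paragraph separator), which is why matching raw bytes works on Ruby 1.8, where strings are plain byte sequences. The short check below is only a sketch to make that visible; it assumes Ruby 1.9 or later, and separators_to_br is a hypothetical name used for illustration.

```ruby
# Requires Ruby 1.9+ (for the \u escapes).
# Show the UTF-8 bytes of the two separator characters as hex.
ls_bytes = "\u2028".bytes.map { |b| format("%02X", b) }
ps_bytes = "\u2029".bytes.map { |b| format("%02X", b) }
p ls_bytes  # ["E2", "80", "A8"]
p ps_bytes  # ["E2", "80", "A9"]

# On Ruby 1.9+ the same replacement can be written with code points
# instead of raw bytes. The method name is an assumption.
def separators_to_br(text)
  text.gsub(/[\u2028\u2029]/, "<br/>")
end

puts separators_to_br("minds.\u2028\u2028Next")
# prints: minds.<br/><br/>Next
```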
