Split on different newlines

Posted on 2024-11-17 21:13:52

Right now I'm doing a split on a string and assuming that the newline from the user is \r\n like so:

string.split(/\r\n/)

What I'd like to do is split on either \r\n or just \n.

So what would the regex be to split on either of those?

但可醉心 2024-11-24 21:13:52

Have you tried /\r?\n/ ? The ? makes the \r optional.

Example usage: http://rubular.com/r/1ZuihD0YfF
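
A quick sketch of that pattern applied to a string with mixed line endings. The method name split_lines and the sample text are made up for illustration:

```ruby
# Split on "\r\n" or a bare "\n"; the `?` makes the "\r" optional.
def split_lines(text)
  text.split(/\r?\n/)
end

p split_lines("first\r\nsecond\nthird")  # => ["first", "second", "third"]
```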

幽蝶幻影 2024-11-24 21:13:52

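Ruby has the methods String#each_line and String#lines.

Returns an enumerator:
http://www.ruby-doc.org/core-1.9.3/String.html#method-i-each_line

Returns an array:
http://www.ruby-doc.org/core-2.1.2/String.html#method-i-lines

I haven't tested this against your scenario, but I'd bet it works better than picking the newline characters by hand.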
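
A small sketch of what the two methods return, assuming a sample string with both Windows and Unix line endings (sample_text is a hypothetical helper, not part of the question):

```ruby
# Hypothetical sample with both "\r\n" and "\n" line endings.
def sample_text
  "one\r\ntwo\nthree"
end

# Without a block, each_line returns an Enumerator.
p sample_text.each_line.class  # => Enumerator

# lines returns an Array; each element keeps its line ending.
p sample_text.lines            # => ["one\r\n", "two\n", "three"]
```

Neither call strips the line endings by default, so a chomp is still needed to get bare lines.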

浅忆流年 2024-11-24 21:13:52

# Split on \r\n or just \n
string.split( /\r?\n/ )

Although it doesn't help with this question (where you do need a regex), note that String#split does not require a regex argument. Your original code could also have been string.split( "\r\n" ).
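
To illustrate why the string form is not enough here, a minimal sketch (split_crlf is a made-up name): a String separator is matched literally, so a bare \n is left alone.

```ruby
# A String separator is matched literally, so "\r\n" never splits on a bare "\n".
def split_crlf(text)
  text.split("\r\n")
end

p split_crlf("a\r\nb\nc")  # => ["a", "b\nc"]
```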

有深☉意 2024-11-24 21:13:52

\n is for Unix
\r is for classic Mac OS
\r\n is for Windows

To be safe across operating systems, I would use /\r?\n|\r\n?/, which also matches a lone \r:

"1\r2\n3\r\n4\n\n5\r\r6\r\n\r\n7".split(/\r?\n|\r\n?/)
=> ["1", "2", "3", "4", "", "5", "", "6", "", "7"]

不再让梦枯萎 2024-11-24 21:13:52

The alternation operator in Ruby Regexp is the same as in standard regular expressions: |

So, the obvious solution would be

/\r\n|\n/

which is the same as

/\r?\n/

i.e. an optional \r followed by a mandatory \n.
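
A quick way to convince yourself the two patterns split identically, sketched with a made-up helper name (same_split?) and sample input:

```ruby
# Compare the alternation form with the optional-\r form on the same input.
def same_split?(text)
  text.split(/\r\n|\n/) == text.split(/\r?\n/)
end

p same_split?("a\r\nb\nc\n\nd")  # => true
```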

别忘他 2024-11-24 21:13:52

~~Are you reading from a file, or from standard in?~~

If you're reading from a file, and the file is in text mode, rather than binary mode, or you're reading from standard in, you won't have to deal with \r\n - it'll just look like \n.

C:\Documents and Settings\username>irb
irb(main):001:0> gets
foo
=> "foo\n"

凤舞天涯 2024-11-24 21:13:52

Maybe just split on "\n" and remove the "\r" if it is there?
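
One way that idea might look, as a sketch only (split_and_strip_cr is a made-up name):

```ruby
# Split on "\n", then drop a trailing "\r" from each piece if one is present.
def split_and_strip_cr(text)
  text.split("\n").map { |line| line.chomp("\r") }
end

p split_and_strip_cr("a\r\nb\nc")  # => ["a", "b", "c"]
```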

浪荡不羁 2024-11-24 21:13:52

Another option is to use String#chomp, which also handles newlines intelligently by itself.

You can accomplish what you are after with something like:

lines = string.lines.map(&:chomp)

Or if you are dealing with something large enough that memory use is a concern:

<string|io>.each_line do |line|
  line.chomp!
  #  do work..
end

Performance isn't always the most important thing when solving this kind of problem, but it is worth noting the chomp solution is also a bit faster than using a regex.

On my machine (i7, ruby 2.1.9):

Warming up --------------------------------------
           map/chomp    14.715k i/100ms
  split custom regex    12.383k i/100ms
Calculating -------------------------------------
           map/chomp    158.590k (± 4.4%) i/s -    794.610k in   5.020908s
  split custom regex    128.722k (± 5.1%) i/s -    643.916k in   5.016150s
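
For a runnable version of the each_line form, here is a minimal sketch that accepts either a String or an IO-like object. The helper name chomped_lines and the sample data are assumptions for illustration:

```ruby
require "stringio"

# Collect chomped lines from anything that responds to each_line
# (a String, a File, a StringIO, ...).
def chomped_lines(source)
  result = []
  source.each_line do |line|
    result << line.chomp  # chomp drops a trailing "\r\n", "\n", or "\r"
  end
  result
end

p chomped_lines("a\r\nb\n")                # => ["a", "b"]
p chomped_lines(StringIO.new("a\r\nb\n"))  # => ["a", "b"]
```

One difference from the regex split worth knowing: split drops trailing empty strings, so "a\n\n".split(/\r?\n/) gives ["a"], while "a\n\n".lines.map(&:chomp) gives ["a", ""].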
