Java Scanner question

Posted on 2024-08-16 13:16:15

How do you set the delimiter for a Scanner to either ; or a newline?

I tried:
Scanner.useDelimiter(Pattern.compile("(\n)|;"));
But it doesn't work.

Comments (3)

情归归情 2024-08-23 13:16:15

As a general rule, when a regex pattern is written as a Java string literal, you need to double each backslash (\).

So, try

scanner.useDelimiter(Pattern.compile("(\\n)|;"));

or

scanner.useDelimiter(Pattern.compile("[\\n;]"));

Edit: If \r\n is the problem, you might want to try this:

scanner.useDelimiter(Pattern.compile("[\\r\\n;]+"));

which matches one or more of \r, \n, and ;.

Note: I haven't tried these.
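
For anyone who wants to check the last pattern, here is a minimal sketch. The class name DelimiterSketch, the helper tokens, and the sample input are made up for illustration and are not from the question. Note that useDelimiter is an instance method, so it is called on a Scanner object.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and sample input are assumptions for illustration.
class DelimiterSketch {

    // Collects every token when the delimiter is one or more of \r, \n, or ;.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(Pattern.compile("[\\r\\n;]+"));
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The input mixes a semicolon, a CRLF, and a bare LF.
        System.out.println(tokens("a;b\r\nc\nd")); // prints [a, b, c, d]
    }
}
```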

一笔一画续写前缘 2024-08-23 13:16:15

As you've discovered, you needed to look for DOS/network style \r\n (CRLF) line separators instead of the Unix style \n (LF only). But what if the text contains both? That happens a lot; in fact, when I view the source of this very page I see both varieties.

You should get in the habit of looking for both kinds of separator, as well as the older Mac style \r (CR only). Here's one way to do that:

\r?\n|\r

Plugging that into your sample code you get:

scanner.useDelimiter(";|\r?\n|\r");

This is assuming you want to match exactly one newline or semicolon at a time. If you want to match one or more you can do this instead:

scanner.useDelimiter("[;\r\n]+");

Notice, too, that I passed in a regex string instead of a Pattern; useDelimiter(String) compiles the pattern for you, and since the delimiter is set only once, pre-compiling the regex yourself doesn't get you any performance gain.
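
The difference between the two delimiters shows up when separators are adjacent. The sketch below is only an illustration under my own assumptions: the class name SeparatorSketch, the helper tokens, and the sample input are hypothetical. With the one-at-a-time pattern, the empty stretch between two neighbouring separators comes back as an empty token; with the one-or-more pattern it does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; names and sample input are assumptions for illustration.
class SeparatorSketch {

    // Matches exactly one semicolon or one line ending (CRLF, LF, or CR) at a time.
    static final String SINGLE = ";|\r?\n|\r";

    // Matches a run of one or more semicolons, CRs, or LFs.
    static final String RUN = "[;\r\n]+";

    // Collects every token the Scanner produces for the given delimiter regex.
    static List<String> tokens(String input, String delimiter) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiter);
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "1;;2\r\n3";
        // SINGLE keeps the empty token between the two adjacent semicolons.
        System.out.println(tokens(input, SINGLE));
        // RUN treats the adjacent semicolons as one separator.
        System.out.println(tokens(input, RUN));
    }
}
```

Which one to use depends on whether empty fields carry meaning in your data.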

三生池水覆流年 2024-08-23 13:16:15

Looking at the OP's comment, it looks like it was a different line ending (\r\n or CRLF) that was the problem.

Here's my answer, which handles runs of semicolons and line endings in either format (which may or may not be what you want):

scanner.useDelimiter(Pattern.compile("([\n;]|(\r\n))+"));

e.g. an input file that looks like this:

1


2;3;;4
5

would result in 1,2,3,4,5

I tried both \n and \\n, and both worked in my case. In a Java string literal, "\n" is an actual newline character, which the regex matches literally, while "\\n" reaches the regex engine as the escape \n, which also matches a newline. I agree that if you need a literal backslash in the pattern you have to double it, since the backslash is the escape character.
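
To make the last point concrete, here is a small sketch that runs the sample input above through both spellings of the delimiter. The class name EscapeSketch, the constant names, and the helper tokens are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class; names are assumptions for illustration.
class EscapeSketch {

    // The Java compiler turns \n and \r into real LF and CR characters inside the string.
    static final String LITERAL = "([\n;]|(\r\n))+";

    // The string keeps the backslashes, so the regex engine sees the escapes \n and \r.
    static final String ESCAPED = "([\\n;]|(\\r\\n))+";

    // Collects every token the Scanner produces for the given delimiter regex.
    static List<String> tokens(String input, String delimiter) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(Pattern.compile(delimiter));
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Same content as the sample file: blank lines, a doubled semicolon, a trailing newline.
        String input = "1\n\n\n2;3;;4\n5\n";
        System.out.println(tokens(input, LITERAL)); // prints [1, 2, 3, 4, 5]
        System.out.println(tokens(input, ESCAPED)); // prints [1, 2, 3, 4, 5]
    }
}
```

Neither spelling treats a lone \r as a separator; if old Mac-style line endings can occur, the pattern would need an extra alternative for it.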
