How to remove whitespace when scanning text in Java

Posted 2024-08-21 23:24:16


I've implemented several different "scanners" in Java, from the Scanner class to simply using

String.split("\ss+")

but when there are several whitespace characters in a row, as in "the_quick____brown___fox" (imagine the underscores are spaces), they all end up treating some of the whitespace as tokens. Any suggestions?


醉南桥 2024-08-28 23:24:16


I'm not sure what you are talking about. For example,

String[] parts = "the quick    brown   fox".split("\\s+");

correctly tokenizes the string: no token has leading or trailing whitespace, and there are no empty tokens. If the input string may have leading or trailing whitespace, calling String.trim() on it before splitting removes the possibility of empty tokens.
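
To make the point about empty tokens concrete, here is a minimal sketch. The class name TrimThenSplit and the helper tokenize are made up for illustration and are not from the question. It assumes the input has leading whitespace, which is the case where split("\\s+") alone yields an empty first token.

```java
import java.util.Arrays;

class TrimThenSplit {
    // Hypothetical helper: trim first, then split on runs of whitespace.
    static String[] tokenize(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            // "".split("\\s+") would return a single empty token, so handle it here.
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String line = "  the quick    brown   fox ";

        // Without trim(): the leading whitespace produces an empty first token.
        System.out.println(Arrays.toString(line.split("\\s+")));
        // prints [, the, quick, brown, fox]

        // With trim(): only the four words remain.
        System.out.println(Arrays.toString(tokenize(line)));
        // prints [the, quick, brown, fox]
    }
}
```

Trailing whitespace alone does not cause a problem here, because split drops trailing empty strings; it is the leading whitespace that matters.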

EDIT: I surmise from your other comment that you are reading the input a line at a time and then tokenizing the lines. You probably need to trim each line before tokenizing it.
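
If the input really is being read one line at a time, the trim-each-line approach might look roughly like the sketch below. The class LineTokenizer and the method tokenizeLines are hypothetical names, and skipping blank lines is an assumption about what is wanted rather than something stated in the question.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineTokenizer {
    // Hypothetical helper: reads lines, trims each one, and splits it on whitespace runs.
    static List<List<String>> tokenizeLines(Reader source) {
        List<List<String>> result = new ArrayList<>();
        BufferedReader reader = new BufferedReader(source);
        reader.lines().forEach(line -> {
            String trimmed = line.trim();
            // Assumption: blank lines carry no tokens and are skipped.
            if (!trimmed.isEmpty()) {
                result.add(Arrays.asList(trimmed.split("\\s+")));
            }
        });
        return result;
    }

    public static void main(String[] args) {
        String input = "  the quick    brown   fox \n\n\tjumps   over\n";
        System.out.println(tokenizeLines(new StringReader(input)));
        // prints [[the, quick, brown, fox], [jumps, over]]
    }
}
```

A StringReader stands in for the real input source here; a FileReader or an InputStreamReader over System.in could be passed in the same way.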
