
Parsing a string with many spaces in Java

Posted on 2025-01-05 19:37:13 | Words: 96 | Views: 2 | Comments: 0

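I have a string that contains runs of multiple spaces, but when I use a tokenizer it breaks the string apart at every one of those spaces. I need the tokens to include those spaces. How can I use StringTokenizer to return the values together with the delimiters I am splitting on?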
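The behavior described in the question can be reproduced with a small sketch like the one below. The class name `TokenizerDemo`, the helper `tokens`, and the sample input are hypothetical; the three-argument `StringTokenizer` constructor with its `returnDelims` flag is part of the standard library. Treat this as an illustration of what the tokenizer does, not as a recommended solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDemo {
    // Collects every token the tokenizer yields. returnDelims controls
    // whether the delimiter characters come back as tokens too.
    static List<String> tokens(String input, boolean returnDelims) {
        StringTokenizer st = new StringTokenizer(input, " ", returnDelims);
        List<String> out = new ArrayList<>();
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "a   b"; // hypothetical input with a run of three spaces
        System.out.println(tokens(sample, false)); // the spaces are dropped
        System.out.println(tokens(sample, true));  // each space is returned as its own token
    }
}
```

With `returnDelims` set to `true` the spaces are kept, but each one arrives as a separate one-character token, so a run of spaces still has to be reassembled by hand.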


幽梦紫曦～ 2025-01-12 19:37:13

You'll note in the docs for StringTokenizer that its use is discouraged in new code, and that String.split(regex) is what you want:

String foo = "this is      some  data      in   a string";
String[] bar = foo.split("\\s+");

Edit to add: Or, if you need more than a simple split, use the Pattern and Matcher classes for more complex regular expression matching and extraction.
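As a rough sketch of the Pattern and Matcher route, the code below walks the input and collects every run of whitespace or non-whitespace in order. The class name `RunExtractor`, the method `runs`, and the pattern itself are assumptions made for illustration, not something the answer spells out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RunExtractor {
    // Matches either a run of whitespace or a run of non-whitespace.
    private static final Pattern RUN = Pattern.compile("\\s+|\\S+");

    // Returns the runs in the order they appear, whitespace included.
    static List<String> runs(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Brackets make the whitespace runs visible in the output.
        for (String run : runs("this is      some  data")) {
            System.out.println("[" + run + "]");
        }
    }
}
```

Because every character belongs to exactly one run, concatenating the returned list gives back the original string.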

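Edit again: If you want to preserve the spaces, knowing a bit about regular expressions really helps:

String[] bar = foo.split("\\b");

This splits on word boundaries, keeping the whitespace between each pair of words as a String of its own: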

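public class SplitDemo
{
    public static void main( String[] args )
    {
        String foo = "this is      some  data      in   a string";
        // Split at every word boundary so the whitespace runs survive as tokens.
        String[] bar = foo.split("\\b");
        for (String s : bar)
        {
            System.out.print(s);
            // matches() tests the whole token, so no ^ or $ anchors are needed.
            if (s.matches("\\s+"))
            {
                System.out.println("\t<< " + s.length() + " spaces");
            }
            else
            {
                System.out.println();
            }
        }
    }
}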

Output:

this
        << 1 spaces
is
        << 6 spaces
some
        << 2 spaces
data
        << 6 spaces
in
        << 3 spaces
a
        << 1 spaces
string
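One caveat with splitting on a word boundary is that it also splits around punctuation inside a token. A possible alternative, sketched below under the assumption that tokens should only break where whitespace starts or ends, is to split on zero-width lookarounds. The class name `WhitespaceSplit`, both method names, and the sample input are hypothetical.

```java
import java.util.Arrays;

class WhitespaceSplit {
    // Zero-width split points: wherever whitespace meets non-whitespace.
    private static final String EDGE = "(?<=\\S)(?=\\s)|(?<=\\s)(?=\\S)";

    // The word-boundary split, kept here for comparison.
    static String[] splitOnWordBoundary(String input) {
        return input.split("\\b");
    }

    // Splits only at the edges of whitespace runs, keeping the runs as tokens.
    static String[] splitKeepingWhitespace(String input) {
        return input.split(EDGE);
    }

    public static void main(String[] args) {
        String sample = "don't   stop"; // hypothetical input containing punctuation
        System.out.println(Arrays.toString(splitOnWordBoundary(sample)));
        System.out.println(Arrays.toString(splitKeepingWhitespace(sample)));
    }
}
```

For this input the word-boundary split breaks "don't" apart around the apostrophe, while the lookaround split returns "don't", the three-space run, and "stop".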



触ぅ动初心 2025-01-12 19:37:13

It sounds like you may need to use regular expressions (http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/package-summary.html) rather than StringTokenizer.


妄断弥空 2025-01-12 19:37:13

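Use String.split("\\s+") instead of StringTokenizer.

Note that this only extracts the runs of non-whitespace characters that are separated by at least one whitespace character. If you want the leading/trailing whitespace to be included with the non-whitespace characters, that calls for a completely different solution!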
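To make the leading/trailing point concrete, here is a minimal sketch of how split("\\s+") treats padded input. The class name `SplitEdgeCases`, the two helpers, and the sample string are hypothetical.

```java
import java.util.Arrays;

class SplitEdgeCases {
    // Plain split: leading whitespace produces an empty first element.
    static String[] splitOnWhitespace(String input) {
        return input.split("\\s+");
    }

    // Trimming first removes that empty element.
    static String[] trimThenSplit(String input) {
        return input.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String padded = "  this is   data  "; // hypothetical input with padding
        System.out.println(Arrays.toString(splitOnWhitespace(padded)));
        System.out.println(Arrays.toString(trimThenSplit(padded)));
    }
}
```

The first call returns an empty string followed by "this", "is", "data": trailing empty strings are discarded by split, but the leading one is not. In both cases the whitespace itself is gone, which is why keeping it needs a different approach.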

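That requirement isn't clear from your original question, and there is a pending edit that tries to clarify it.

StringTokenizer is the wrong tool in almost every non-contrived case.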


你げ笑在眉眼 2025-01-12 19:37:13

I think it would work well to first use replaceAll to collapse every run of multiple spaces into a single space, and then use split to tokenize.
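A minimal sketch of that suggestion follows. The class name `CollapseThenSplit`, the method `tokenize`, and the exact regex are assumptions, since the comment does not give code.

```java
import java.util.Arrays;

class CollapseThenSplit {
    // Collapse each run of two or more spaces into one space,
    // then split on the remaining single spaces.
    static String[] tokenize(String input) {
        return input.replaceAll(" {2,}", " ").split(" ");
    }

    public static void main(String[] args) {
        String foo = "this is      some  data      in   a string";
        System.out.println(Arrays.toString(tokenize(foo)));
    }
}
```

This yields only the words, so the original length of each run of spaces is lost. It fits when the spacing does not matter, but probably not when the tokens must keep their spaces as the question asks.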


About the author

小ぇ时光︴
