Is it better to use regex or StringTokenizer to find the author and book title: William Faulkner - 'Light In August'

Posted 2024-09-27 15:36:06

Is it better to use regex or StringTokenizer to separate the author and title in this string:

William Faulkner - 'Light In August'

Is this the simplest regex that would work?

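Pattern pattern = Pattern.compile("^\\s*([^-]+)-(.*)$");
Matcher matcher = pattern.matcher("William Faulkner - 'Light In August'");
// group() throws IllegalStateException unless a match has been attempted first
if (matcher.matches()) {
    String author = matcher.group(1).trim();
    // group(2) needs a second capture group in the pattern
    String bookTitle = matcher.group(2).trim();
}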

Is that overkill, or is there a simpler way to do this with a StringTokenizer?

Basically I'm looking for the most transparent and maintainable solution since I don't have a good understanding of regex and got help with the one above.


Comments (3)

╰◇生如夏花灿烂 2024-10-04 15:36:06

How much control do you have over the input? Can you guarantee that author and title will always be separated by " - " (a space, a dash, and a space)? Do you know for sure that the author won't contain " - "? And so on.

If the input is quite rigid, then you can simply use String#split(), which should make it very clear what you're doing. Don't use a StringTokenizer; its API documentation says:

StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.

Mark Byers' answer shows you how to use split().

However, if you have to worry about more variation in the input (e.g., can the amount of whitespace around the dash vary, or be missing altogether?), then a regex will be the more concise option. The tradeoff then is code readability and clarity of intent.
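As a rough sketch of the variable-whitespace case, a regex separator can be passed to Pattern#split with a limit of 2. The class and method names here (LooseDashSplit, splitAuthorTitle) are made up for illustration, and the sketch assumes the first dash in the line is always the separator, so it would still break on a hyphenated author name.

```java
import java.util.regex.Pattern;

final class LooseDashSplit {
    // A dash with any amount of whitespace (including none) on either side.
    private static final Pattern SEPARATOR = Pattern.compile("\\s*-\\s*");

    /** Returns {author, title}, or null when the input contains no dash. */
    static String[] splitAuthorTitle(String input) {
        // Limit 2: split only at the first dash and keep later dashes in the title.
        String[] parts = SEPARATOR.split(input.trim(), 2);
        return parts.length == 2 ? parts : null;
    }

    public static void main(String[] args) {
        String[] parts = splitAuthorTitle("William Faulkner-'Light In August'");
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```

The pattern is short, but a reader has to know what `\\s*` means to see the intent, which is the readability tradeoff mentioned above.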

明月夜 2024-10-04 15:36:06

It depends on what the input looks like. Your regex, for example, would fail on author names that contain a hyphen.

Perhaps something like

Pattern.compile("^\\s*(.*?)\\s+-\\s+'(.*)'\\s*$")

might fit a little better.
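To see how that pattern behaves, here is a minimal sketch that wraps it in a helper. The names QuotedTitleParser and parse are hypothetical, and the hyphenated sample input is only an illustration. It assumes the title is always wrapped in single quotes and returns null otherwise.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedTitleParser {
    // Group 1: author (lazy, so it ends at the first whitespace-dash-whitespace
    // that is followed by a quoted title). Group 2: title without the quotes.
    private static final Pattern LINE =
            Pattern.compile("^\\s*(.*?)\\s+-\\s+'(.*)'\\s*$");

    /** Returns {author, title}, or null when the input does not fit the pattern. */
    static String[] parse(String input) {
        Matcher m = LINE.matcher(input);
        // matches() must succeed before group() may be called.
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("Jean-Paul Sartre - 'No Exit'");
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```

Because the separator requires whitespace on both sides of the dash, the hyphen inside the author's name is not treated as a separator.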

热风软妹 2024-10-04 15:36:06

How about using String.split?

String s = "William Faulkner - 'Light In August'";
String[] parts = s.split(" - ", 2);
String author = parts[0];
String title = parts[1];


One thing to watch out for is that some authors' names and book titles contain hyphens so splitting just on a hyphen won't always work in general.
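A small sketch of why the spaces in the " - " delimiter matter, and of what to do when the delimiter is missing. The class name SplitWithCheck and the sample inputs are illustrative only, and throwing IllegalArgumentException is just one possible way to report bad input.

```java
final class SplitWithCheck {
    /** Splits on the first " - " and rejects input that has no such separator. */
    static String[] splitAuthorTitle(String s) {
        // Limit 2 keeps any later " - " inside the title part.
        String[] parts = s.split(" - ", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("No \" - \" separator in: " + s);
        }
        return parts;
    }

    public static void main(String[] args) {
        // The hyphen in the name has no surrounding spaces, so it is not a split point.
        String[] parts = splitAuthorTitle("Jean-Paul Sartre - 'No Exit'");
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```

Without the length check, a line that lacks the separator would fail later with an ArrayIndexOutOfBoundsException at parts[1], which is harder to diagnose.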
