Why doesn't the String class in Java implement Iterable?
Many Java framework classes implement Iterable, but String does not. It makes sense to iterate over the characters in a String, just as one can iterate over the items in a regular array.
Is there a reason why String does not implement Iterable?
Comments (8)
There really isn't a good answer. An iterator in Java specifically applies to a collection of discrete items (objects). You would think that a String, which implements CharSequence, should be a "collection" of discrete characters. Instead, it is treated as a single entity that happens to consist of characters.
In Java, it seems that iterators are only really applied to collections and not to strings. There is no reason why it is this way (as near as I can tell; you would probably have to talk to Gosling or the API writers); it appears to be a convention or a design decision. Indeed, there is nothing preventing CharSequence from implementing Iterable.
That said, you can still iterate over the characters in a string in several ways.
Also note that you cannot modify a character of a String in place because Strings are immutable. The mutable companion to a String is StringBuilder (or the older StringBuffer).
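
The code samples from this answer are not shown here, so the following is only a sketch of three common ways to walk the characters of a String using standard JDK calls (charAt, toCharArray, and, on Java 8 or later, chars()). The class and method names are made up for illustration.

```java
class CharLoops {
    // Index-based loop; charAt(i) reads the i-th char directly.
    static String viaCharAt(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            out.append(s.charAt(i));
        }
        return out.toString();
    }

    // For-each over an array; note that toCharArray() copies the characters.
    static String viaCharArray(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            out.append(c);
        }
        return out.toString();
    }

    // Java 8+: chars() returns an IntStream whose elements are char values widened to int.
    static String viaChars(String s) {
        StringBuilder out = new StringBuilder();
        s.chars().forEach(i -> out.append((char) i));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaCharAt("abcd"));
        System.out.println(viaCharArray("abcd"));
        System.out.println(viaChars("abcd"));
    }
}
```

Each method simply rebuilds the input, which makes it easy to check that all three visit the same characters in the same order.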
EDIT
To clarify, based on the comments on this answer: I'm trying to explain a possible rationale for why there is no Iterator on a String. I'm not trying to say that it's not possible; indeed, I think it would make sense for CharSequence to implement Iterable.
String provides CharSequence, which, if only conceptually, is different from a String. A String is usually thought of as a single entity, whereas a CharSequence is exactly that: a sequence of characters. It would make sense to have an iterator on a sequence of characters (i.e., on CharSequence), but not simply on a String itself.
As Foxfire has rightly pointed out in the comments, String implements the CharSequence interface, so type-wise, a String is a CharSequence. Semantically, it seems to me that they are two separate things. I'm probably being pedantic here, but when I think of a String I usually think of it as a single entity that happens to consist of characters. Consider the difference between the sequence of digits 1, 2, 3, 4 and the number 1234. Now consider the difference between the string abcd and the sequence of characters a, b, c, d. I'm trying to point out this difference.
In my opinion, asking why String doesn't have an iterator is like asking why Integer doesn't have an iterator so that you can iterate over the individual digits.
The reason is simple: the String class is much older than Iterable.
And obviously nobody ever wanted to add the interface to String (which is somewhat strange, because it does implement CharSequence, which is based on exactly the same idea).
However, it would be somewhat inefficient, because an Iterable returns objects, so every char returned would have to be boxed.
Edit: Just as a comparison: .NET does support enumerating a String; however, in .NET the equivalent of Iterable also works on native types, so no wrapping is required as it would be in Java.
For what it's worth, my coworker Josh Bloch strongly wishes to add this feature to Java 7:
for (char c : aString) { ... }
and
for (int codePoint : aString) { ... }
This would be the easiest way to loop over chars and over logical characters (code points) ever. It wouldn't require making String implement Iterable, which would force boxing to happen.
Without that language feature, there's not going to be a really good answer to this problem. And he seems very optimistic that he can get this to happen, but I'm not sure.
One of the main reasons for making String implement Iterable is to enable the simple for(each) loop, as mentioned above. So, a reason for not making String implement Iterable could be the inherent inefficiency of a naïve implementation, since it requires boxing the result. However, if the implementation of the resulting Iterator (as returned by String.iterator()) is final, the compiler could special-case it and generate byte-code free from boxing/unboxing.
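
The compiler special-casing described here is hypothetical. For comparison, Java 8 and later already offer a way to iterate without producing Character objects at the call site: String.chars() returns an IntStream, and its iterator is a PrimitiveIterator.OfInt with a nextInt() method. The sketch below assumes an invented helper name and makes no claim about what the JIT does internally.

```java
import java.util.PrimitiveIterator;

class UnboxedIteration {
    // Counts upper-case chars using the primitive iterator of chars().
    static int countUpperCase(String s) {
        int count = 0;
        PrimitiveIterator.OfInt it = s.chars().iterator();
        while (it.hasNext()) {
            char c = (char) it.nextInt();   // nextInt() returns a primitive int
            if (Character.isUpperCase(c)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countUpperCase("Hello World"));
    }
}
```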
If you are really interested in iterating, you can still loop over the string's characters yourself.
They simply forgot to do so.
I'm not sure why this is still not implemented in 2020. My guess would be that Strings are given so much special treatment in Java (the compiler overloading the + operator for string concatenation, string literals, string constants stored in a common pool, etc.) that this feature might be harder to implement than it looks (or it might interfere with too many things to be worth the effort from the implementers' point of view).
On the other hand, implementing something close to this is not too much work. I wanted this in one of my projects, so I wrote a few simple classes for it.
This looks like a lot for such a simple thing, but it allows strings to be treated like an iterable array of characters and to work transparently with methods designed to work on collections of things (lists, sets, etc.).
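
The classes written for this answer are not shown here. The following is one possible sketch of the idea, not the author's code: a small wrapper, hypothetically named StringChars, that exposes a CharSequence as an Iterable<Character>. Each element is boxed, which is the cost discussed in the other answers.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical adapter: lets a String (or any CharSequence) be used in a for-each loop.
final class StringChars implements Iterable<Character> {
    private final CharSequence text;

    private StringChars(CharSequence text) {
        this.text = text;
    }

    static StringChars of(CharSequence text) {
        return new StringChars(text);
    }

    @Override
    public Iterator<Character> iterator() {
        return new Iterator<Character>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < text.length();
            }

            @Override
            public Character next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return text.charAt(index++); // the char is boxed to Character here
            }
        };
    }

    public static void main(String[] args) {
        for (char c : StringChars.of("abcd")) {
            System.out.println(c);
        }
    }
}
```

Because the wrapper is an ordinary Iterable, it can also be passed to any method that accepts Iterable<Character>.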
Iterable of what? Iterable<Integer> would make most sense, where each element represents a Unicode codepoint. Even Iterable<Character> would be slow and pointless when we have toCharArray.
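
As an illustration of the Iterable<Integer> idea, and assuming Java 8 or later, a view over a string's code points can be built from String.codePoints(), since the stream's iterator is an Iterator<Integer>. The helper name below is invented, and elements are boxed when read through the Iterable interface.

```java
class CodePointIterable {
    // Returns a re-iterable view; each call to iterator() opens a fresh stream.
    static Iterable<Integer> codePointsOf(String s) {
        return () -> s.codePoints().iterator();
    }

    public static void main(String[] args) {
        for (int cp : codePointsOf("a\uD83D\uDE00b")) {
            System.out.println(Integer.toHexString(cp));
        }
    }
}
```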