Update: 2009-05-29
Thanks for all the suggestions and advice. I used your suggestions to make my production code execute 2.5 times faster on average than my best result a couple of days ago. In the end I was able to make the java code the fastest.
Lessons:
My example code below shows the insertion of primitive ints but the production code is actually storing strings (my bad). When I corrected that, the Python execution time went from 2.8 seconds to 9.6. So right off the bat, Java was actually faster when storing objects.
But it doesn't stop there. I had been executing the java program as follows:
java -Xmx1024m SpeedTest
But if you set the initial heap size as follows you get a huge improvement:
java -Xms1024m -Xmx1024m SpeedTest
This simple change reduced the execution time by more than 50%. So the final result for my SpeedTest is python 9.6 seconds. Java 6.5 seconds.
Original Question:
I had the following python code:
import time
import sys

def main(args):
    iterations = 10000000
    counts = set()
    startTime = time.time();
    for i in range(0, iterations):
        counts.add(i)
    totalTime = time.time() - startTime
    print 'total time =',totalTime
    print len(counts)

if __name__ == "__main__":
    main(sys.argv)
And it executed in about 3.3 seconds on my machine, but I wanted to make it faster, so I decided to program it in Java. I assumed that because Java is compiled and is generally considered to be faster than Python, I would see some big paybacks.
Here is the java code:
import java.util.*;

class SpeedTest
{
    public static void main(String[] args)
    {
        long startTime;
        long totalTime;
        int iterations = 10000000;
        HashSet counts = new HashSet((2*iterations), 0.75f);
        startTime = System.currentTimeMillis();
        for(int i=0; i<iterations; i++)
        {
            counts.add(i);
        }
        totalTime = System.currentTimeMillis() - startTime;
        System.out.println("TOTAL TIME = "+( totalTime/1000f) );
        System.out.println(counts.size());
    }
}
So this java code does basically the same thing as the python code. But it executed in 8.3 seconds instead of 3.3.
I have extracted this simple example from a real-world example to simplify things. The critical element is that I have a collection (a set or HashSet) that ends up with a lot of members, much like the example.
Here are my questions:
How come my python implementation is faster than my java implementation?
Is there a better data structure to use than the hashSet (java) to hold a unique collection?
What would make the python implementation faster?
What would make the java implementation faster?
UPDATE:
Thanks to all who have contributed so far. Please allow me to add some details.
I have not included my production code because it is quite complex and would generate a lot of distraction. The case I present above is the most simplified possible. By that I mean that the Java put call seems to be much slower than the Python set's add().
The java implementation of the production code is also about 2.5 - 3 times slower than the python version -- just like the above.
I am not concerned about vm warmup or startup overhead. I just want to compare the code from my startTime to my totalTime. Please do not concern yourselves with other matters.
I initialized the hashset with more than enough buckets so that it should never have to rehash. (I will always know ahead of time how many elements the collection will ultimately contain.) I suppose one could argue that I should have initialized it to iterations/0.75. But if you try it you will see that execution time is not significantly impacted.
I set Xmx1024m for those that were curious (my machine has 4GB of ram).
I am using java version: Java(TM) SE Runtime Environment (build 1.6.0_13-b03).
In the production version I am storing a string (2-15 chars) in the HashSet, so I cannot use primitives, although that is an interesting case.
I have run the code many, many times. I have very high confidence that the python code is between 2.5 and 3 times faster than the java code.
Answers (20)
You're not really testing Java vs. Python, you're testing java.util.HashSet using autoboxed Integers vs. Python's native set and integer handling. Apparently, the Python side in this particular microbenchmark is indeed faster.
I tried replacing HashSet with TIntHashSet from GNU Trove and achieved a speedup factor between 3 and 4, bringing Java slightly ahead of Python.
The real question is whether your example code is really as representative of your application code as you think. Have you run a profiler and determined that most of the CPU time is spent in putting a huge number of ints into a HashSet? If not, the example is irrelevant. Even if the only difference is that your production code stores other objects than ints, their creation and the computation of their hashcode could easily dominate the set insertion (and totally destroy Python's advantage in handling ints specially), making this whole question pointless.
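For reference, a minimal sketch of that Trove variant (assuming GNU Trove 2.x on the classpath; TIntHashSet stores primitive ints, so no boxing occurs):

import gnu.trove.TIntHashSet;

class TroveSpeedTest
{
    public static void main(String[] args)
    {
        int iterations = 10000000;
        // Presized like the HashSet in the question, but holding primitive ints directly.
        TIntHashSet counts = new TIntHashSet(2 * iterations);
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++)
        {
            counts.add(i);
        }
        long totalTime = System.currentTimeMillis() - startTime;
        System.out.println("TOTAL TIME = " + (totalTime / 1000f));
        System.out.println(counts.size());
    }
}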
I suspect that Python uses the integer value itself as its hash, and the hashtable-based implementation of set uses that value directly. From the comments in the source:
This microbenchmark is somewhat of a best case for Python because it results in exactly zero hash collisions. Whereas, if Java's HashSet is rehashing the keys, it has to perform that additional work and also gets much worse behavior with collisions.
If you store the range(iterations) in a temporary variable and do a random.shuffle on it before the loop, the runtime is more than 2x slower, even if the shuffle and list creation are done outside the loop.
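A rough sketch of that shuffle experiment (my reconstruction of the described setup, not the answerer's exact code):

import random
import time

iterations = 10000000
values = range(iterations)
random.shuffle(values)              # shuffle and list creation happen outside the timed loop

counts = set()
startTime = time.time()
for i in values:                    # insert in random order instead of ascending order
    counts.add(i)
print 'total time =', time.time() - startTime
print len(counts)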
Another possible explanation is that sets in Python are implemented natively in C code, while HashSets in Java are implemented in Java itself. So, sets in Python should be inherently much faster.
It has generally been my experience that python programs run faster than java programs, despite the fact that java is a somewhat "lower level" language. Incidentally, both languages are compiled into byte code (that's what those .pyc files are -- you can think of them as kind of like .class files). Both languages are byte-code interpreted on a virtual stack machine.
You would expect python to be slower at things like, for example, a.b. In java, that a.b will resolve into a dereference. Python, on the other hand, has to do one or more hash table lookups: check the local scope, check the module scope, check global scope, check builtins.
On the other hand, java is notoriously bad at certain operations such as object creation (which is probably the culprit in your example) and serialization.
In summary, there's no simple answer. I wouldn't expect either language to be faster for all code examples.
Correction: several people have pointed out that java isn't so bad at object creation any more. So, in your example, it's something else. Perhaps it's autoboxing that's expensive, perhaps python's default hashing algorithm is better in this case. In my practical experience, when I rewrite java code in python, I always see a performance increase, but that could be as much due to the rewrite itself as to the language, since rewriting in general tends to lead to performance improvements.
I'd like to dispel a couple of myths I saw in the answers:
Java is compiled, yes, to bytecode but ultimately to native code in most runtime environments. People who say C is inherently faster aren't telling the entire story; I could make a case that byte-compiled languages are inherently faster because the JIT compiler can make machine-specific optimizations that are unavailable to way-ahead-of-time compilers.
A number of things that could make the difference are:
Edit: A TreeSet might be faster for the real use case, depending on allocation patterns. My comments below deal only with this simplified scenario. However, I do not believe that it would make a very significant difference. The real issue lies elsewhere.
Several people here have recommended replacing the HashSet with a TreeSet. This sounds like very strange advice to me, since there's no way that a data structure with O(log n) insertion time is faster than an O(1) structure that preallocates enough buckets to store all the elements.
Here's some code to benchmark this:
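A minimal sketch of such a HashSet-vs-TreeSet insertion comparison (a reconstruction, not necessarily the original benchmark) might look like:

import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SetBenchmark
{
    static long timeInsert(Set<Integer> set, int iterations)
    {
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++)
        {
            set.add(i);
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args)
    {
        int iterations = 10000000;
        // HashSet is presized so it never rehashes; TreeSet needs no sizing but pays O(log n) per insert.
        System.out.println("HashSet: " + timeInsert(new HashSet<Integer>(2 * iterations, 0.75f), iterations) + " ms");
        System.out.println("TreeSet: " + timeInsert(new TreeSet<Integer>(), iterations) + " ms");
    }
}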
And here's the result on my machine:
Several people also argued that boxing isn't a performance issue and that object creation is inexpensive. While it's true that object creation is fast, it's definitely not as fast as primitives:
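As a rough illustration of that boxing cost (again a sketch, not the answerer's measurement), compare filling a primitive int array with filling an array of boxed Integers:

class BoxingTest
{
    public static void main(String[] args)
    {
        int n = 10000000;

        long start = System.currentTimeMillis();
        int[] primitives = new int[n];
        for (int i = 0; i < n; i++)
        {
            primitives[i] = i;      // plain int store, no allocation
        }
        System.out.println("primitive ints: " + (System.currentTimeMillis() - start) + " ms");

        start = System.currentTimeMillis();
        Integer[] boxed = new Integer[n];
        for (int i = 0; i < n; i++)
        {
            boxed[i] = i;           // autoboxing allocates an Integer object for values outside the small cache
        }
        System.out.println("boxed Integers: " + (System.currentTimeMillis() - start) + " ms");
    }
}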
Result:
Moreover, creating a lot of objects results in additional overhead from the garbage collector. This becomes significant when you start keeping tens of millions of live objects in memory.
I find benchmarks like this to be meaningless. I don't solve problems that look like the test case. It's not terribly interesting.
I'd much rather see a solution for a meaningful linear algebra solution using NumPy and JAMA. Maybe I'll try it and report back with results.
I'm not too familiar with python, but I do know HashSet can't contain primitives, so when you say counts.add(i) the i there is getting autoboxed into a new Integer(i) call. That's probably your problem.
If for some reason you really need a 'set' of integers between 0 and some large n, it's probably best declared as a 'boolean[] set = new boolean[n]'. Then you could go through the array and mark items that are in the set as 'true' without incurring the overhead of creating n Integer wrapper objects. If you wanted to go further than that you could use a byte[] of size n/8 and use the individual bits directly. Or perhaps BigInteger.
EDIT
Stop voting my answer up. It's wrong.
EDIT
No really, it's wrong. I get comparable performance if I do what the question suggests and populate the set with N Integers. If I replace the contents of the for loop with this:
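Presumably the replacement was something along these lines -- boxing the value but skipping the set insert (a hypothetical reconstruction, not the answerer's exact snippet):

// Drop-in variant of the loop from the SpeedTest in the question:
for (int i = 0; i < iterations; i++)
{
    Integer boxed = new Integer(i);                         // still pay for the allocation...
    if (boxed.intValue() != i) throw new AssertionError();  // ...but never touch the HashSet
}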
Then it only takes 2 seconds. If you don't store the Integer at all then it takes less than 200 millis. Forcing the allocation of 10000000 Integer objects does take some time, but it looks like most of the time is spent inside the HashSet put operation.
There are a number of issues here which I'd like to bring together.
First, if it's a program that you are only going to run once, does it matter that it takes an extra few seconds?
Secondly, this is just one microbenchmark. Microbenchmarks are pointless for comparing performance.
Startup has a number of issues.
The Java runtime is much bigger than Python's, so it takes longer to load from disk and takes up more memory, which may be important if you are swapping.
If you haven't set -Xms you may be running the GC only to resize the heap. Might as well have the heap properly sized at the start.
It is true that Java starts off interpreting and then compiles. Around 1,500 iterations for Sun client [C1] Hotspot and 10,000 for server [C2]. Server Hotspot will give you better performance eventually, but takes more memory. We may see client Hotspot use server for very frequently executed code for the best of both worlds. However, this should not usually be a question of seconds.
Most importantly, you may be creating two objects per iteration. For most code, you wouldn't be creating these tiny objects for such a proportion of the execution. TreeSet may do better on the number-of-objects score, with 6u14 and Harmony getting even better.
Python may possibly be winning by storing small integer objects in references instead of actually having an object. That is undoubtedly a good optimisation.
A problem with a lot of benchmarks is that you are mixing a lot of different code up in one method. You wouldn't write code you cared about like that, would you? So why are you attempting to performance-test something unlike the code you would actually like to run fast?
Better data structure: Something like BitSet would seem to make sense (although that has synchronisation on it, which may or may not impact performance).
You need to run it multiple times to get a real idea of "how fast" each runs. The JVM startup time [for one] is adding to the single running time of the Java version.
You're also creating a HashSet with a large initial capacity, which means the backing HashMap will be created with that many available slots, unlike the Python where you create a basic Set. Hard to tell if that would hinder though, as when your HashSet grows it will have to reallocate the stored objects.
Are you using the -server flag with the jvm? You can't test for performance without it. (You also have to warm up the jvm before doing the test.)
Also, you probably want to use TreeSet<Integer>. HashSet will be slower in the long run.
And which jvm are you using? The newest I hope.
EDIT
When I say use TreeSet, I mean in general, not for this benchmark. TreeSet handles the real-world issue of non-even hashing of objects. If you get too many objects in the same bin in a HashSet, performance will be about O(n).
If you really want to store primitive types in a set, and do heavy work on it, roll your own set in Java. The generic classes are not fast enough for scientific computing.
As Ants Aasma mentions, Python bypasses hashing and uses the integer directly. Java creates an Integer object (autoboxing), and then casts it to an Object (in your implementation). This object must be hashed, as well, for use in a hash set.
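To make the "roll your own" suggestion concrete, here is a minimal sketch of a primitive int set (hypothetical, illustration only): open addressing with linear probing, no removal, no resizing, and -1 used as the empty marker, so it only handles non-negative values like the ones in this benchmark.

class IntSet
{
    private final int[] slots;
    private int size;

    IntSet(int expectedElements)
    {
        // Power-of-two table comfortably larger than the expected element count.
        int tableSize = Integer.highestOneBit(expectedElements) * 4;
        slots = new int[tableSize];
        java.util.Arrays.fill(slots, -1);   // -1 marks an empty slot
    }

    boolean add(int value)
    {
        int mask = slots.length - 1;
        int index = value & mask;           // the int is its own hash, as in CPython
        while (slots[index] != -1)
        {
            if (slots[index] == value) return false;   // already present
            index = (index + 1) & mask;                // linear probing
        }
        slots[index] = value;
        size++;
        return true;
    }

    int size() { return size; }
}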
For a fun comparison, try this:
Java
Results:
Python
Results:
How's that for 'python is faster than java'?
How much memory did you start the JVM with? It depends? When I run the JVM with your program with 1 Gig of RAM:
If I run the JVM with less memory, it takes longer ... considerably longer:
I think the HashSet is the performance bottleneck in this particular instance. If I replace the HashSet with a LinkedList, the program gets substantially faster.
Finally -- note that Java programs are initially interpreted and only those methods that are called many times are compiled. Thus, you're probably comparing Python to Java's interpreter, not the compiler.
Just a stab in the dark here, but some optimizations that Python is making that Java probably isn't:
A few changes for faster Python.
Before
After
Now some good old cheating and ...
Well, if you're going to tune the Java program, you might as well tune the Python program too.
Just using xrange makes it about 8% faster on my machine. And the expression set(xrange(10000000)) builds exactly the same set, but 2.5x faster (from 1.87 seconds to 0.74).
I like how tuning a Python program makes it shorter. :) But Java can do the same trick. As everyone knows, if you want a dense set of smallish integers in Java, you don't use a hash table. You use a java.util.BitSet!
That should be fairly quick. Unfortunately I don't have the time to test it right now.
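The answer stops short of actually trying it; a minimal sketch of the BitSet version of the timed loop (my own sketch, not the answerer's code) could look like:

import java.util.BitSet;

class BitSetSpeedTest
{
    public static void main(String[] args)
    {
        int iterations = 10000000;
        // One bit per possible value; setting the bit is the membership insert.
        BitSet counts = new BitSet(iterations);
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++)
        {
            counts.set(i);
        }
        long totalTime = System.currentTimeMillis() - startTime;
        System.out.println("TOTAL TIME = " + (totalTime / 1000f));
        System.out.println(counts.cardinality());   // number of set bits, i.e. the size of the "set"
    }
}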
I agree with Gandalf about the startup time. Also, you are allocating a huge HashSet, which is not at all similar to your python code. I imagine that if you put this under a profiler, a good chunk of time would be spent there. Also, inserting new elements is really going to be slow at this size. I would look into TreeSet as suggested.
The biggest issue is probably that what the given code measures is wall time -- what your watch measures -- but what should be measured to compare code runtimes is process time -- the amount of time the CPU spends executing that particular code and not other tasks.
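One way to get per-thread CPU time rather than wall time on the JVM is the java.lang.management API; a sketch (CPU-time measurement support varies by platform, so this is illustrative only):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.HashSet;

class CpuTimeSpeedTest
{
    public static void main(String[] args)
    {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int iterations = 10000000;
        HashSet<Integer> counts = new HashSet<Integer>(2 * iterations, 0.75f);

        long startCpu = threads.getCurrentThreadCpuTime();   // nanoseconds of CPU time, not wall-clock time
        for (int i = 0; i < iterations; i++)
        {
            counts.add(i);
        }
        long cpuNanos = threads.getCurrentThreadCpuTime() - startCpu;
        System.out.println("CPU TIME = " + (cpuNanos / 1e9) + " s");
        System.out.println(counts.size());
    }
}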
You can make the Java microbenchmark much faster by adding just a simple little extra.
becomes
Simple, faster and gets the same result.
You might want to see if you can "prime" the JIT compiler into compiling the section of code you're interested in, perhaps by running it as a function once beforehand and sleeping briefly afterwards. This might allow the JVM to compile the function down to native code.
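A sketch of that warm-up idea (my illustration of the suggestion, not code from the answer): run the same work once untimed, pause briefly, then time the second run.

import java.util.HashSet;

class WarmupSpeedTest
{
    static long fillSet(int iterations)
    {
        HashSet<Integer> counts = new HashSet<Integer>(2 * iterations, 0.75f);
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++)
        {
            counts.add(i);
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) throws InterruptedException
    {
        fillSet(10000000);                 // warm-up pass so the JIT can compile the hot code
        Thread.sleep(1000);                // brief pause, as the answer suggests
        long timed = fillSet(10000000);    // the run we actually measure
        System.out.println("TOTAL TIME = " + (timed / 1000f));
    }
}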