How can I predict the running time of a non-linear script?
I wrote this simple code in Python to calculate a given number of primes.
My question is whether it is possible to write a script that calculates how long this will take to execute, in terms of processor cycles. If so, how?
primes = [2]
pstep = 3
count = 1

def ifprime(a):
    """Check whether the passed number is prime."""
    global primes
    for check in primes:
        if (a % check) == 0:
            return False
    return True

while 1000000000 >= count:
    if ifprime(pstep):
        primes.append(pstep)
        print(pstep)
        count += 1
    pstep += 1
The interesting thing about this problem is that whether or not I find a prime after x increments is nearly impossible to predict. Moreover, there is a kind of recursion in this scenario: the larger the 'primes' list grows, the longer each call to the function takes.
Any tips?
Comments (5)
When you call ifprime(pstep) and pstep is prime, the loop runs over every prime found so far, which is about pstep / ln(pstep) iterations; the probability that pstep is prime is about 1/ln(pstep). So the expected cost of testing the primes is about pstep / ln(pstep)**2, i.e. proportional to pstep up to logarithmic factors. The cost of testing the composites is unknown, because we don't know the average lowest factor of the composites between 2 and N. If we ignore it, assuming it is dominated by the cost for the primes, we get a total cost of about SUM(pstep) for pstep = 3 to N+3, which is roughly proportional to N**2 (again ignoring logarithmic factors).
You can reduce this to about N**1.5 by cutting off the loop in ifprime() when check > sqrt(a).
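To see the effect of the cutoff without relying on wall-clock time, one can count modulo operations instead. The sketch below assumes Python 3; `count_checks`, its `use_sqrt_cutoff` flag, and the operation counter are hypothetical additions for illustration, not part of the question's code.

```python
def count_checks(limit, use_sqrt_cutoff):
    """Count modulo operations spent testing every integer from 3 to limit.

    Mirrors the loop from the question; the counter and the optional
    cutoff are additions made for this illustration.
    Returns (number of modulo operations, number of primes found).
    """
    primes = [2]
    checks = 0
    for a in range(3, limit + 1):
        is_prime = True
        for check in primes:
            if use_sqrt_cutoff and check * check > a:
                break  # no prime factor of a can exceed sqrt(a)
            checks += 1
            if a % check == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(a)
    return checks, len(primes)


# Doubling the limit shows how fast each variant grows.
for limit in (1000, 2000, 4000):
    plain, n_primes = count_checks(limit, use_sqrt_cutoff=False)
    cut, _ = count_checks(limit, use_sqrt_cutoff=True)
    print(limit, n_primes, plain, cut)
```

Comparing the two count columns as the limit doubles gives a rough feel for the growth rates discussed above, with the caveat that operation counts are only a proxy for processor cycles.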
Well, there is a large branch of theoretical computer science, complexity theory, dedicated to just this sort of problem. The general problem of deciding whether a piece of code will finish for arbitrary input is not merely "NP-complete" but undecidable (it is the halting problem), and is therefore very hard in general.
But in this case you probably have two options.
The first is to use brute force. Run timeit on ifprime(a) for a = 1, 2, 3, 4, ..., plot the graph of the times, and try to see if it looks like something obvious: a^2, a log a, whatever.
The right, but harder, answer is to analyze your algorithm and see if you can work out how many operations it takes for a "typical case".
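A minimal sketch of the brute-force approach, assuming Python 3 and only the standard library. Here `ifprime` takes the prime list as an explicit argument (a change from the question's global), and `primes_up_to` and `loglog_slope` are hypothetical helpers: the first only prepares the lists, the second fits the exponent k in time ~ a**k instead of reading it off a plot.

```python
import math
import timeit


def primes_up_to(limit):
    """Sieve of Eratosthenes; only used to prepare the test lists."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(range(i * i, limit + 1, i))
    return [i for i, flag in enumerate(sieve) if flag]


def ifprime(a, primes):
    """Same test as in the question, with the list passed in explicitly."""
    for check in primes:
        if a % check == 0:
            return False
    return True


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    den = sum((u - mx) ** 2 for u in lx)
    return num / den


all_primes = primes_up_to(20000)
sizes, times = [], []
for target in (2500, 5000, 10000, 20000):
    a = max(p for p in all_primes if p <= target)  # worst case: a is prime
    smaller = [p for p in all_primes if p < a]
    # Best of 3 repeats, 20 calls each, to reduce timing noise.
    t = min(timeit.repeat(lambda: ifprime(a, smaller), number=20, repeat=3))
    sizes.append(a)
    times.append(t)

print("estimated exponent:", round(loglog_slope(sizes, times), 2))
```

Only prime inputs are timed here because they are the worst case for this loop; composites usually exit after a few checks, so timing arbitrary values of a would give a much noisier picture.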
Well, if you are on Linux you can use the 'time' command and then parse its output.
For your problem I would time runs for thousands of large primes of different sizes and draw a chart, so that it is easy to analyze.
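The same measurement can also be made from inside Python rather than with the shell's time command. The sketch below assumes Python 3; `first_primes` and `time_run` are hypothetical names, and the loop is a compact restatement of the question's algorithm. It times whole runs for doubling prime counts and reports the implied growth exponent, which can be used for a cautious extrapolation to larger runs.

```python
import math
import time


def first_primes(n):
    """Collect the first n primes with the question's trial-division loop."""
    primes = [2]
    candidate = 3
    while len(primes) < n:
        # all() stops at the first zero remainder, like the original loop.
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def time_run(n):
    """Wall-clock seconds for one call to first_primes(n)."""
    start = time.perf_counter()
    first_primes(n)
    return time.perf_counter() - start


previous = None
for n in (250, 500, 1000, 2000):
    elapsed = time_run(n)
    note = ""
    if previous:
        # Doubling n multiplies the time by about 2**k if time ~ n**k.
        note = "growth exponent ~ %.2f" % math.log(elapsed / previous, 2)
    print(n, "%.4f s" % elapsed, note)
    previous = elapsed
```

Wall-clock timings depend on the machine and on whatever else is running, so the exponent is only an estimate, and it gives seconds rather than processor cycles.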
If you want to predict the time an arbitrary process needs until it is finished, you can't do that, as that is basically the Halting Problem. In special cases you can estimate the time your script will take, for example if you know that it was generated in a way that doesn't allow loops.
In your special case of finding primes, it is even harder to guess the time it will take before running the process, as there is only a lower bound for the number of primes within an interval, and that doesn't help in finding them.
I think you would have to use an approximation of the distribution of primes, a la the PNT (prime number theorem), which (I think) states that between 1 and x you'll have approximately x/ln(x) primes (ln being the natural log). So given rough estimates of the time taken for a single iteration, you should be able to create an estimate.
You have approximately x/ln(x) primes in your list. Your main code block (inside the while loop) has constant time (effectively)... so:
    t(x) ~ x/ln(x) * a + b + t(x-1)
where t(x) is the time taken up to and including iteration x, a is the time taken to check each prime in the list (modulo operation), and b is the 'constant' time of the main loop. I faintly remember there is a way to convert such recursive functions to linear ones ;)
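One way to read the recurrence is to unroll it into a plain sum over all candidates up to n, taking t(1) = 0. The sketch below assumes Python 3; `predicted_time` and `closed_form` are hypothetical names, the values of a and b are made-up placeholders that would have to be measured on the target machine, and the closed form is only a leading-order approximation of the sum. Because the recurrence charges a full pass over the list on every iteration, and composites actually exit early, the result is better treated as an upper estimate.

```python
import math


def predicted_time(n, a, b):
    """Unroll t(x) = a * x / ln(x) + b + t(x - 1) into a sum, with t(1) = 0."""
    total = 0.0
    for x in range(2, n + 1):
        total += a * x / math.log(x) + b
    return total


def closed_form(n, a, b):
    """Leading-order approximation of the same sum (an assumption, n >= 2)."""
    return a * n * n / (2.0 * math.log(n)) + b * n


# Hypothetical per-operation costs in seconds; measure these yourself.
a_cost = 5e-8
b_cost = 1e-7
for n in (10 ** 3, 10 ** 4, 10 ** 5):
    print(n, predicted_time(n, a_cost, b_cost), closed_form(n, a_cost, b_cost))
```

The two columns should agree more closely as n grows, which suggests the total grows roughly like n**2 / ln(n) under this model.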