哈斯克尔 --> F#:特纳筛
当我偶然发现一种埃拉托色尼筛法的改进版本(称为欧拉筛法)时,我正在阅读不同的筛分算法。根据维基百科有一个在 Haskell 中实现了稍微不同的想法版本(称为特纳筛法)。
现在我试图了解给出的代码片段到底是做什么的,我想我已经明白了,但现在我想将代码转换为 F#,但真的不知道从哪里开始。我主要担心的是,似乎没有一个函数可以“减去”两个序列。
代码如下:
import Data.OrdList (minus)
primes = euler [2..]
euler (p : xs) = p : euler (xs `minus` map (*p) (p : xs))
这将如何在 F# 中实现?有可能吗?
I was reading on different sieving algorithms when I stumbled upon a kind of improved version of the Sieve of Eratosthenes called Euler's Sieve. According to Wikipedia there is an implementation of an slightly different version of the idea (called Turner's Sieve) in Haskell.
Now I am trying to understand what exactly the code snippet given does and I think I've got it but now I wanted to translate the code into F# and have really no idea where to start. My main concern is that there doesn't seem to be a function to "substract" two sequences.
Here's the code:
import Data.OrdList (minus)
primes = euler [2..]
euler (p : xs) = p : euler (xs `minus` map (*p) (p : xs))
How would this be implemented in F#? Is it even possible?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(4)
如果您想像 Haskell 那样计算无限列表的合并/差异之类的事情,那么 LazyList 类型(在 F# power pack 中可以找到)就会浮现在脑海中。
它会产生非常冗长的代码,如下面的翻译:
with
请注意,我添加了两个
failwith
子句,以防止编译器抱怨不完整的匹配,即使我们知道计算中的所有列表都是 (懒惰地)无限。If you want to calculate things like merges/differences of infinite lists like Haskell does, the LazyList type (found inside the F# power pack) springs to mind.
It makes for very verbose code, like the translation below:
with
Note that I added two
failwith
clauses to keep the compiler from complaining about an incomplete match, even if we know that all lists in the calculation are (lazily) infinite.您可以使用seq来做到这一点。当你完成减时,欧拉本身与Haskell中的相同。
You can do it with seq. And as you got minus done, euler itself is same as in Haskell.
对于序列,通过持久化序列你会变得更有竞争力:
With sequences, you get a lot more competitive by persisting the sequence:
首先,必须指出的是,欧拉素数筛并不是“埃拉托斯特尼筛法的改进版本”,因为它的性能在任何意义上都比埃拉托斯特尼筛法的任何版本都要差得多: 关于素数算法的 Haskell Wiki - Euler
接下来,应该说 @cfem 使用 LazyList 的代码是忠实的,尽管您提供的欧拉筛版本的详细翻译,尽管它缺乏根据上面的链接仅筛选奇数的轻微改进。
应该注意的是,实现欧拉筛实际上没有任何意义,因为它比通过试除优化 (TDO) 查找素数更复杂且更慢,因为它只对找到的素数进行除法直到根据以下内容测试素数候选数:Haskell Wiki on Prime Number Algorithms - TDO。
此试除优化 (TDO) 筛选可以使用 LazyList(参考 FSharp.PowerPack.dll)在 F# 中实现为:
它可以使用与以下形式相同的序列来实现:
序列版本比 LazyList 版本稍快,因为它避免了调用时的一些开销,因为 LazyList 基于缓存序列。两者都使用一个内部对象,该对象表示迄今为止找到的素数的缓存副本,在 LazyList 的情况下自动缓存,在序列的情况下由 Seq.cache 自动缓存。两者都可以在大约两秒内找到前 100,000 个素数。
现在,欧拉筛可以进行奇数筛分优化,并使用 LazyList 表示如下,由于知道输入列表参数是无限的,并且比较匹配被简化,因此消除了一种匹配情况,如下好吧,我添加了一个运算符“^”以使代码更具可读性:
但是,必须注意的是,对于 primesTDO 和 primesTDOS,计算第 1899 个素数(16381)的时间分别约为 0.2 和 0.16 秒,而使用这个 primesE 大约需要 2.5 秒,即使对于这么小的范围,欧拉筛的性能也很糟糕,时间是其十倍以上。除了糟糕的性能之外,primeE 甚至无法计算 3000 以内的素数,因为它记录的延迟执行函数数量随着找到的素数的增加而迅速增加,因此内存利用率更差。
请注意,在重复编写代码时必须小心,因为 LazyList 是一个值,并且内置了先前找到的元素的记忆,因此第二次相同的扫描将花费接近零的时间;出于计时目的,最好将 PrimeE 设为 PrimeE() 中的函数,以便每次工作都从头开始。
总之,用包括 F# 在内的任何语言实现的欧拉筛只是一种有趣的智力练习,没有实际用途,因为它比几乎所有其他合理优化的筛要慢得多,而且占用内存要严重得多。
First, it must be said that the Euler's sieve for prime numbers is not a "improved version of the Sieve of Eratosthenes" as its performance in every sense is much worse than any version of the Sieve of Eratosthenes: Haskell Wiki on Prime Number algorithms - Euler
Next, it should be said that @cfem's code using LazyList's is a faithful although verbose translation of the version of Euler's sieve that you gave, although it lacks the slight improvement of only sieving odd numbers as per the link above.
It should be noted that there isn't really any point in implementing the Euler sieve, as it is more complex and slower than finding primes by Trial Division Optimized (TDO) as to only doing divisions by found primes up to the square root of the candidate number tested for prime as per: Haskell Wiki on Prime Number algorithms - TDO.
This Trial Division Optimized (TDO) sieve can be implemented in F# using LazyList's (with a reference to FSharp.PowerPack.dll) as:
It can be implemented using sequences in the same form as:
The sequence version is slightly faster than the LazyList version because it avoids some overhead in calling through since LazyList's are based on cached sequences. Both use an internal object which represents a cached copy of the primes found so far, automatically cached in the case of LazyList's, and by the Seq.cache in the case of sequences. Either can find the first 100,000 primes in about two seconds.
Now, the Euler sieve can have the odd number sieving optimization and be expressed using LazyList's as the following, with one match case eliminated due to knowing that the input list parameters are infinite and the compare match simplified, as well I've added an operator '^' to make the code more readable:
However, it must be noted that the time to calculate the 1899th prime (16381) is about 0.2 and 0.16 seconds for the primesTDO and primesTDOS, respectively, while it is about 2.5 seconds using this primesE for a terrible performance for the Euler sieve at over ten times the time even for this small range. In addition to terrible performance, primeE cannot even calculate primes to 3000 due do even worse memory utilization as it records a rapidly increasing number of deferred execution functions with increasing found primes.
Note that one must be careful in repeated timing of the code as written since the LazyList is a value and has built-in memorization of previously found elements, thus a second identical scan will take close to zero time; for timing purposes it might be better to make the PrimeE a function as in PrimeE() so the work starts from the beginning each time.
In summary, the Euler sieve implemented in any language including F# is only an interesting intellectual exercise and has no practical use as it is much slower and hogs memory much worse than just about every other reasonably optimized sieve.