Randomized-pivot quicksort in Haskell

Posted on 2024-10-20 19:20:55

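Is it possible to implement quicksort in Haskell (with a RANDOM-PIVOT) but still have a simple Ord a => [a] -> [a] signature?

I'm starting to understand monads, and for now I interpret a monad as a kind of "command pattern", which works great for IO.

So I understand that a function that returns a random number should actually return a monadic value like IO, because otherwise it would break referential transparency. I also understand that there should be no way to "extract" the random integer from the returned monadic value, because otherwise it would, again, break referential transparency.

Still, I think it should be possible to implement a "pure" [a] -> [a] quicksort function even if it uses a random pivot, because it is referentially transparent. From my point of view, the random pivot is just an implementation detail and shouldn't change the function's signature.

OBS: I'm not actually interested in the specific quicksort problem (so, without wanting to sound rude, I'm not looking for "use mergesort" or "a random pivot doesn't improve performance in practice" kinds of answers). I'm interested in how to implement a "pure" function that internally uses "impure" functions, in cases like quicksort where I can be sure that the function actually is pure.

Quicksort is just a good example.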

南薇 2024-10-27 19:20:55

You are making a false assumption that picking the pivot point is just an implementation detail. Consider a partial ordering on a set, such as a quicksort on cards where card a < card b if a's face value is lower, so that comparing two cards of equal face value gives:

  4 spades < 4 hearts (false)
  4 hearts < 4 spades (false)
  4 hearts = 4 spades (false)

In that case the choice of pivots would determine the final ordering of the cards. In precisely the same way, in a program like

a = get random integer
b = a + 3
print b

the output is determined by a. If you are randomly choosing something, then your computation is, or could be, nondeterministic.
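The card argument above can be made concrete with a small sketch. Everything here is hypothetical and chosen only for illustration: the Card type, the lessThan comparison, and a quicksort (qsortWith) whose pivot index is supplied by the caller instead of a random source.

```haskell
-- Hypothetical card type for illustration; not from the original answer.
data Card = Card { rank :: Int, suit :: Char }
  deriving (Eq, Show)

-- Cards are compared by face value only, so two cards of equal rank
-- are neither less than nor greater than each other.
lessThan :: Card -> Card -> Bool
lessThan a b = rank a < rank b

-- Quicksort whose pivot index is chosen by the caller-supplied function,
-- which receives the length of the (non-empty) list.
qsortWith :: (Int -> Int) -> [Card] -> [Card]
qsortWith _ [] = []
qsortWith pick xs = qsortWith pick smaller ++ [pivot] ++ qsortWith pick others
  where
    i       = pick (length xs)
    pivot   = xs !! i
    rest    = take i xs ++ drop (i + 1) xs
    smaller = [x | x <- rest, x `lessThan` pivot]
    others  = [x | x <- rest, not (x `lessThan` pivot)]

-- Two deterministic pivot strategies standing in for "random" choices.
firstPivot, lastPivot :: [Card] -> [Card]
firstPivot = qsortWith (const 0)
lastPivot  = qsortWith (subtract 1)
```

For the hand [Card 4 'S', Card 4 'H'], firstPivot returns the spade first and lastPivot returns the heart first. Both results are sorted by face value, yet they differ, so under this comparison the pivot choice is observable in the output.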

旧伤还要旧人安 2024-10-27 19:20:55

OK, check this out.

Select portions copied from the hashable package, plus some voodoo-magic language pragmas:

{-# LANGUAGE FlexibleInstances, UndecidableInstances, NoMonomorphismRestriction, OverlappingInstances #-}

import System.Random (getStdGen, mkStdGen, next, split)
import Data.List (foldl')
import Data.Bits (shiftL, xor)

class Hashable a where
    hash :: a -> Int

instance (Integral a) => Hashable a where
    hash = fromIntegral

instance Hashable Char where
    hash = fromEnum

instance (Hashable a) => Hashable [a] where
    hash = foldl' combine 0 . map hash

-- ask the authors of the hashable package about this if interested
combine h1 h2 = (h1 + h1 `shiftL` 5) `xor` h2

OK, so now we can take a list of anything Hashable and turn it into an Int. I've provided Char and Integral a instances here; more and better instances are in the hashable package, which also supports salting and the like.

This is all just so we can make a number generator.

genFromHashable = mkStdGen . hash
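To see why a seed built this way is fully determined by the list, here is a minimal sketch of the same fold, specialised to String so that it needs no type class or language pragmas. The name hashString is hypothetical; combine is the mixing step from the code above.

```haskell
import Data.Bits (shiftL, xor)
import Data.Char (ord)
import Data.List (foldl')

-- Same mixing step as in the answer: (h1 * 33) `xor` h2.
combine :: Int -> Int -> Int
combine h1 h2 = (h1 + h1 `shiftL` 5) `xor` h2

-- Hypothetical monomorphic helper: hash a String without the type class.
hashString :: String -> Int
hashString = foldl' combine 0 . map ord
```

Equal strings always hash to the same Int, so a generator seeded from that Int makes the same pivot choices on every run. The fold is order-sensitive: hashString "ab" and hashString "ba" differ.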

So now the fun part. Let's write a function that takes a random number generator, a comparator function, and a list. Then we'll sort the list by consulting the generator to select a pivot, and the comparator to partition the list.

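-- An empty list is already sorted.
qSortByGen _ _ [] = []
-- Sort everything below the pivot, then the pivot's equals, then everything above.
qSortByGen g f xs = qSortByGen g'' f l ++ mid ++ qSortByGen g''' f r
    where (l, mid, r) = partition (`f` pivot) xs
          pivot = xs !! (pivotLoc `mod` length xs)  -- wrap the random Int into a valid index
          (pivotLoc, g') = next g                   -- draw an Int and advance the generator
          (g'', g''') = split g'                    -- one generator per recursive call

-- Three-way partition driven by the Ordering each element yields.
partition f = foldl' step ([],[],[])
    where step (l,mid,r) x = case f x of
              LT -> (x:l,mid,r)
              EQ -> (l,x:mid,r)
              GT -> (l,mid,x:r)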

Library functions: next grabs an Int from the generator, and produces a new generator. split forks the generator into two distinct generators.

My functions: partition uses f :: a -> Ordering to partition the list into three lists. If you know folds, it should be quite clear. (Note that it does not preserve the initial ordering of the elements in the sublists; it reverses them. Using a foldr could remedy this were it an issue.) qSortByGen works just like I said before: consult the generator for the pivot, partition the list, fork the generator for use in the two recursive calls, recursively sort the left and right sides, and concatenate it all together.
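The foldr remedy mentioned above could look like the following sketch. The name partitionStable is hypothetical; the only changes from partition are the fold direction and the argument order of step.

```haskell
-- Three-way partition that keeps elements in their original relative order.
-- foldr visits the list from the right, so consing onto the front of each
-- bucket rebuilds every bucket in input order.
partitionStable :: (a -> Ordering) -> [a] -> ([a], [a], [a])
partitionStable f = foldr step ([], [], [])
  where
    step x (l, mid, r) = case f x of
      LT -> (x : l, mid, r)
      EQ -> (l, x : mid, r)
      GT -> (l, mid, x : r)
```

For example, partitionStable (`compare` 3) [1,5,3,2,4,3] yields ([1,2],[3,3],[5,4]). The trade-off is that this foldr is not tail-recursive, so it uses stack space proportional to the list length.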

Convenience functions are easy to compose from here:

qSortBy f xs = qSortByGen (genFromHashable xs) f xs
qSort = qSortBy compare

Notice the final function's signature.

ghci> :t qSort
qSort :: (Ord a, Hashable a) => [a] -> [a]

The type inside the list must implement both Hashable and Ord. There's the "pure" function you were asking for, with one logical added requirement. The more general functions are less restrictive in their requirements.

ghci> :t qSortBy
qSortBy :: (Hashable a) => (a -> a -> Ordering) -> [a] -> [a]
ghci> :t qSortByGen
qSortByGen
  :: (System.Random.RandomGen t) =>
     t -> (a -> a -> Ordering) -> [a] -> [a]

Final notes

For any given input, qSort will always behave in exactly the same way. The "random" pivot selection is, in fact, deterministic. But it is obscured by hashing the list and then seeding a random number generator, which makes it "random" enough for me. ;)
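The same idea can be sketched without the random package or the Hashable class, which keeps the plain Ord a => [a] -> [a] signature. This is not the code from the answer: Gen, nextInt, splitGen, qsortGen, and qsortPure are hypothetical names, the generator is a hand-rolled linear congruential generator rather than StdGen, the seed is simply the list length, and splitGen is an ad hoc stand-in with no statistical guarantees.

```haskell
-- Hypothetical stand-in for StdGen: a linear congruential generator.
newtype Gen = Gen Int

-- Draw a non-negative Int and advance the generator.
nextInt :: Gen -> (Int, Gen)
nextInt (Gen s) = (s', Gen s')
  where s' = (s * 1103515245 + 12345) `mod` 2147483648

-- Ad hoc split: derive two different seeds (assumption, not a sound splitter).
splitGen :: Gen -> (Gen, Gen)
splitGen (Gen s) = (Gen (2 * s + 1), Gen (3 * s + 2))

-- Quicksort that asks the generator which element to use as the pivot.
qsortGen :: Ord a => Gen -> [a] -> [a]
qsortGen _ [] = []
qsortGen g xs = qsortGen gl l ++ mid ++ qsortGen gr r
  where
    (n, g')  = nextInt g
    (gl, gr) = splitGen g'
    pivot    = xs !! (n `mod` length xs)
    l        = [x | x <- xs, x < pivot]
    mid      = [x | x <- xs, x == pivot]
    r        = [x | x <- xs, x > pivot]

-- Pure wrapper: the seed is a function of the input, so equal inputs
-- always produce equal pivot choices and equal outputs.
qsortPure :: Ord a => [a] -> [a]
qsortPure xs = qsortGen (Gen (length xs)) xs
```

Because the seed depends only on the argument, qsortPure is referentially transparent; the pivots are pseudo-random in appearance only.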

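qSort also only works for lists with length less than maxBound :: Int, which ghci tells me is 9,223,372,036,854,775,807. I thought there would be an issue with negative indexes, but in my ad-hoc testing I haven't run into it, and there shouldn't be one: Haskell's mod takes the sign of its divisor, so pivotLoc `mod` length xs is never negative for a non-empty list.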
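The point about negative indexes can be checked with a small sketch. The helper pivotIndex is hypothetical; it performs the same wrap-around that qSortByGen applies to the Int drawn from the generator.

```haskell
-- Hypothetical helper: map any Int (possibly negative) to a valid index
-- into a non-empty list of the given length.
-- `mod` takes the sign of the divisor, so for len > 0 the result lies in
-- the range 0 .. len - 1. `rem` would not be safe here, because it takes
-- the sign of the dividend.
pivotIndex :: Int -> Int -> Int
pivotIndex raw len = raw `mod` len
```

For instance, pivotIndex (-7) 3 is 2, whereas (-7) `rem` 3 is -1, which would make !! fail.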

Or, you can just live with the IO monad for "truer" randomness.

qSortIO xs = do g <- getStdGen -- getStdGen comes from System.Random
                return $ qSortByGen g compare xs


ghci> :t qSortIO
qSortIO :: (Ord a) => [a] -> IO [a]
ghci> qSortIO "Hello world"
" Hdellloorw"
ghci> qSort "Hello world"
" Hdellloorw"
