如果我在 Haskell / GHC 中使用未装箱类型(如 Int#),我应该注意哪些事情?

发布于 2024-09-18 05:32:37 字数 2069 浏览 3 评论 0原文

我正在尝试编写一个小脚本来解析和执行 Brainfuck 代码,以了解优化的 GHC 选项,我正在尝试优化代码以便更快一点并了解那里发生了什么。

其中一部分是 BF 代码的内部表示,我为此使用了特殊的数据类型。这是源代码,包括两个正在进行转换的函数:

data BFinstruction
  = AdjustValue Int
  | MovePointer Int
  | GetChar
  | PutChar
  | Loop BFcode
  deriving (Eq)

type BFcode = [BFinstruction]

unsafeCompileBrainfuck :: String -> BFcode
unsafeCompileBrainfuck = fst . parse [] where
  -- arguments: input string, built code; output: output code, rest of input
  parse :: BFcode -> String -> (BFcode,String)
  parse c ('+':s) = parse (AdjustValue   1 :c) s
  parse c ('-':s) = parse (AdjustValue (-1):c) s
  parse c ('>':s) = parse (MovePointer   1 :c) s
  parse c ('<':s) = parse (MovePointer (-1):c) s
  parse c ('.':s) = parse (PutChar         :c) s
  parse c (',':s) = parse (GetChar         :c) s
  parse c (']':s) = (reverse c, s)
  parse c ('[':s) = parse (Loop l          :c) s' where (l,s') = parse [] s
  parse c []      = (reverse c ,"")
  parse c ( _ :s) = parse                   c  s

simplifyBrainfuck :: BFcode -> BFcode
simplifyBrainfuck ((AdjustValue x):(AdjustValue y):zs) = if x + y /= 0
  then simplifyBrainfuck (AdjustValue (x + y):zs)
  else simplifyBrainfuck zs
simplifyBrainfuck ((MovePointer x):(MovePointer y):zs) = if x + y /= 0
  then simplifyBrainfuck (MovePointer (x + y):zs)
  else simplifyBrainfuck zs
simplifyBrainfuck (x                              :zs) = x: simplifyBrainfuck zs
simplifyBrainfuck []                                   = []

其想法是,代码将从某些输入(字符串)中读取,由上述代码进行准备和简化,然后由其他一些函数执行。 (假设输入有效)。

为了优化这个示例,我尝试通过执行以下操作来取消装箱 MovePointerAdjustValue 构造函数的 Int 参数:

data BFinstruction -- BangPatterns
  = AdjustValue {-# UNPACK #-} !Int
  | MovePointer {-# UNPACK #-} !Int
  | GetChar
  | PutChar
  | Loop BFcode
  deriving (Eq)

这将把装箱的 Int< /code> 类型转换为未装箱的原始 Int# 类型,这是 GHc 的实现细节。据我了解,这个选项只在少数情况下有用,所以我想问一下,如果我想执行这种优化,我必须注意哪些事情。我的目标是允许使用 Haskell 的优点来执行 BF 代码 - 懒惰(我想存档,代码只能根据需要保留在内存中)和简单性。

I'm trying to write a small script which parses and executes Brainfuck code, to understand the GHC options of optimization, I'm trying to optimize the code in order to be a bit faster and to understand what's going on there.

On of the parts is the internal represantation of BF-code, I use a special datatype for this. Here's the sourcecode, included the two functions which are doing the conversions:

data BFinstruction
  = AdjustValue Int
  | MovePointer Int
  | GetChar
  | PutChar
  | Loop BFcode
  deriving (Eq)

type BFcode = [BFinstruction]

unsafeCompileBrainfuck :: String -> BFcode
unsafeCompileBrainfuck = fst . parse [] where
  -- arguments: input string, built code; output: output code, rest of input
  parse :: BFcode -> String -> (BFcode,String)
  parse c ('+':s) = parse (AdjustValue   1 :c) s
  parse c ('-':s) = parse (AdjustValue (-1):c) s
  parse c ('>':s) = parse (MovePointer   1 :c) s
  parse c ('<':s) = parse (MovePointer (-1):c) s
  parse c ('.':s) = parse (PutChar         :c) s
  parse c (',':s) = parse (GetChar         :c) s
  parse c (']':s) = (reverse c, s)
  parse c ('[':s) = parse (Loop l          :c) s' where (l,s') = parse [] s
  parse c []      = (reverse c ,"")
  parse c ( _ :s) = parse                   c  s

simplifyBrainfuck :: BFcode -> BFcode
simplifyBrainfuck ((AdjustValue x):(AdjustValue y):zs) = if x + y /= 0
  then simplifyBrainfuck (AdjustValue (x + y):zs)
  else simplifyBrainfuck zs
simplifyBrainfuck ((MovePointer x):(MovePointer y):zs) = if x + y /= 0
  then simplifyBrainfuck (MovePointer (x + y):zs)
  else simplifyBrainfuck zs
simplifyBrainfuck (x                              :zs) = x: simplifyBrainfuck zs
simplifyBrainfuck []                                   = []

The idea is, that the code will be read from some input (string), preparsed and simplified by the above code and then executed by some other functions. (It is assumed, that the input is valid).

In order to optimize this example, I tried to unbox the Int params of the MovePointer and AdjustValue constructors by doing domething like this:

data BFinstruction -- BangPatterns
  = AdjustValue {-# UNPACK #-} !Int
  | MovePointer {-# UNPACK #-} !Int
  | GetChar
  | PutChar
  | Loop BFcode
  deriving (Eq)

This will turn the boxed Int type into an unboxed, raw Int# type, which is an implementation detail of GHc. As I read, this option is only good in a few cases, so I want to ask, of which things I have to pay attention if I want to perform this kind of optimization. My goal is to allow the execution of BF-code using the benefits of Haskell - laziness (I want to archieve, that the code may only be kept as needed in memory) and easiness.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

別甾虛僞 2024-09-25 05:32:37

这真的有必要吗?您是否遇到了此代码的性能问题,您认为这些问题是由装箱值造成的?如果没有,请不要打扰。

如果您确实认为是这种情况,请访问此页面GHC 手册中似乎以方便的列表格式提供了必要的限制。

主要观点似乎是,多态函数或名称与未装箱类型之间的任何类型的交互,如果编译器没有拒绝,仍然可能导致严重的空间泄漏。另外,在没有尝试的情况下,我怀疑您不会在溢出的情况下抛出异常,因此大概您应该自己检测这种情况。一个简单的测试就可以验证情况是否确实如此。

Is this really necessary? Are you encountering performance problems with this code that you think are the result of boxed values? If not, don't bother.

If you do believe this is the case, then this page in the GHC manual seems to provide the necessary restrictions in a convenient list format.

The main points appear to be that any kind of interaction between polymorphic functions or names and unboxed types that is not rejected by the compiler could still lead to nasty space leaks. Also, without trying it, I suspect that you will not get exceptions thrown in the case of an overflow, for example, so presumably you should detect this kind of thing yourself. A simple test could verify if this is actually the case or not.

我早已燃尽 2024-09-25 05:32:37

在我看来,这确实是一个过早的优化。当您有大量 BFInstruction 时,UNPACK 最有用。我怀疑你是否有足够的大脑代码来使其值得。我同意 Gian 的观点,测试应该足够简单,所以先这样做。

无论如何,对于 UNPACK 值需要注意的是,您不希望编译器必须重新装箱它们。您应该可以接受数字操作,但除此之外,您必须仔细查看您正在使用的函数,看看您是否曾经使用解压缩的值作为非严格参数。唯一确定的方法是查看核心或接口文件,看看哪些参数是严格的,哪些不是。与往常一样,请确保至少使用“-O”进行编译。

This really looks like a premature optimization to me. UNPACK is mostly useful when you have a very large number of BFInstructions sitting around. I doubt you'll ever have enough brainf**k code to make it worthwhile. I agree with Gian, it should be simple enough to test, so do that first.

In any case, the thing to be aware of with UNPACK'ed values is that you don't want the compiler to have to rebox them. You should be ok with numerical ops, but other than that you'd have to look carefully at the functions you're using to see if you ever use the unpacked values as a non-strict argument. The only way to be sure is to look at the core, or the interface files, to see what parameters are strict and which aren't. As always, be sure to compile with at least "-O".

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文