How do I run Sutton and Barto's "Reinforcement Learning" Lisp code?
I have been reading a lot about Reinforcement Learning lately, and I have found "Reinforcement Learning: An Introduction" to be an excellent guide. The authors helpfully provide source code for a lot of their worked examples.
Before I begin the question I should point out that my practical knowledge of lisp is minimal. I know the basic concepts and how it works, but I have never really used lisp in a meaningful way, so it is likely I am just doing something incredibly n00b-ish. :)
Also, the author states on his page that he will not answer questions about his code, so I did not contact him, and figured Stack Overflow would be a much better choice.
I have been trying to run the code on a Linux machine, using both GNU CLISP and SBCL, but have not been able to get it to work. I keep getting a whole list of errors with either implementation. In particular, most of the code appears to rely on utilities contained in a file 'utilities.lisp', which contains the lines
(defpackage :rss-utilities
  (:use :common-lisp :ccl)
  (:nicknames :ut))

(in-package :ut)
The :ccl seems to refer to some kind of Mac-based version of Lisp, but I could not confirm this; it could just be some other package of code. Loading the file in SBCL gives:
> * (load "utilities.lisp")
>
> debugger invoked on a
> SB-KERNEL:SIMPLE-PACKAGE-ERROR in
> thread #<THREAD "initial thread"
> RUNNING {100266AC51}>: The name
> "CCL" does not designate any package.
>
> Type HELP for debugger help, or
> (SB-EXT:QUIT) to exit from SBCL.
>
> restarts (invokable by number or by
> possibly-abbreviated name): 0:
> [ABORT] Exit debugger, returning to
> top level.
>
> (SB-INT:%FIND-PACKAGE-OR-LOSE "CCL")
I tried removing this particular piece, changing the line to
(:use :common-lisp)
but that just created more errors:
> ; in: LAMBDA NIL ; (+
> RSS-UTILITIES::*MENUBAR-BOTTOM* ;
> (/ (- RSS-UTILITIES::MAX-V
> RSS-UTILITIES::V-SIZE) 2)) ; ; caught
> WARNING: ; undefined variable:
> *MENUBAR-BOTTOM*
>
> ; (-
> RSS-UTILITIES::*SCREEN-HEIGHT*
> RSS-UTILITIES::*MENUBAR-BOTTOM*) ; ;
> caught WARNING: ; undefined
> variable: *SCREEN-HEIGHT*
>
> ; (IF RSS-UTILITIES::CONTAINER ;
> (RSS-UTILITIES::POINT-H ;
> (RSS-UTILITIES::VIEW-SIZE
> RSS-UTILITIES::CONTAINER)) ;
> RSS-UTILITIES::*SCREEN-WIDTH*) ; ;
> caught WARNING: ; undefined
> variable: *SCREEN-WIDTH*
>
> ; (RSS-UTILITIES::POINT-H
> (RSS-UTILITIES::VIEW-SIZE
> RSS-UTILITIES::VIEW)) ; ; caught
> STYLE-WARNING: ; undefined function:
> POINT-H
>
> ; (RSS-UTILITIES::POINT-V
> (RSS-UTILITIES::VIEW-SIZE
> RSS-UTILITIES::VIEW)) ; ; caught
> STYLE-WARNING: ; undefined function:
> POINT-V
Anybody got any idea how I can run this code? Am I just totally ignorant of all things lisp?
UPDATE [March 2009]: I installed Clozure, but was still not able to get the code to run.
At the CCL command prompt, the command
(load "utilities.lisp")
results in the following error output:
;Compiler warnings :
; In CENTER-VIEW: Undeclared free variable *SCREEN-HEIGHT*
; In CENTER-VIEW: Undeclared free variable *SCREEN-WIDTH*
; In CENTER-VIEW: Undeclared free variable *MENUBAR-BOTTOM* (2 references)
> Error: Undefined function RANDOM-STATE called with arguments (64497 9) .
> While executing: CCL::READ-DISPATCH, in process listener(1).
> Type :GO to continue, :POP to abort, :R for a list of available restarts.
> If continued: Retry applying RANDOM-STATE to (64497 9).
> Type :? for other options.
1 >
Unfortunately, I'm still learning Lisp, so while I have a sense that something is not fully defined, I do not really understand how to read these error messages.
Comments (5)
My guess is that the code is CCL-dependent, so use CCL instead of CLISP or SBCL. You can download it from here: http://trac.clozure.com/openmcl
That code is for Macintosh Common Lisp (MCL). It will only run there. Using Clozure CL (CCL) will not help. You would have to comment out the graphics code. The random state stuff is also slightly special to MCL. You have to port it to portable Common Lisp (make-random-state, etc.). The file names are also specific to the Mac.
Clozure CL is a fork of Macintosh Common Lisp, but it has been changed to Unix conventions (pathnames, ...) and does not include the special graphics code of MCL.
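As a rough illustration of the port described above, here is a minimal sketch in portable Common Lisp. It assumes the three variables reported as undefined in the question came from MCL's :ccl package; the package name rss-utilities-sketch, the helper random-below, and every numeric value are hypothetical placeholders, not taken from utilities.lisp.

```lisp
;; Sketch only: portable stand-ins for the MCL-specific pieces.
;; The package name and all values below are hypothetical placeholders.
(defpackage :rss-utilities-sketch
  (:use :common-lisp))            ; no :ccl, so any conforming Lisp can load it

(in-package :rss-utilities-sketch)

;; The three variables the compilers reported as undefined.
;; The numbers are arbitrary assumptions, not real screen metrics.
(defvar *screen-width* 1024)
(defvar *screen-height* 768)
(defvar *menubar-bottom* 0)

;; Portable replacement for a serialized MCL random state.
;; (make-random-state t) returns a freshly, randomly seeded state.
(defvar *sketch-random-state* (make-random-state t))

(defun random-below (n)
  "Return a random non-negative integer less than N."
  (random n *sketch-random-state*))
```

Note that a freshly seeded state gives up whatever reproducibility the original fixed state provided, and the graphics functions (POINT-H, POINT-V, VIEW-SIZE) would still need to be commented out or stubbed separately.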
Using the latest version of CCL on linux x86, with this file saved as foo.lisp:
Running
prints a bunch of warning messages and the desired answer of:
To figure out the required #'random-state defun, I guessed that the “#.(RANDOM-STATE 64497 9)” was a serialized random-state object from MCL. To see how CCL handles that, I checked what MAKE-RANDOM-STATE outputs in CCL:
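For readers unfamiliar with the #. syntax, the following sketch shows why the question's error was raised while executing CCL::READ-DISPATCH: the reader evaluates the form after #. at read time, so an undefined function there fails before the file is even compiled. The names read-sharp-dot and no-such-fn-for-demo are hypothetical and used only for illustration.

```lisp
;; #. makes the reader evaluate the next form at read time
;; (this requires *read-eval* to be true, which is the default).
(defun read-sharp-dot (text)
  "Read TEXT; return :undefined if #. calls a function that does not exist."
  (handler-case (values (read-from-string text))
    (undefined-function () :undefined)))

;; The reader itself calls +, so the object read is the number 3.
(read-sharp-dot "#.(+ 1 2)")

;; Same shape as #.(RANDOM-STATE 64497 9): the function is missing,
;; so the error comes from the reader rather than from running the program.
(read-sharp-dot "#.(no-such-fn-for-demo 64497 9)")
```

Defining a function with the expected name before loading the file, as this answer does, is one way to satisfy the reader; replacing the #. form in the source is another.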
If you have never used Lisp in a meaningful way, there is also Matlab code for "Reinforcement Learning: An Introduction".
In addition to Rainer Joswig's answer: once you install Clozure, you'll have to update references to the function RANDOM-STATE in utilities.lisp to random-mrg31k3p-state. More specifically, replace
#.(RANDOM-STATE 64497 9)
with
#.(ccl::random-mrg31k3p-state)
random-mrg31k3p-state seems to have replaced random-state sometime after the code was written; see l1-numbers.lisp?rev=13327.