How do I run Sutton and Barto's "Reinforcement Learning" Lisp code?

Posted on 2024-07-13 01:31:52

I have been reading a lot about Reinforcement Learning lately, and I have found "Reinforcement Learning: An Introduction" to be an excellent guide. The authors helpfully provide source code for a lot of their worked examples.

Before I begin the question I should point out that my practical knowledge of lisp is minimal. I know the basic concepts and how it works, but I have never really used lisp in a meaningful way, so it is likely I am just doing something incredibly n00b-ish. :)

Also, the author states on his page that he will not answer questions about his code, so I did not contact him, and figured Stack Overflow would be a much better choice.

I have been trying to run the code on a Linux machine, using both GNU CLISP and SBCL, but have not been able to get it working. I keep getting a whole list of errors with either implementation. In particular, most of the code appears to rely on utilities contained in a file 'utilities.lisp', which contains the lines

(defpackage :rss-utilities
  (:use :common-lisp :ccl)
  (:nicknames :ut))

(in-package :ut)

The :ccl seems to refer to some kind of Mac-based version of Lisp, but I could not confirm this; it could just be some other package of code.

> * (load "utilities.lisp")
>
> debugger invoked on a SB-KERNEL:SIMPLE-PACKAGE-ERROR in thread #<THREAD "initial thread" RUNNING {100266AC51}>:
>   The name "CCL" does not designate any package.
>
> Type HELP for debugger help, or (SB-EXT:QUIT) to exit from SBCL.
>
> restarts (invokable by number or by possibly-abbreviated name):
>   0: [ABORT] Exit debugger, returning to top level.
>
> (SB-INT:%FIND-PACKAGE-OR-LOSE "CCL")

I tried removing this particular piece, changing the line to

  (:use :common-lisp)

but that just created more errors.

> ; in: LAMBDA NIL
> ;     (+ RSS-UTILITIES::*MENUBAR-BOTTOM*
> ;        (/ (- RSS-UTILITIES::MAX-V RSS-UTILITIES::V-SIZE) 2))
> ;
> ; caught WARNING:
> ;   undefined variable: *MENUBAR-BOTTOM*
>
> ;     (- RSS-UTILITIES::*SCREEN-HEIGHT* RSS-UTILITIES::*MENUBAR-BOTTOM*)
> ;
> ; caught WARNING:
> ;   undefined variable: *SCREEN-HEIGHT*
>
> ;     (IF RSS-UTILITIES::CONTAINER
> ;         (RSS-UTILITIES::POINT-H (RSS-UTILITIES::VIEW-SIZE RSS-UTILITIES::CONTAINER))
> ;         RSS-UTILITIES::*SCREEN-WIDTH*)
> ;
> ; caught WARNING:
> ;   undefined variable: *SCREEN-WIDTH*
>
> ;     (RSS-UTILITIES::POINT-H (RSS-UTILITIES::VIEW-SIZE RSS-UTILITIES::VIEW))
> ;
> ; caught STYLE-WARNING:
> ;   undefined function: POINT-H
>
> ;     (RSS-UTILITIES::POINT-V (RSS-UTILITIES::VIEW-SIZE RSS-UTILITIES::VIEW))
> ;
> ; caught STYLE-WARNING:
> ;   undefined function: POINT-V

Anybody got any idea how I can run this code? Am I just totally ignorant of all things lisp?

UPDATE [March 2009]: I installed Clozure, but was still not able to get the code to run.

At the CCL command prompt, the command

(load "utilities.lisp")

results in the following error output:

;Compiler warnings :
;   In CENTER-VIEW: Undeclared free variable *SCREEN-HEIGHT*
;   In CENTER-VIEW: Undeclared free variable *SCREEN-WIDTH*
;   In CENTER-VIEW: Undeclared free variable *MENUBAR-BOTTOM* (2 references)
> Error: Undefined function RANDOM-STATE called with arguments (64497 9) .
> While executing: CCL::READ-DISPATCH, in process listener(1).
> Type :GO to continue, :POP to abort, :R for a list of available restarts.
> If continued: Retry applying RANDOM-STATE to (64497 9).
> Type :? for other options.
1 >

Unfortunately, I'm still learning Lisp, so while I have a sense that something is not fully defined, I do not really understand how to read these error messages.

Comments (5)

梦旅人picnic 2024-07-20 01:31:53

My guess is that the code is CCL-dependent, so use CCL instead of CLISP or SBCL. You can download it from here: http://trac.clozure.com/openmcl

停顿的约定 2024-07-20 01:31:53

That code is for Macintosh Common Lisp (MCL). It will only run there. Using Clozure CL (CCL) will not help. You would have to comment out the graphics code. The random state stuff is also slightly special to MCL. You have to port it to portable Common Lisp (make-random-state, etc.). Also, the file names are specific to the Mac.

Clozure CL is a fork of Macintosh Common Lisp, but it has been changed to Unix conventions (pathnames, ...) and does not include the special graphics code of MCL.
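
To make the porting advice above a little more concrete, here is a minimal sketch of what the top of a ported utilities.lisp might look like on SBCL or CLISP. It assumes the defpackage has been edited to drop :ccl, as the question tried. The variable names are taken from the warnings in the question; their values are arbitrary placeholders, and `*portable-random-state*` and `portable-random` are hypothetical names that are not part of the book's code.

```lisp
;;; Sketch only: a portable preamble, assuming :ccl was removed from :use.
(defpackage :rss-utilities
  (:use :common-lisp)
  (:nicknames :ut))

(in-package :rss-utilities)

;; Placeholders for the MCL screen variables named in the warnings.
;; The values are arbitrary assumptions; they only matter to graphics code.
(defvar *screen-width* 1024)
(defvar *screen-height* 768)
(defvar *menubar-bottom* 0)

;; Standard Common Lisp random state, freshly seeded by the implementation.
;; Hypothetical name, standing in for the MCL-specific random-state literal.
(defvar *portable-random-state* (make-random-state t))

(defun portable-random (limit)
  "Return a random number below LIMIT using the portable state above."
  (random limit *portable-random-state*))
```

This only addresses the undefined-variable warnings and the random state. The graphics calls such as POINT-H, POINT-V and VIEW-SIZE would still have to be commented out or stubbed.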

爱本泡沫多脆弱 2024-07-20 01:31:53

Using the latest version of CCL on linux x86, with this file saved as foo.lisp:

#+ccl (defun random-state (x y)
        (ccl::initialize-random-state x y))

(load "utilities.lisp")
(use-package 'rss-utilities)


(load "testbed.lisp")

(setup)
(init)

(print (runs 10 10 .1))

Running

~/svn/ccl/lx86cl -l foo.lisp

prints a bunch of warning messages and the desired answer of:

(-0.77201915 0.59691894 0.78171235 0.41514033 0.6744591 0.26383805 0.8981678 1.1274683 0.50265205 0.4081622)

To figure out the required #'random-state defun, I guessed that the “#.(RANDOM-STATE 64497 9)” was a serialized random-state object from MCL. To see how CCL handles that, I checked what MAKE-RANDOM-STATE outputs in CCL:

$ ~/svn/ccl/lx86cl 
Welcome to Clozure Common Lisp Version 1.3-r11936  (LinuxX8632)!
? (make-random-state)
#.(CCL::INITIALIZE-RANDOM-STATE 64497 9)
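
The reason a plain defun is enough here is that `#.` is the standard read-time evaluation syntax: the reader evaluates the form that follows it and uses the result as the object read. That fits the "While executing: CCL::READ-DISPATCH" line in the question's error, which suggests the call failed while the file was being read. A small portable illustration, where `make-demo-state` is a hypothetical stand-in and not a function from MCL or CCL:

```lisp
;; Hypothetical stand-in for whatever function the #. form names.
(defun make-demo-state (a b)
  (list :demo-state a b))

;; Reading the text calls MAKE-DEMO-STATE at read time.
;; *READ-EVAL* must be true (its default) for #. to be allowed.
(defparameter *from-text*
  (let ((*read-eval* t))
    (read-from-string "#.(make-demo-state 64497 9)")))

(print *from-text*)
```

If `make-demo-state` were undefined when the string was read, the reader would signal an undefined-function error, much like the RANDOM-STATE error shown in the question.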

瑾夏年华 2024-07-20 01:31:53

If you have never used Lisp in a meaningful way, there is Matlab code for "Reinforcement Learning: An Introduction".

生来就爱笑 2024-07-20 01:31:53

In addition to Rainer Joswig's answer: once you install Clozure, you'll have to update references to the function RANDOM-STATE in utilities.lisp to random-mrg31k3p-state.

More specifically, replace #.(RANDOM-STATE 64497 9) with #.(ccl::random-mrg31k3p-state)

random-mrg31k3p-state seems to have replaced random-state sometime after the code was written; see l1-numbers.lisp?rev=13327
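
Since internal names like this change between releases, it may be worth checking what your own image provides before editing the file. A small sketch using only standard Common Lisp; `symbol-fbound-in-package-p` is a hypothetical helper name:

```lisp
(defun symbol-fbound-in-package-p (name package-name)
  "True if NAME is a symbol accessible in PACKAGE-NAME and names a function."
  (let ((pkg (find-package package-name)))
    (when pkg
      (multiple-value-bind (sym status) (find-symbol name pkg)
        ;; STATUS is NIL when no such symbol is accessible in the package.
        (and status (fboundp sym) t)))))

;; Sanity check against a function that always exists.
(print (symbol-fbound-in-package-p "CAR" "COMMON-LISP"))

;; The result depends on the Lisp and version you are running;
;; it is NIL on any implementation without a CCL package.
(print (symbol-fbound-in-package-p "RANDOM-MRG31K3P-STATE" "CCL"))

;; APROPOS lists every symbol whose name contains the given string.
(apropos "RANDOM-STATE")
```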
