Reading the binary output of an external program in Common Lisp

Posted 2024-12-25 20:46:43

I'm trying to run an external program in SBCL and capture its output.
The output is binary data (a PNG image), but SBCL insists on interpreting it as a string.

I tried a number of ways, like

(trivial-shell:shell-command "/path/to/png-generator" :input "some input")

(with-input-from-string (input "some input")
  (with-output-to-string (output)
    (run-program "/path/to/png-generator" () :input input :output output)))

(with-input-from-string (input "some input")
  (flexi-streams:with-output-to-sequence (output)
    (run-program "/path/to/png-generator" () :input input :output output)))

But I get errors like

Illegal :UTF-8 character starting at byte position 0.

It seems to me that SBCL is trying to interpret the binary data as text and decode it. How do I change this behaviour? I'm only interested in obtaining a vector of octets.

Edit: Since it is not clear from the text above, I'd like to add that, at least in the flexi-streams case, the element-type of the stream is flexi-streams:octet (which is an (unsigned-byte 8)).
I would expect run-program to read the raw bytes without trouble, at least in this case. Instead I get a message like: Don't know how to copy to stream of element-type (UNSIGNED-BYTE 8)

Comments (3)

琉璃梦幻 2025-01-01 20:46:43

Edit: I got angry at not being able to do this very simple task and solved the problem.

Functionally, the ability to send a stream of type UNSIGNED-BYTE into run-program and have it work correctly is severely limited, for reasons I don't understand. I tried gray streams, flexi-streams, fd streams, and a few other mechanisms, like you.

However, perusing run-program's source (for the fifth or sixth time), I noticed that there's an option :STREAM you can pass to output. Given that, I wondered if read-byte would work... and it did. For more performant work, one could determine how to get the length of a non-file stream and run READ-SEQUENCE on it.

(let* (;; Get random bytes.
       (proc-var (sb-ext:run-program "head" '("-c" "10" "/dev/urandom")
                                     :search t
                                     ;; Return at once so we can drain the pipes
                                     ;; before the child exits.
                                     :wait nil
                                     ;; Let SBCL figure out the storage type.
                                     ;; This is what solved the problem.
                                     :output :stream
                                     ;; Without this, process-error returns NIL.
                                     :error :stream))
       ;; Obtain the streams from the process object.
       (output (sb-ext:process-output proc-var))
       (err (sb-ext:process-error proc-var)))
  (values
   ;; Return both stdout and stderr, just for polish.
   ;; Do a byte read and turn it into a vector.
   (concatenate 'vector
                ;; A byte with value 0 is *not* value nil. Yay for Lisp!
                (loop for byte = (read-byte output nil)
                      while byte
                      collect byte))
   ;; Repeat for stderr.
   (concatenate 'vector
                (loop for byte = (read-byte err nil)
                      while byte
                      collect byte))))
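
On the READ-SEQUENCE idea mentioned above: as far as I can tell you don't need to know the stream's length, because you can read fixed-size chunks until a short read signals end of file. Here is a rough sketch of that approach. The function name run-and-collect-octets and the 4096-byte buffer size are my own illustrative choices, not anything provided by SBCL, and I'm assuming the stream returned for :output :stream accepts an (unsigned-byte 8) buffer in READ-SEQUENCE.

```lisp
;; Sketch only: RUN-AND-COLLECT-OCTETS and the buffer size are hypothetical
;; choices made for this example, not part of SBCL.
(defun run-and-collect-octets (program args)
  "Run PROGRAM with ARGS and return its standard output as an octet vector."
  (let* ((proc (sb-ext:run-program program args
                                   :search t
                                   :wait nil      ; drain the pipe before waiting
                                   :output :stream))
         (in (sb-ext:process-output proc))
         (buffer (make-array 4096 :element-type '(unsigned-byte 8)))
         (result (make-array 0 :element-type '(unsigned-byte 8)
                               :adjustable t :fill-pointer 0)))
    (unwind-protect
         (loop for end = (read-sequence buffer in)
               ;; Append the END bytes that were actually read.
               do (loop for i below end
                        do (vector-push-extend (aref buffer i) result))
               ;; A short read means end of file was reached.
               while (= end (length buffer)))
      (sb-ext:process-wait proc)
      (sb-ext:process-close proc))
    ;; Copy into a simple (non-adjustable) octet vector.
    (coerce result '(simple-array (unsigned-byte 8) (*)))))
```

Something like (run-and-collect-octets "/path/to/png-generator" '()) should then hand back the raw image bytes, with no decoding step involved.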
无悔心 2025-01-01 20:46:43

If you're willing to use some external libraries, this can be done with babel-streams. This is a function I use to safely get content from a program. I use :latin-1 because it maps each of the 256 possible byte values to exactly one character, so no byte can fail to decode. You could remove the octets-to-string call and get the octet vector instead.

If you wanted stderr as well, you could use nested 'with-output-to-sequence' to get both.

;; Assumes the symbols of the babel and babel-streams packages are accessible
;; in the current package (for example via :use).
(defun safe-shell (command &rest args)
  (octets-to-string
   (with-output-to-sequence (stream :external-format :latin-1)
     (let ((proc (sb-ext:run-program command args :search t :wait t :output stream)))
       (case (sb-ext:process-status proc)
         ;; A normal exit with a non-zero code is treated as a failure.
         (:exited (unless (zerop (sb-ext:process-exit-code proc))
                    (error "Error in command")))
         (t (error "Unable to terminate process")))))
   :encoding :latin-1))
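
To illustrate why :latin-1 is a safe choice here, the following small check uses SBCL's own sb-ext:octets-to-string and sb-ext:string-to-octets (rather than babel) to confirm that every byte value survives a round trip through a Latin-1 string. The function name latin-1-round-trip-p is just something I made up for the example.

```lisp
;; Sketch only: LATIN-1-ROUND-TRIP-P is a hypothetical helper for illustration.
(defun latin-1-round-trip-p ()
  "Return true if all 256 byte values survive octets -> string -> octets."
  (let ((octets (make-array 256 :element-type '(unsigned-byte 8))))
    ;; Fill the vector with every possible byte value, 0 through 255.
    (dotimes (i 256)
      (setf (aref octets i) i))
    (let* ((string (sb-ext:octets-to-string octets :external-format :latin-1))
           (back (sb-ext:string-to-octets string :external-format :latin-1)))
      (and (= (length string) 256)
           ;; Each character's code equals the byte it came from.
           (every (lambda (char byte) (= (char-code char) byte)) string octets)
           ;; Encoding the string again gives back the original bytes.
           (equalp back octets)))))
```

The same check with :utf-8 would not hold for arbitrary bytes, which is consistent with the "Illegal :UTF-8 character" error in the question.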
傾城如夢未必闌珊 2025-01-01 20:46:43

Paul Nathan already gave a pretty complete answer as to how to read I/O from a program as binary, so I'll just add why your code didn't work: you explicitly asked SBCL to interpret the I/O as a string of UTF-8 characters by using with-{in,out}put-to-string.

Also, I'd like to point out that you don't need to go as far as run-program's source code to get to the solution. It's clearly documented in SBCL's manual.
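
For the original use case (text on stdin, octets from stdout), a minimal sketch that avoids the string macros entirely could look like the following. The helper name feed-and-capture is hypothetical; the sketch only relies on the :input :stream and :output :stream options of sb-ext:run-program and on the process-input / process-output accessors.

```lisp
;; Sketch only: FEED-AND-CAPTURE is a hypothetical helper name.
(defun feed-and-capture (program args input-string)
  "Send INPUT-STRING to PROGRAM's stdin and return its stdout as octets."
  (let ((proc (sb-ext:run-program program args
                                  :search t
                                  :wait nil
                                  :input :stream
                                  :output :stream)))
    (unwind-protect
         (progn
           ;; Write the text, then close stdin so the child sees end of file.
           (let ((stdin (sb-ext:process-input proc)))
             (write-string input-string stdin)
             (close stdin))
           ;; READ-BYTE returns NIL only at end of file; a 0 byte is still a byte.
           (coerce (loop with stdout = (sb-ext:process-output proc)
                         for byte = (read-byte stdout nil)
                         while byte
                         collect byte)
                   '(vector (unsigned-byte 8))))
      (sb-ext:process-wait proc)
      (sb-ext:process-close proc))))
```

For example, (feed-and-capture "/path/to/png-generator" '() "some input") would return the generator's raw output. One caveat: this writes all the input before reading any output, so a program that produces a lot of output before it has consumed a large input could stall; interleaving the two would be needed in that situation.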
