How do I convert a byte array to a string in Common Lisp?

Posted 2024-07-14 04:40:51 · 506 characters · 10 views · 0 comments

I'm calling a funny API that returns a byte array, but I want a text stream. Is there an easy way to get a text stream from a byte array? For now I just threw together:

(defun bytearray-to-string (bytes)
  (let ((str (make-string (length bytes))))
    (loop for byte across bytes
       for i from 0
       do (setf (aref str i) (code-char byte)))
    str))

and then wrap the result in with-input-from-string, but that can't be the best way. (Plus, it's horribly inefficient.)

In this case, I know it's always ASCII, so interpreting it as either ASCII or UTF-8 would be fine. I'm using Unicode-aware SBCL, but I'd prefer a portable (even ASCII-only) solution to an SBCL-Unicode-specific one.

人间☆小暴躁 2024-07-21 04:40:52

There are two portable libraries for this conversion:

flexi-streams, already mentioned in another answer. This library is older and has more features, in particular the extensible streams.
Babel, a library specifically for character encoding and decoding. The main advantage of Babel over flexi-streams is speed.

For best performance, use Babel if it has the features you need, and fall back to flexi-streams otherwise. Below is a (slightly unscientific) microbenchmark illustrating the speed difference.

For this test case, Babel is about 337 times faster and conses about 200 times less memory.

(asdf:operate 'asdf:load-op :flexi-streams)
(asdf:operate 'asdf:load-op :babel)

(defun flexi-streams-test (bytes n)
  (loop
     repeat n
     collect (flexi-streams:octets-to-string bytes :external-format :utf-8)))

(defun babel-test (bytes n)
  (loop
     repeat n
     collect (babel:octets-to-string bytes :encoding :utf-8)))

(defun test (&optional (data #(72 101 108 108 111))
                       (n 10000))
  (let* ((ub8-vector (coerce data '(simple-array (unsigned-byte 8) (*))))
         (result1 (time (flexi-streams-test ub8-vector n)))
         (result2 (time (babel-test ub8-vector n))))
    (assert (equal result1 result2))))

#|
CL-USER> (test)
Evaluation took:
  1.348 seconds of real time
  1.328083 seconds of user run time
  0.020002 seconds of system run time
  [Run times include 0.12 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  126,402,160 bytes consed.
Evaluation took:
  0.004 seconds of real time
  0.004 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  635,232 bytes consed.
|#

凉月流沐 2024-07-21 04:40:52

If you don't have to worry about UTF-8 encoding (which essentially means "just plain ASCII"), you may be able to use MAP:

(map 'string #'code-char #(72 101 108 108 111))
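
As a rough sketch of how this could serve the original question, the MAP call can be wrapped in with-input-from-string to get a character stream. The helper name ascii-octets-to-string and the range check are illustrative assumptions, not part of the answer above. The code also assumes that CODE-CHAR agrees with ASCII for codes 0 to 127, which holds on the common implementations but is not guaranteed by the standard.

```lisp
;; Hypothetical helper: convert a vector of ASCII codes to a string.
(defun ascii-octets-to-string (octets)
  (map 'string
       (lambda (octet)
         ;; Reject anything outside 7-bit ASCII instead of guessing an encoding.
         (unless (<= 0 octet 127)
           (error "~D is not an ASCII code." octet))
         (code-char octet))
       octets))

;; Read the converted text through an ordinary string input stream.
(with-input-from-string (in (ascii-octets-to-string #(72 101 108 108 111)))
  (read-line in))
```

This still builds the whole string before any reading starts, so it suits small payloads rather than large or incremental input.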

Bonjour°[大白 2024-07-21 04:40:52

I say go with the proposed flexi-streams or Babel solutions.

But just for completeness, and for the benefit of future googlers arriving at this page, I want to mention SBCL's own sb-ext:octets-to-string:

   SB-EXT:OCTETS-TO-STRING is an external symbol in #<PACKAGE "SB-EXT">.
   Function: #<FUNCTION SB-EXT:OCTETS-TO-STRING>
   Its associated name (as in FUNCTION-LAMBDA-EXPRESSION) is
     SB-EXT:OCTETS-TO-STRING.
   The function's arguments are:  (VECTOR &KEY (EXTERNAL-FORMAT DEFAULT) (START 0)
                                          END)
   Its defined argument types are:
     ((VECTOR (UNSIGNED-BYTE 8)) &KEY (:EXTERNAL-FORMAT T) (:START T) (:END T))
   Its result type is:
     *
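
A minimal usage sketch, based on the argument types in the description above: the function expects a vector of (unsigned-byte 8), so a literal #(...) vector is coerced first. This only works on SBCL, hence the #+sbcl guard. The sample bytes are the same "Hello" data used elsewhere in this thread.

```lisp
;; SBCL only: the argument must be a vector with element type (unsigned-byte 8).
#+sbcl
(let ((octets (coerce #(72 101 108 108 111) '(vector (unsigned-byte 8)))))
  (sb-ext:octets-to-string octets :external-format :utf-8))
```

The resulting string can then be handed to with-input-from-string if a stream is needed.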

Oo萌小芽oO 2024-07-21 04:40:52

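SBCL supports so-called Gray streams. These are extensible streams based on CLOS classes and generic functions. You could create a text stream subclass that gets its characters from the byte array.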
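
A minimal sketch of that idea, assuming SBCL's SB-GRAY package. The class name octet-character-input-stream and its slot names are made up for illustration. Each octet is turned into a character with CODE-CHAR, so this is only correct for ASCII data and does no UTF-8 decoding.

```lisp
;; Hypothetical Gray stream class that serves characters from a vector of octets.
(defclass octet-character-input-stream (sb-gray:fundamental-character-input-stream)
  ((octets :initarg :octets :reader octets-of)
   (index :initform 0 :accessor index-of)))

;; Return the next character, or :EOF once the vector is exhausted.
(defmethod sb-gray:stream-read-char ((stream octet-character-input-stream))
  (let ((octets (octets-of stream))
        (index (index-of stream)))
    (if (< index (length octets))
        (prog1 (code-char (aref octets index))
          (incf (index-of stream)))
        :eof)))

;; Step back one position so that PEEK-CHAR and UNREAD-CHAR work.
(defmethod sb-gray:stream-unread-char ((stream octet-character-input-stream) character)
  (declare (ignore character))
  (when (plusp (index-of stream))
    (decf (index-of stream)))
  nil)

;; Usage: the standard reading functions now work on the byte vector directly.
(read-line (make-instance 'octet-character-input-stream
                          :octets #(72 101 108 108 111))
           nil)
```

Unlike converting to a string first, this reads the bytes lazily, which is closer to the text stream the question asks for.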

葬花如无物 2024-07-21 04:40:52

Try the FORMAT function. (FORMAT NIL ...) returns its result as a string.
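
The answer does not say which directives to use, so the following is only one possible reading: convert each byte with CODE-CHAR and let FORMAT print the characters one after another. Like the MAP approach, it assumes plain ASCII input.

```lisp
;; ~{ ... ~} iterates over a list; ~C prints each element as a character.
(format nil "~{~C~}" (map 'list #'code-char #(72 101 108 108 111)))
```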

日裸衫吸 2024-07-21 04:40:51

FLEXI-STREAMS (http://weitz.de/flexi-streams/) has a portable conversion function:

(flexi-streams:octets-to-string #(72 101 108 108 111) :external-format :utf-8)

=>

"Hello"

Or, if you want a stream:

(flexi-streams:make-flexi-stream
   (flexi-streams:make-in-memory-input-stream
      #(72 101 108 108 111))
   :external-format :utf-8)

This returns a stream that reads the text from the byte vector.
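
As a short usage sketch, the stream built above can be read with the standard character input functions. This assumes ASDF can find flexi-streams on your system (for example, because it was installed through Quicklisp). The variable name in is arbitrary.

```lisp
;; Assumption: ASDF is available and can locate the flexi-streams system.
(asdf:load-system :flexi-streams)

(with-open-stream (in (flexi-streams:make-flexi-stream
                       (flexi-streams:make-in-memory-input-stream
                        #(72 101 108 108 111))
                       :external-format :utf-8))
  ;; The flexi stream decodes octets to characters on demand,
  ;; so READ-LINE, READ-CHAR and friends work as on any text stream.
  (read-line in))
```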
