Outputting a UTF-8-encoded ByteString with Haskell

Posted on 2024-08-18 14:39:21 · 588 characters · 4 views · 0 comments

I'm going out of my mind trying to simply output UTF-8-encoded data to the console.

I've managed to accomplish this using String, but now I'd like to do the same with ByteString. Is there a nice and fast way to do this?

This is what I've got so far, and it's not working:

import Prelude hiding (putStr)
import Data.ByteString.Char8 (putStr, pack)

main :: IO ()
main = putStr $ pack "čušpajž日本語"

It prints out uapaj~�,�, ugh.

Ideally I'd like an answer for the newest GHC, 6.12.1, although I'd like to hear answers for previous versions as well.

Thanks!

Update: Simply reading and outputting the same UTF-8-encoded line of text seems to work correctly. (Using Data.ByteString.Char8, I just do a putStr =<< getLine.) But packed values from inside the .hs file, as in the above example, refuse to output properly... I must be doing something wrong?

Comments (3)

终弃我 2024-08-25 14:39:21

The utf8-string package supports ByteStrings: Data.ByteString.UTF8.fromString encodes a String as UTF-8 bytes.

import Prelude hiding (putStr)
import Data.ByteString.Char8 (putStr)
import Data.ByteString.UTF8 (fromString)

main :: IO ()
main = putStr $ fromString "čušpajž日本語"
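
If adding utf8-string is not an option, a similar result can probably be had with the text package, by going through Data.Text.Encoding.encodeUtf8. This is only a sketch of that alternative, not part of the answer above; the helper name toUtf8 is made up for illustration, and it assumes the bytestring and text packages are installed and the terminal expects UTF-8.

```haskell
import qualified Data.ByteString    as B
import qualified Data.Text          as T
import qualified Data.Text.Encoding as TE

-- Hypothetical helper: encode a String as a UTF-8 ByteString.
toUtf8 :: String -> B.ByteString
toUtf8 = TE.encodeUtf8 . T.pack

main :: IO ()
main =
  -- B.putStr writes the bytes to stdout unchanged, so the terminal
  -- receives valid UTF-8 whatever the handle's text encoding is.
  B.putStr (toUtf8 "čušpajž日本語\n")
```

Because the bytes are written as-is, the output is only readable on a terminal that decodes UTF-8.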

枯叶蝶 2024-08-25 14:39:21

ByteStrings are strings of bytes. Data.ByteString.Char8.pack truncates every Char to 8 bits, as described in the documentation for Data.ByteString.Char8, so any character above U+00FF is already mangled before it is output. You'll need to encode the text as UTF-8 explicitly, for example via the utf8-string package on Hackage, which supports ByteStrings.
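
To see the truncation concretely, the following sketch compares the code points of three characters from the question with the bytes that Char8.pack actually stores. The helper name packedBytes is invented for this example; the code assumes only the bytestring package.

```haskell
import qualified Data.ByteString       as B
import qualified Data.ByteString.Char8 as C
import Data.Char (ord)

-- Hypothetical helper: the bytes Char8.pack stores for a String.
packedBytes :: String -> [Int]
packedBytes = map fromIntegral . B.unpack . C.pack

main :: IO ()
main = do
  -- Unicode code points of the characters.
  print (map ord "čšž")      -- [269,353,382]
  -- Char8.pack keeps only the low 8 bits of each code point.
  print (packedBytes "čšž")  -- [13,97,126]
```

Bytes 97 and 126 are 'a' and '~', and 13 is a carriage return, which matches the "uapaj~" seen in the question's output.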


However, as of 2011, you should use the text package for fast, packed Unicode output. See also the related question "GHC truncating Unicode character output".

Your example becomes a lot simpler:

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text    as T
import qualified Data.Text.IO as T

main = T.putStrLn "čušpajž日本語"

Like so:

$ runhaskell A.hs
čušpajž日本語
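
Data.Text.IO encodes output with the handle's encoding, which by default follows the locale. If the locale is not a UTF-8 one, the result may still come out wrong. A possible workaround, shown here as a sketch rather than as part of the answer, is to set the encoding on stdout explicitly with hSetEncoding from System.IO (available since GHC 6.12). The name message is just for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text    as T
import qualified Data.Text.IO as T
import System.IO (hSetEncoding, stdout, utf8)

-- Illustrative value; the literal becomes Text via OverloadedStrings.
message :: T.Text
message = "čušpajž日本語"

main :: IO ()
main = do
  -- Force UTF-8 on stdout instead of relying on the locale default.
  hSetEncoding stdout utf8
  T.putStrLn message
```

This keeps the program's output encoding independent of environment variables such as LANG, at the cost of ignoring a user's deliberately non-UTF-8 locale.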