Using Haskell to output a UTF-8-encoded ByteString
I'm going out of my mind trying to simply output UTF-8-encoded data to the console.
I've managed to accomplish this using String, but now I'd like to do the same with ByteString. Is there a nice and fast way to do this?
This is what I've got so far, and it's not working:
import Prelude hiding (putStr)
import Data.ByteString.Char8 (putStr, pack)
main :: IO ()
main = putStr $ pack "čušpajž日本語"
It prints out uapaj~�,�, ugh.
Ideally I'd like an answer for the newest GHC, 6.12.1, although I'd like to hear answers for previous versions as well.
Thanks!
Update: Simply reading and outputting the same UTF-8-encoded line of text seems to work correctly. (Using Data.ByteString.Char8, I just do a putStr =<< getLine.) But packed values from inside the .hs file, as in the above example, refuse to output properly... I must be doing something wrong?
Comments (3)
utf8-string supports bytestrings.
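To make the utf8-string suggestion concrete, here is a minimal sketch. It assumes the utf8-string package is installed and that the source file itself is saved as UTF-8; the name `encoded` is only illustrative and does not come from the question.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.UTF8 as BU  -- provided by the utf8-string package

-- Encode the String as UTF-8 bytes; unlike Data.ByteString.Char8.pack,
-- fromString does not truncate each Char to 8 bits.
encoded :: B.ByteString
encoded = BU.fromString "čušpajž日本語\n"

main :: IO ()
main = B.putStr encoded  -- writes the raw UTF-8 bytes to stdout
```

Whether the text then displays correctly still depends on the terminal interpreting its input as UTF-8.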
Bytestrings are strings of bytes. Data.ByteString.Char8 truncates every Char to 8 bits, as its documentation describes, so anything outside Latin-1 is mangled before it is ever output. You'll need to convert explicitly to UTF-8, via the utf8-string package on Hackage, which contains support for bytestrings.
However, as of 2011, you should use the text package for fast, packed Unicode output (see also the related question "GHC truncating Unicode character output"). With text, your example becomes a lot simpler.
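As a rough sketch of the text-based approach, the program below prints the string from the question with Data.Text.IO and contrasts the truncating Char8 pack with an explicit UTF-8 encoding. It assumes the text and bytestring packages and a UTF-8 source file; the names `sample`, `truncated`, and `utf8Bytes` are illustrative, and forcing stdout to UTF-8 with hSetEncoding is a choice made here for predictability, not something the answer prescribes.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C8
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Data.Text.Encoding (encodeUtf8)
import System.IO (hSetEncoding, stdout, utf8)

sample :: T.Text
sample = T.pack "čušpajž日本語"

-- What the question's code builds: each Char is cut down to its low 8 bits.
truncated :: B.ByteString
truncated = C8.pack "čušpajž日本語"

-- Explicit UTF-8 bytes, for when a ByteString is really needed.
utf8Bytes :: B.ByteString
utf8Bytes = encodeUtf8 sample

main :: IO ()
main = do
  hSetEncoding stdout utf8   -- do not rely on the locale's encoding
  TIO.putStrLn sample        -- Text is encoded by the handle on output
  print (B.length truncated, B.length utf8Bytes)
```

The two lengths differ because the truncated version keeps one byte per character, while the UTF-8 version uses two or three bytes for each non-ASCII character.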
This is a known GHC bug, marked "wontfix".