Haskell HTTP response result is unreadable

Posted on 2024-12-07 06:43:13

import Network.URI
import Network.HTTP
import Network.Browser

get :: URI -> IO String
get uri = do
  let req = Request uri GET [] ""
  resp <- browse $ do
    setAllowRedirects True -- handle HTTP redirects
    request req
  return $ rspBody $ snd resp

main = do
  case parseURI "http://cn.bing.com/search?q=hello" of
    Nothing -> putStrLn "Invalid search"
    Just uri -> do
        body <- get uri
        writeFile "output.txt" body

Here is the diff between the Haskell output and the curl output:

vimdiff

Comments (1)

冷心人i 2024-12-14 06:43:13

Using String as the intermediate data type is probably not a good idea here, because it causes character conversions both when reading the HTTP response and when writing to the file. If these conversions are not consistent, the data can be corrupted, which appears to be what is happening here.
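To make the conversion problem concrete, here is a minimal sketch that needs no network access. It assumes the response bytes end up in a String as one Char per byte, and that the output handle encodes text as UTF-8 (pinned explicitly below so the result does not depend on the locale). The helper name writeAsUtf8Text and the file name demo-text.bin are made up for illustration.

```haskell
import qualified Data.ByteString as B
import System.IO

-- | Write a String through a text-mode handle that encodes as UTF-8,
-- then read the file back as raw byte values.
-- (Hypothetical helper, not part of the HTTP package.)
writeAsUtf8Text :: FilePath -> String -> IO [Int]
writeAsUtf8Text path s = do
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8 -- assumption: the text encoding in effect is UTF-8
    hPutStr h s
  map fromIntegral . B.unpack <$> B.readFile path

main :: IO ()
main = do
  -- Three "response bytes" held in a String, one Char per byte.
  let body = map toEnum [0x48, 0xE9, 0xFF] :: String
  written <- writeAsUtf8Text "demo-text.bin" body
  print (map fromEnum body) -- [72,233,255]
  print written             -- [72,195,169,195,191]
```

Every byte above 0x7F comes out as two bytes, so the file no longer matches what the server sent.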

Since you just want to copy the bytes directly, it's better to use a ByteString. I've chosen to use a lazy ByteString here, so that it does not have to be loaded into memory all at once, but can be streamed lazily into the file, just like with String.

import Network.URI
import Network.HTTP
import Network.Browser
import qualified Data.ByteString.Lazy as L

get :: URI -> IO L.ByteString
get uri = do
  let req = Request uri GET [] L.empty
  resp <- browse $ do
    setAllowRedirects True -- handle HTTP redirects
    request req
  return $ rspBody $ snd resp

main = do
  case parseURI "http://cn.bing.com/search?q=hello" of
    Nothing -> putStrLn "Invalid search"
    Just uri -> do
        body <- get uri
        L.writeFile "output.txt" body

Fortunately, the functions in Network.Browser are overloaded, so switching to lazy bytestrings only requires changing the request body to L.empty, replacing writeFile with L.writeFile, and updating the type signature of get.
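As a quick check that the ByteString route leaves the data alone, the following sketch writes a few bytes with L.writeFile and reads them back. It uses only the bytestring package; the helper name copyBytes and the file name demo-bytes.bin are hypothetical, and the three bytes stand in for a real response body.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L

-- | Write a lazy ByteString to a file and read the file back as raw
-- byte values. (Hypothetical helper for illustration only.)
copyBytes :: FilePath -> L.ByteString -> IO [Int]
copyBytes path body = do
  L.writeFile path body -- writes the bytes as-is, no text encoding involved
  map fromIntegral . B.unpack <$> B.readFile path

main :: IO ()
main = do
  -- Stand-in for a response body, including bytes above 0x7F.
  let body = L.pack [0x48, 0xE9, 0xFF]
  written <- copyBytes "demo-bytes.bin" body
  print written -- [72,233,255]
```

The bytes that come back are the bytes that went in, which is the behaviour wanted when saving a response to disk for comparison with curl.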
