Strange HTTP problem/error with Lisp
I'm attempting to learn a little more about handling sockets and network connections in SBCL, so I wrote a simple wrapper for HTTP. Thus far, it merely makes a stream and performs a request to ultimately get the header data and page content of a website.
Until now, it has worked somewhat decently. Nothing to write home about, but it at least worked.
I have come across a strange problem, however: I keep getting "400 Bad Request" errors.
At first, I was somewhat leery about how I was processing the HTTP requests (more or less passing a request string as a function argument), so I made a function that formats a query string with all the parts I need and returns it for later use... but I still get errors.
What's even odder is that the errors don't happen every time. If I try the script on a page like Google, I get a "200 OK" return value... but at other times on other sites, I'll get "400 Bad Request".
I'm certain it's a problem with my code, but I'll be damned if I know exactly what is causing it.
Here is the code that I am working with:
(use-package :sb-bsd-sockets)

(defun read-buf-nonblock (buffer stream)
  (let ((eof (gensym)))
    (do ((i 0 (1+ i))
         (c (read-char stream nil eof)
            (read-char-no-hang stream nil eof)))
        ((or (>= i (length buffer)) (not c) (eq c eof)) i)
      (setf (elt buffer i) c))))

(defun http-connect (host &optional (port 80))
  "Create I/O stream to given host on a specified port"
  (let ((socket (make-instance 'inet-socket
                               :type :stream
                               :protocol :tcp)))
    (socket-connect
     socket (car (host-ent-addresses (get-host-by-name host))) port)
    (let ((stream (socket-make-stream socket
                                      :input t
                                      :output t
                                      :buffering :none)))
      stream)))

(defun http-request (stream request &optional (buffer 1024))
  "Perform HTTP request on a specified stream"
  (format stream "~a~%~%" request )
  (let ((data (make-string buffer)))
    (setf data (subseq data 0
                       (read-buf-nonblock data
                                          stream)))
    (princ data)
    (> (length data) 0)))

(defun request (host request)
  "Formatted HTTP request"
  (format nil "~a HTTP/1.0 Host: ~a" request host))

(defun get-page (host &optional (request "GET /"))
  "simple demo to get content of a page"
  (let ((stream (http-connect host)))
    (http-request stream (request host request))))
Comments (2)
A few things. First, to your concern about the 400 errors you are getting back, a few possibilities come to mind:
Some other, more general pointers to help you along your way:
(read-buf-nonblock) is very confusing. Where is the symbol 'c' defined? Why is 'eof' (gensym)ed and then not assigned any value? It looks very much like a byte-by-byte copy taken straight out of an imperative program and plopped into Lisp. It looks like what you have reimplemented here is (read-sequence). Look it up in the Common Lisp HyperSpec and see if it is what you need.
The other half of this is to set the socket you created to be non-blocking. This is pretty easy, even though the SBCL documentation is almost silent on the topic. Use this:
(socket-make-stream socket
                    :input t
                    :output t
                    :buffering :none
                    :timeout 0)
The last (let) form of (http-connect) isn't necessary. Just evaluate
(socket-make-stream socket
                    :input t
                    :output t
                    :buffering :none)
without the let, and http-connect should still return the right value.
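As a rough sketch of that simplification, here is how (http-connect) might look with the inner (let) removed. It assumes SBCL with the sb-bsd-sockets contrib available, and uses package-qualified names instead of the question's (use-package) purely so the snippet stands on its own. The behaviour is meant to match the original function.

```lisp
;; Load the sockets contrib early enough that the package-qualified
;; symbols below can be read (SBCL-specific).
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :sb-bsd-sockets))

(defun http-connect (host &optional (port 80))
  "Open a TCP connection to HOST on PORT and return a two-way stream."
  (let ((socket (make-instance 'sb-bsd-sockets:inet-socket
                               :type :stream
                               :protocol :tcp)))
    (sb-bsd-sockets:socket-connect
     socket
     (car (sb-bsd-sockets:host-ent-addresses
           (sb-bsd-sockets:get-host-by-name host)))
     port)
    ;; The value of the last form is the function's return value,
    ;; so no extra LET binding is needed around it.
    (sb-bsd-sockets:socket-make-stream socket
                                       :input t
                                       :output t
                                       :buffering :none)))
```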
Replace:
with
and make (read-buf-nonblock) return the string of data, rather than having it assign within the function. So where you have buffer being assigned, create a variable buffer within and then return it. What you are doing is called relying on "side-effects," and it tends to produce more errors, and errors that are harder to find. Use it only when you have to, especially in a language that makes it easy not to depend on them.
Yikes, hands hurt. But hopefully this helps. Done typing. :-)
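The (read-sequence) suggestion and the advice to return the data instead of mutating a caller-supplied buffer could be combined roughly as below. The name read-response and its size parameter are made up for illustration. Note one assumption: on a blocking stream, (read-sequence) waits until the buffer is full or end-of-file is reached, which suits HTTP/1.0 only because the server normally closes the connection after responding.

```lisp
(defun read-response (stream &optional (size 1024))
  "Read up to SIZE characters from STREAM and return them as a fresh string."
  (let* ((buffer (make-string size))
         ;; READ-SEQUENCE returns the index of the first element it did
         ;; not fill, i.e. the number of characters actually read.
         (end (read-sequence buffer stream)))
    (subseq buffer 0 end)))
```

Any character input stream works, so it can be tried without a network connection: (with-input-from-string (s "HTTP/1.0 200 OK") (read-response s 8)) returns "HTTP/1.0".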
Here's a possibility:
HTTP/1.0 defines the sequence CR LF as the end-of-line marker.
The ~% format directive is generating a #\Newline (LF on most platforms, though see CLHS). Some sites may be tolerant of the missing CR, others not so much.
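One way to act on this, as a minimal sketch: build the request with explicit CR LF pairs instead of ~%. The function name crlf-request and its path parameter are hypothetical and not part of the question's code. #\Return and #\Linefeed are semi-standard character names, which SBCL supports.

```lisp
(defun crlf-request (host &optional (path "/"))
  "Return an HTTP/1.0 GET request for PATH on HOST, each line ended by CR LF."
  (let ((crlf (coerce (list #\Return #\Linefeed) 'string)))
    ;; Request line, Host header, then an empty line that ends the headers.
    (format nil "GET ~a HTTP/1.0~aHost: ~a~a~a" path crlf host crlf crlf)))
```

The result would then be sent with (write-string (crlf-request "example.com") stream) followed by (finish-output stream), rather than through (format stream "~a~%~%" ...).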