How to detect HTTP requests in Python with Twisted?
I am learning network programming using Twisted 10 in Python. In the code below, is there any way to detect an HTTP request when data is received? Can I also retrieve the domain name, subdomain, and port from it, and discard the data if it is not HTTP?
from twisted.internet import stdio, reactor, protocol
from twisted.protocols import basic
import re

class DataForwardingProtocol(protocol.Protocol):
    def __init__(self):
        self.output = None
        self.normalizeNewlines = False

    def dataReceived(self, data):
        if self.normalizeNewlines:
            data = re.sub(r"(\r\n|\n)", "\r\n", data)
        if self.output:
            self.output.write(data)

class StdioProxyProtocol(DataForwardingProtocol):
    def connectionMade(self):
        inputForwarder = DataForwardingProtocol()
        inputForwarder.output = self.transport
        inputForwarder.normalizeNewlines = True
        stdioWrapper = stdio.StandardIO(inputForwarder)
        self.output = stdioWrapper
        print "Connected to server. Press ctrl-C to close connection."

class StdioProxyFactory(protocol.ClientFactory):
    protocol = StdioProxyProtocol

    def clientConnectionLost(self, transport, reason):
        reactor.stop()

    def clientConnectionFailed(self, transport, reason):
        print reason.getErrorMessage()
        reactor.stop()

if __name__ == '__main__':
    import sys
    if not len(sys.argv) == 3:
        print "Usage: %s host port" % __file__
        sys.exit(1)
    reactor.connectTCP(sys.argv[1], int(sys.argv[2]), StdioProxyFactory())
    reactor.run()
Comments (2)
protocol.dataReceived, which you're overriding, is too low-level for this purpose without smart buffering, which you're not doing. Per its documentation, data arrives in chunks of indeterminate length, and partial (or multiple) protocol messages may be received in a single call, so you will probably need to buffer.
You appear to be completely ignoring this crucial part of the docs.
You could instead use LineReceiver.lineReceived (inheriting from protocols.basic.LineReceiver, of course) to take advantage of the fact that HTTP requests come in "lines". You'll still need to join up headers that are sent as multiple lines, since header lines beginning with a space or tab are continuations of the previous header line.
Once you have a nicely formatted/parsed request (consider studying twisted.web's sources to see one way it could be done), the Host header (see RFC 2616, section 14.23) is the one containing this info.
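The buffering and header-joining steps described above can be sketched without Twisted at all. This is a minimal illustration under stated assumptions, not the answerer's code: HttpRequestSniffer, NotHttpError, split_host, and split_domain are hypothetical names, it is written for Python 3 (bytes in, str out), and the domain/subdomain split naively treats the last two labels as the registered domain, so it would mis-split hosts such as www.example.co.uk.

```python
HTTP_METHODS = {b"GET", b"HEAD", b"POST", b"PUT", b"DELETE",
                b"OPTIONS", b"TRACE", b"CONNECT", b"PATCH"}


class NotHttpError(ValueError):
    """Raised when the buffered data does not look like an HTTP request."""


def split_host(host_header, default_port=80):
    """Split a Host header value into (host, port)."""
    host, colon, port = host_header.rpartition(":")
    if colon and port.isdigit():
        return host, int(port)
    return host_header, default_port


def split_domain(host):
    """Naive split into (domain, subdomain); assumes a two-label domain."""
    labels = host.split(".")
    # Leave short names and dotted IPv4 addresses untouched.
    if len(labels) <= 2 or host.replace(".", "").isdigit():
        return host, ""
    return ".".join(labels[-2:]), ".".join(labels[:-2])


class HttpRequestSniffer:
    """Buffers raw chunks until a complete header block has arrived."""

    def __init__(self, max_header_bytes=65536):
        self._buffer = b""
        self._max = max_header_bytes

    def feed(self, data):
        """Return None while incomplete, a dict once parsed.

        Raises NotHttpError if the data is not an HTTP request.
        """
        self._buffer += data
        head, sep, _body = self._buffer.partition(b"\r\n\r\n")
        if not sep:
            if len(self._buffer) > self._max:
                raise NotHttpError("header block too large")
            return None  # partial message: wait for more data
        return self._parse(head)

    def _parse(self, head):
        lines = head.split(b"\r\n")
        parts = lines[0].split(b" ")
        if (len(parts) != 3 or parts[0] not in HTTP_METHODS
                or not parts[2].startswith(b"HTTP/")):
            raise NotHttpError("bad request line")
        headers, last = {}, None
        for line in lines[1:]:
            # A line starting with space or tab continues the previous header.
            if line[:1] in (b" ", b"\t") and last is not None:
                headers[last] += b" " + line.strip()
                continue
            name, colon, value = line.partition(b":")
            if not colon:
                raise NotHttpError("bad header line")
            last = name.strip().lower()
            headers[last] = value.strip()
        # A missing Host header (e.g. HTTP/1.0) yields empty strings.
        host, port = split_host(headers.get(b"host", b"").decode("latin-1"))
        domain, subdomain = split_domain(host)
        return {"method": parts[0].decode("ascii"),
                "path": parts[1].decode("latin-1"),
                "host": host, "port": port,
                "domain": domain, "subdomain": subdomain}
```

In a Twisted protocol, dataReceived could pass each chunk to feed(), keep waiting while it returns None, and call self.transport.loseConnection() when NotHttpError is raised.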
Based on what you seem to be attempting, I think the following would be the path of least resistance:
http://twistedmatrix.com/documents/10.0.0/api/twisted.web.proxy.html
That's the Twisted module for building an HTTP proxy. It will let you intercept the requests, look at the destination, and look at the sender. You can also look at all the headers and the content going back and forth. You seem to be trying to rewrite the HTTP protocol and proxy classes that Twisted already provides for you. I hope this helps.