Invalid Unicode/XML with Python SimpleXMLRPCServer?

Posted on 2024-10-06 01:32:04

I am getting the following error on the client side when I pass invalid XML characters to a Python SimpleXMLRPCServer:

Fault: <Fault 1: "<class 'xml.parsers.expat.ExpatError'>:not well-formed (invalid token): line 6, column 15">

Why? Do I have to change the SimpleXMLRPCServer library code to fix this?

Here is my XML-RPC server code:

from SimpleXMLRPCServer import SimpleXMLRPCServer

import logging
logging.basicConfig(level=logging.DEBUG)

def tt(text):
    return "cool"

server = SimpleXMLRPCServer(("0.0.0.0", 9000))
server.register_introspection_functions()
server.register_function(tt)

# Run the server's main loop
server.serve_forever()

Here is my XML-RPC client code:

import xmlrpclib

s = xmlrpclib.ServerProxy('http://localhost:9000')
s.tt(unichr(0x8))  # U+0008 (backspace) is not a legal XML character

On the server side, I don't get ANY error or traceback:

liXXXXXX.members.linode.com - - [06/Dec/2010 23:19:40] "POST /RPC2 HTTP/1.0" 200 -

Why no error on the server side? How do I diagnose what is going on?

And I get the following traceback on the client side:

/usr/lib/python2.6/xmlrpclib.pyc in __call__(self, *args)
   1197         return _Method(self.__send, "%s.%s" % (self.__name, name))
   1198     def __call__(self, *args):
-> 1199         return self.__send(self.__name, args)
   1200 
   1201 ##


/usr/lib/python2.6/xmlrpclib.pyc in __request(self, methodname, params)
   1487             self.__handler,
   1488             request,
-> 1489             verbose=self.__verbose
   1490             )
   1491 

/usr/lib/python2.6/xmlrpclib.pyc in request(self, host, handler, request_body, verbose)
   1251             sock = None
   1252 
-> 1253         return self._parse_response(h.getfile(), sock)
   1254 
   1255     ##


/usr/lib/python2.6/xmlrpclib.pyc in _parse_response(self, file, sock)
   1390         p.close()
   1391 
-> 1392         return u.close()
   1393 
   1394 ##


/usr/lib/python2.6/xmlrpclib.pyc in close(self)
    836             raise ResponseError()
    837         if self._type == "fault":
--> 838             raise Fault(**self._stack[0])
    839         return tuple(self._stack)
    840 

Fault: <Fault 1: "<class 'xml.parsers.expat.ExpatError'>:not well-formed (invalid token): line 6, column 15">

How do I get sane server-side processing if the input contains invalid XML?
Can I clean up this data server side? How?

Comments (3)

恍梦境° 2024-10-13 01:32:04

First, your example doesn't work for me, either. I don't know what you mean by "sane server-side processing if the input contains invalid XML": you send the server invalid XML, and it gives you back an error. What more do you want?

Second, if you stick a print 'hi there' in tt, you will see that tt is not being called when you send unichr(0x8). The exact response (a 200) from the server is:

HTTP/1.0 200 OK
Server: BaseHTTP/0.3 Python/2.6.5
Date: Tue, 07 Dec 2010 07:33:09 GMT
Content-type: text/xml
Content-length: 350

<?xml version='1.0'?>
<methodResponse>
<fault>
<value><struct>
<member>
<name>faultCode</name>
<value><int>1</int></value>
</member>
<member>
<name>faultString</name>
<value><string>&lt;class 'xml.parsers.expat.ExpatError'&gt;:not well-formed (invalid token): line 6, column 15</string></value>
</member>
</struct></value>
</fault>
</methodResponse>

So, you see your error message.
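To make that concrete from the client's point of view, here is a minimal sketch of how a fault response turns into an exception. It uses the Python 3 module name xmlrpc.client (the Python 2 equivalent is xmlrpclib); the helper name describe_fault is hypothetical, and the fault text is simply copied from the response quoted above.

```python
import xmlrpc.client

# Build a fault response like the one quoted above (the text is copied
# from the server response for illustration).
fault_xml = xmlrpc.client.dumps(
    xmlrpc.client.Fault(
        1,
        "<class 'xml.parsers.expat.ExpatError'>:"
        "not well-formed (invalid token): line 6, column 15",
    )
)


def describe_fault(response_xml):
    """Return (faultCode, faultString) if the response is a fault, else None."""
    try:
        xmlrpc.client.loads(response_xml)
    except xmlrpc.client.Fault as fault:
        return fault.faultCode, fault.faultString
    return None


code, message = describe_fault(fault_xml)
print(code, message)
```

In real client code the same except clause would wrap the call on the ServerProxy, so the fault code and string can be logged instead of surfacing as an unhandled traceback.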

Now, according to the XML-RPC spec,

  • What characters are allowed in strings? Non-printable characters? Null characters? Can a "string" be used to hold an arbitrary chunk of binary data?

Any characters are allowed in a string except < and &, which are encoded as &lt; and &amp;. A string can be used to encode binary data.

Ok, but this is XML, and according to the XML spec:

Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646.

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

This doesn't include 0x08, and it seems to completely contradict the XML-RPC spec! So it would seem that your XML parser (which, judging from the error, looks to be expat) implements the XML spec fairly rigorously. Since XML doesn't allow 0x08, you can't send 0x08, and indeed, you get an error back.

If we do:

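import xml.parsers.expat

# The same request the client sends, with the raw 0x08 byte inside <string>.
data = "<?xml version='1.0'?>\n<methodCall>\n<methodName>tt</methodName>\n<params>\n<param>\n<value><string>\x08</string></value>\n</param>\n</params>\n</methodCall>"
p = xml.parsers.expat.ParserCreate()
p.Parse(data, True)  # raises ExpatError: not well-formed (invalid token)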

...we get your error. Again, you are passing garbage XML to the server, the server is passing you back an error message, and the Python in the middle is presenting that error to you as an exception. What behavior did you expect?
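The Char production quoted above can be turned into a check that runs before anything is sent. The following is a rough sketch in Python 3, not part of xmlrpclib; the names _ILLEGAL_XML_CHARS, find_illegal_xml_chars and is_xml_safe are made up for illustration.

```python
import re

# Complement of the XML 1.0 Char production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_illegal_xml_chars(text):
    """Return (index, code point) pairs for characters XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in _ILLEGAL_XML_CHARS.finditer(text)]


def is_xml_safe(text):
    """True if every character in text is a legal XML 1.0 character."""
    return _ILLEGAL_XML_CHARS.search(text) is None


print(find_illegal_xml_chars("ab\x08c"))
print(is_xml_safe("tab\tand newline\n"))
```

A client could call is_xml_safe on each string argument and refuse to send the request when it returns False, which reports the problem at its source rather than as a parser fault from the server.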

白首有我共你 2024-10-13 01:32:04

You indicated in your comment that you would like to handle as much of the XML for the client as possible. While this may sound good at first sight, there are drawbacks to consider:

  • How do you know what you can strip? Maybe you strip something that would have been important, but the client sent it badly encoded.

  • Imagine that initially you support requests with one particular malformation. Then users start to send you a second type of malformation, and you add an exception for that one too (once you have added one for the first, why not?). This is a long way down the road...

  • It is better to let things fail as soon as possible and have them dealt with where they should be. This time the client implementation is wrong, so let the client fix it. That is better for both of you in the long run.

If you manage the client code too, then as a last resort you could push some XML tidying onto it (see BeautifulSoup, for example). But it is better to deal with the problem by disallowing invalid input in the first place.

窝囊感情。 2024-10-13 01:32:04

Thanatos perfectly explained the reason for your problem in his post.

As for a workaround: you can use xmlrpclib.Binary to base64-encode the data to be sent. (For PY3K: xmlrpc.client.Binary)
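As a small illustration of the Binary workaround, the sketch below marshals a request by hand instead of going through a live server, so it can be run on its own. It uses the Python 3 module xmlrpc.client; the method name tt comes from the question, and the variable names are only illustrative.

```python
import xmlrpc.client

raw = b"\x08"  # the byte that is illegal as a plain XML character

# Wrap the payload in Binary so it travels base64-encoded in a <base64> element.
request = xmlrpc.client.dumps((xmlrpc.client.Binary(raw),), methodname="tt")

# What the receiving side sees after unmarshalling the request.
params, method = xmlrpc.client.loads(request)
received = params[0].data  # the original bytes, recovered from the Binary wrapper

print(method, received)
```

With a real ServerProxy the call would look like s.tt(xmlrpc.client.Binary(raw)), and the server function would read the bytes from the argument's data attribute.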
