Why does creating a neo4j.GraphDatabase from within a Paste app cause a segfault?

Posted on 2024-12-12 19:31:13

The following code causes Java to segfault:

import os.path
import neo4j
from paste import httpserver, fileapp
import tempfile
from webob.dec import wsgify
from webob import Response, Request

HOST = '127.0.0.1'
PORT = 8080

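class DebugApp(object):
    @wsgify
    def __call__(self, req):
        # Alternative store location: a fresh temporary directory.
        # db = neo4j.GraphDatabase(tempfile.mkdtemp())
        # Opening the database here, once per request, is what triggers the crash.
        db = neo4j.GraphDatabase(os.path.join(os.path.dirname(os.path.abspath(__file__)), 'data'))
        return Response(body='it worked')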

def main():
    app = DebugApp()
    httpserver.serve(app, host=HOST, port=PORT)

if __name__ == '__main__':
    main()

To reproduce, first save that code into a file (say, app.py), and then run python app.py. Then try http://localhost:8080 in your browser; you should see the Java crash handler.

The top of the Java stack trace looks like this:

Stack: [0xb42e7000,0xb4ae8000],  sp=0xb4ae44f0,  free space=8181k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [_jpype.so+0x26497]  JPJavaEnv::NewObjectA(_jclass*, _jmethodID*, jvalue*)+0x37
C  [_jpype.so+0x3c0e8]  JPMethodOverload::invokeConstructor(_jclass*, std::vector<HostRef*, std::allocator<HostRef*> >&)+0x178
C  [_jpype.so+0x3a417]  JPMethod::invokeConstructor(std::vector<HostRef*, std::allocator<HostRef*> >&)+0x47
C  [_jpype.so+0x1beba]  JPClass::newInstance(std::vector<HostRef*, std::allocator<HostRef*> >&)+0x2a
C  [_jpype.so+0x67b9c]  PyJPClass::newClassInstance(_object*, _object*)+0xfc
C  [python+0x96822]  PyEval_EvalFrameEx+0x4332
C  [python+0x991e7]  PyEval_EvalCodeEx+0x127

I believe this is neo4j.GraphDatabase in Python triggering JPype to construct an EmbeddedGraphDatabase from neo4j on the Java side.

Running this code in an interactive Python session doesn't segfault:

>>> import webob
>>> import app
>>> debug_app = app.DebugApp()
>>> response = debug_app(webob.Request.blank('/'))
>>> response.body
'it worked'

Presumably that's because I'm avoiding Paste altogether in that example. Perhaps this has something to do with Paste's use of threads getting in the way of neo4j? I noted a somewhat similar problem in the neo4j forums: http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-CPython-Pylons-and-threading-td942435.html

...but that only occurs on shutdown.
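One way to check the threading hypothesis is to record which thread serves each request. The following is only a minimal sketch: `ThreadRecorder` is a hypothetical helper, not part of Paste or neo4j, and it uses nothing beyond the standard WSGI calling convention.

```python
import threading


class ThreadRecorder(object):
    """WSGI middleware (hypothetical helper) that notes the serving thread."""

    def __init__(self, app):
        self.app = app
        # One (thread name, is main thread) tuple per handled request.
        self.seen = []

    def __call__(self, environ, start_response):
        current = threading.current_thread()
        self.seen.append((current.name, current is threading.main_thread()))
        # Hand the request on to the wrapped application unchanged.
        return self.app(environ, start_response)
```

Wrapping the application before serving it, for example httpserver.serve(ThreadRecorder(DebugApp()), host=HOST, port=PORT), and then inspecting `seen` would show whether requests are handled off the main thread, which would be consistent with the threading explanation.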

Comments (1)

独﹏钓一江月 2024-12-19 19:31:13

The issue is not with Paste per se, but with the neo4j Python bindings, which use JPype. Paste creates threads to handle incoming requests; neo4j is supposed to be thread-safe, but JPype comes with this caveat from the documentation (1):

"For the most part, python threads based on OS level threads (i.e posix threads), will work without problem. The only thing to remember is to call jpype.attachThreadToJVM() in the thread body to make the JVM usable from that thread. For threads that you do not start yourself, you can call isThreadAttachedToJVM() to check."

I couldn't find the code that does this, but I think that some of the code in the neo4j bindings may call attachThreadToJVM at import time. If so, only the importing thread is attached. When Paste hands a request to a worker thread and that thread then goes to fetch data from neo4j, the call crosses thread boundaries, and the JVM attachment rule may not be satisfied.
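The attachment rule from the quoted documentation could be applied per call roughly as follows. This is a sketch under assumptions: `ensure_attached` and `attached` are hypothetical helpers, and the `jvm` argument stands in for the `jpype` module (it is passed in so that the snippet does not itself depend on JPype being installed). Only the two function names taken from the quote are used.

```python
import functools


def ensure_attached(jvm):
    """Attach the calling thread to the JVM unless it is already attached.

    `jvm` is assumed to expose isThreadAttachedToJVM() and
    attachThreadToJVM(), the two functions named in the JPype documentation
    quoted above. In real use this would be the `jpype` module itself.
    """
    if not jvm.isThreadAttachedToJVM():
        jvm.attachThreadToJVM()


def attached(jvm):
    """Decorator (hypothetical helper): attach the thread before each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            ensure_attached(jvm)
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

In the question's example, this would mean calling ensure_attached(jpype) at the top of DebugApp.__call__, before neo4j.GraphDatabase(...) is constructed. Whether that is enough to prevent the crash is untested here, and newer JPype releases may handle thread attachment differently from the documentation quoted above.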

You can avoid the crash by importing and using neo4j from within a single thread only. If you start that thread yourself, this means doing the import inside the callable passed as the target of threading.Thread.

Unfortunately, this means that even though neo4j is thread-safe, it must be constrained to a single thread when used from Python. But that's not too disappointing, considering.
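The single-thread constraint can be sketched with a dedicated worker thread that runs every database call, while request threads only hand work to it and wait for the result. `SingleThreadRunner` is a hypothetical helper written for illustration, not part of neo4j, JPype, or Paste, and it assumes the Python 3 `queue` module name.

```python
import queue
import threading


class SingleThreadRunner(object):
    """Run every submitted callable on one dedicated thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Everything executed here shares the same thread, so an import of
        # neo4j and all later database calls would stay on that thread.
        while True:
            func, args, result = self._tasks.get()
            try:
                result.put((True, func(*args)))
            except Exception as exc:
                # Hand the error back so it is raised in the calling thread.
                result.put((False, exc))

    def call(self, func, *args):
        """Block until func(*args) has run on the dedicated thread."""
        result = queue.Queue(maxsize=1)
        self._tasks.put((func, args, result))
        ok, value = result.get()
        if ok:
            return value
        raise value
```

With something like this, the request handler would call runner.call(open_db, path), where `open_db` is a hypothetical function that performs import neo4j and constructs the GraphDatabase. The cost is that all database work is serialized through one thread.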

Update: the maintainers responded (2), investigated the problem, and checked in a fix. I don't know which release of neo4j included it, and I can no longer find the commit in their GitHub repo (3), so this needs re-testing.
