Deserialize in a different language

Posted 2024-07-04 14:21:34

The log4j network adapter sends events as serialised Java objects. I would like to be able to capture such an object and deserialise it in a different language (Python). Is this possible?

Note: capturing the network traffic is easy; it's just a TCP socket and reading from a stream. The difficulty is the deserialising part.

Comments (6)

假装爱人 2024-07-11 14:21:34

If you can have a JVM on the receiving side and the class definitions for the serialized data, and you only want to use Python and no other language, then you may use Jython:

  • you would deserialize what you received using the correct Java methods
  • and then you process what you get with your Python code

一个人的夜不怕黑 2024-07-11 14:21:34

In theory it's possible. How difficult it is in practice depends on whether the Java serialization format is documented. My guess was that it isn't. (Edit: oops, I was wrong, thanks Charles.)

Anyway, this is what I suggest you do:

  1. Capture the stream from log4j and deserialize the Java object in your own little Java program.

  2. Now that you have the object again, serialize it using your own custom formatter.

    Tip: Maybe you don't even have to write your own custom formatter. For example, JSON (scroll down for libs) has libraries for Python and Java, so you could in theory use the Java library to serialize your objects and the equivalent Python library to deserialize them.

  3. Send the output stream to your Python application and deserialize it there.
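
The Python end of step 3 might look like the sketch below. It assumes the intermediate Java program writes one JSON object per line; that framing and the field names (`level`, `logger`, `message`) are assumptions made for illustration, not something log4j produces on its own.

```python
import io
import json


def read_events(stream):
    """Yield one dict per non-empty line of a newline-delimited JSON stream."""
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            continue  # tolerate blank keep-alive lines
        yield json.loads(line.decode("utf-8"))


# Stand-in for a connected socket; the field names are hypothetical.
fake_socket = io.BytesIO(
    b'{"level": "INFO", "logger": "app.db", "message": "connected"}\n'
    b'\n'
    b'{"level": "ERROR", "logger": "app.db", "message": "timeout"}\n'
)
events = list(read_events(fake_socket))
```

In a real receiver the `io.BytesIO` object would be replaced by `sock.makefile("rb")` on a connected `socket.socket`, which can be iterated line by line in the same way.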

Charles wrote:

The problem is that for this to work, your "little java program" needs to load the same versions of all the same classes that it might deserialize. Which is tricky if you're receiving log messages from one app, and really tricky if you're multiplexing more than one log stream. Either way, it's not going to be a little program any more.

Can't you simply reference the Java log4j libraries in your own Java process? I'm just giving general advice here that applies to any pair of languages (the title of the question is pretty language-agnostic, so I provided one of the generic solutions). Anyway, I'm not familiar with log4j and don't know whether you can "inject" your own serializer into it. If you can, then of course your suggestion is much better and cleaner.

尸血腥色 2024-07-11 14:21:34

I would recommend moving to a third-party format (by creating your own log4j adapters etc) that both languages understand and can easily marshal / unmarshal, e.g. XML.
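
On the Python side, unmarshalling such an XML event needs only the standard library. The sketch below is a minimal illustration; the `<event>` element and its attribute names are a made-up shape, not the output of any particular log4j layout or adapter.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML shape; the element and attribute names are assumptions.
SAMPLE = """<event logger="app.db" level="WARN" timestamp="1720102894000">
  <message>connection pool nearly exhausted</message>
</event>"""


def parse_event(xml_text):
    """Turn one <event> element into a plain dict."""
    root = ET.fromstring(xml_text)
    return {
        "logger": root.get("logger"),
        "level": root.get("level"),
        "timestamp_ms": int(root.get("timestamp")),
        "message": root.findtext("message", default=""),
    }


event = parse_event(SAMPLE)
```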

奈何桥上唱咆哮 2024-07-11 14:21:34

Theoretically, it's possible. Java Serialization, like pretty much everything in Javaland, is standardized, so you could implement a deserializer according to that standard in Python. However, the Java Serialization format is not designed for cross-language use; it is closely tied to the way objects are represented inside the JVM. While implementing a JVM in Python is surely a fun exercise, it's probably not what you're looking for (-:
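
As a rough illustration of where such a Python deserializer would start, the sketch below reads the stream header and a single top-level string. The magic number 0xACED, the version 5, and the TC_STRING type code 0x74 come from the Java Object Serialization Specification; the function names are invented for this example, and it assumes an ASCII-only string (Java actually writes a modified UTF-8). Real log4j events are full objects with class descriptors, which this sketch does not attempt to handle.

```python
import io
import struct

STREAM_MAGIC = 0xACED    # first two bytes of every Java serialization stream
STREAM_VERSION = 5       # protocol version written by ObjectOutputStream
TC_STRING = 0x74         # type code for a short string (2-byte length prefix)


def read_exact(stream, n):
    """Read exactly n bytes or raise if the stream ends early."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("stream ended after %d of %d bytes" % (len(data), n))
    return data


def read_header(stream):
    """Validate the 4-byte stream header and return the version."""
    magic, version = struct.unpack(">HH", read_exact(stream, 4))
    if magic != STREAM_MAGIC:
        raise ValueError("not a Java serialization stream")
    return version


def read_string(stream):
    """Read one TC_STRING record (hypothetical helper, ASCII only)."""
    (type_code,) = struct.unpack(">B", read_exact(stream, 1))
    if type_code != TC_STRING:
        raise ValueError("unsupported type code 0x%02x" % type_code)
    (length,) = struct.unpack(">H", read_exact(stream, 2))
    return read_exact(stream, length).decode("ascii")


# Header, TC_STRING, length 2, then the bytes of "hi".
sample = io.BytesIO(bytes.fromhex("aced0005" "74" "0002" "6869"))
version = read_header(sample)
text = read_string(sample)
```

Everything beyond this point (class descriptors, field layouts, back-references, custom `writeObject()` data) is where the real effort lies.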

There are other (data) serialization formats that are specifically designed to be language agnostic. They usually work by stripping the data model down to the bare minimum (number, string, sequence, dictionary and that's it) and thus require a bit of work on both ends to represent a rich object as a graph of dumb data structures (and vice versa).

Two examples are JSON (JavaScript Object Notation) and YAML (YAML Ain't Markup Language).
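
The "bit of work on both ends" can be sketched in Python as follows. The `LogEvent` class is a hypothetical stand-in for a rich object, not log4j's actual event class; the point is only the round trip through plain dicts, lists, and strings.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class LogEvent:
    """A hypothetical 'rich' object; not log4j's actual LoggingEvent."""
    logger: str
    level: str
    message: str
    tags: list


def to_wire(event):
    """Flatten the object to dicts/lists/strings, then encode as JSON."""
    return json.dumps(asdict(event), sort_keys=True)


def from_wire(text):
    """Decode JSON and rebuild the rich object from the plain dict."""
    return LogEvent(**json.loads(text))


original = LogEvent("app.db", "INFO", "connected", ["startup", "db"])
wire = to_wire(original)
restored = from_wire(wire)
```

The Java side would need the mirror image of `to_wire`, which is exactly the extra work the language-agnostic formats ask for.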

ASN.1 (Abstract Syntax Notation One) is another data serialization format. Instead of dumbing the format down to a point where it can be easily understood, ASN.1 is self-describing, meaning all the information needed to decode a stream is encoded within the stream itself.

And, of course, XML (eXtensible Markup Language), will work too, provided that it is not just used to provide textual representation of a "memory dump" of a Java object, but an actual abstract, language-agnostic encoding.

So, to make a long story short: your best bet is to coerce log4j into logging in one of the above-mentioned formats, to replace log4j with something that does, or to somehow intercept the objects before they are sent over the wire and convert them before they leave Javaland.

Libraries that implement JSON, YAML, ASN.1 and XML are available for both Java and Python (and pretty much every programming language known to man).

提笔落墨 2024-07-11 14:21:34

Generally, no.

The stream format for Java serialization is defined in this document, but you need access to the original class definitions (and a Java runtime to load them into) to turn the stream data back into something approaching the original objects. For example, classes may define writeObject() and readObject() methods to customise their own serialized form.

(edit: lubos hasko suggests having a little java program to deserialize the objects in front of Python, but the problem is that for this to work, your "little java program" needs to load the same versions of all the same classes that it might deserialize. Which is tricky if you're receiving log messages from one app, and really tricky if you're multiplexing more than one log stream. Either way, it's not going to be a little program any more. edit2: I could be wrong here, I don't know what gets serialized. If it's just log4j classes you should be fine. On the other hand, it's possible to log arbitrary exceptions, and if they get put in the stream as well my point stands.)

It would be much easier to customise the log4j network adapter and replace the raw serialization with some more easily deserialized form (for example, you could use XStream to turn the object into an XML representation).

兲鉂ぱ嘚淚 2024-07-11 14:21:34

Well, I am not a Python expert, so I can't comment on how to solve your problem, but if you have a program in .NET you can use IKVM.NET to deserialize Java objects easily. I experimented with this by creating a .NET client for Log4J log messages written to a socket appender, and it worked really well.

I am sorry if this answer does not make sense here.
